Keystone Error Detection and Correction EDC ECC
Keystone Error Detection and Correction - EDC and Error Correcting Codes - ECC
Many applications have very stringent requirements to detect faults in the memory system of a processor to avoid failures of the end-system that could lead to dangerous situations for the end-user, or high requirements regarding the availability of the end-system. There are many mechanisms that could lead to faults in the memory of a processor. Some of these could lead to permanent faults and others to transient faults.
Detection of transient faults while the system is running is very important for critical applications. While permanent faults can be equally important, the likelihood of them occurring is usually significantly lower than that of transient faults. Permanent faults can, in many cases, be detected by running appropriate test algorithms at application startup or shutdown. Transient faults are predominantly introduced by soft errors. The major contributors are alpha radiation from the materials used in the chip package and neutrons from cosmic rays. These can lead to bit flips in memories or changes to the state of a flip-flop.
The Keystone Architecture provides multiple mechanisms to detect such faults and, in specific instances, to correct some of them.
Keystone Error Detection and Correction - EDC
C66x L1P - EDC Implementation
The L1P error detection logic can detect single-bit errors for accesses that hit within L1P RAM or L1P cache. While the error detection logic is enabled, all 64-bit DMA writes update and store parity and valid bits. Writes narrower than 64 bits, or non-aligned writes, update the parity RAM to indicate "invalid parity". L1P checks parity on every program fetch, because all program fetches are 256-bit aligned. For DMA/IDMA read accesses to L1P memory, the parity check occurs only when the data size is 64 bits wide or a multiple of 64 bits wide.
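The interaction between parity bits and valid bits described above can be sketched with a small software model. This is only an illustration under stated assumptions: one even-parity bit and one valid bit per 64-bit word, and the names `ParityProtectedRam`, `write64`, `write_partial`, and `read64` are hypothetical. The actual parity polarity, granularity, and register interface are described in the C66x CorePac UG, not here.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def parity64(word: int) -> int:
    """Even-parity bit over a 64-bit word: 1 if the number of set bits is odd."""
    return bin(word & MASK64).count("1") & 1


class ParityProtectedRam:
    """Toy model (assumption): one parity bit and one valid bit per 64-bit word."""

    def __init__(self, n_words: int) -> None:
        self.data = [0] * n_words
        self.parity = [0] * n_words
        self.valid = [False] * n_words

    def write64(self, index: int, word: int) -> None:
        # Full-width aligned write: parity is recomputed and marked valid.
        self.data[index] = word & MASK64
        self.parity[index] = parity64(word)
        self.valid[index] = True

    def write_partial(self, index: int, byte_offset: int, value: int, n_bytes: int) -> None:
        # Narrow write: the data changes, but parity is marked invalid
        # instead of being recomputed.
        shift = 8 * byte_offset
        mask = ((1 << (8 * n_bytes)) - 1) << shift
        self.data[index] = (self.data[index] & ~mask) | ((value << shift) & mask)
        self.valid[index] = False

    def read64(self, index: int):
        """Return (word, error_detected). Words with invalid parity are not checked."""
        word = self.data[index]
        error = self.valid[index] and parity64(word) != self.parity[index]
        return word, error
```

In this model a bit flip in a word written with `write64` is reported on the next `read64`, while a flip in a word last touched by `write_partial` goes unnoticed, which is why full-width aligned writes matter when initializing protected memory.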
For full details on the EDC implementation in L1P on Keystone devices, please see the C66x CorePac UG.
No error detection or correction is implemented in L1D SRAM/cache. L1D is normally configured entirely as cache, so its contents are usually temporary, and the rare bit flip that may occur typically would not result in a system crash.
C66x L2 - EDC Implementation
The L2 memory controller provides EDC with a Hamming code capable of detecting double-bit errors and correcting single-bit errors within each 128-bit word. EDC is supported for both L2 RAM and L2 cache accesses. All 128-bit writes to L2 memory update the stored parity and valid bits in L2 RAM regardless of whether the EDC logic is enabled or disabled. The L2 memory controller always performs a full Hamming code check on 128-bit reads of L2 regardless of whether the fetch is from L1P, L1D, IDMA, or DMA. Writes narrower than 128 bits update the parity RAM in L2 to indicate invalid parity and zero the parity values, regardless of whether EDC is enabled or disabled. All 128-bit reads are parity-checked when the EDC logic is enabled. The L2 memory controller also applies EDC to L2 victims. Error detection, without correction, is performed on all L2 data fetches by the L1D cache.
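As a rough guide to the storage overhead such a code implies, the textbook extended Hamming (SECDED) construction needs the smallest r with 2^r >= k + r + 1 check bits for k data bits, plus one overall parity bit. The sketch below computes that count. The function name `secded_check_bits` is hypothetical, and the result is the theoretical minimum, not a confirmed figure for the parity width of the L2 hardware.

```python
def secded_check_bits(data_bits: int) -> int:
    """Minimum check bits for a textbook SECDED (extended Hamming) code."""
    r = 0
    # Smallest r such that r check bits can address every codeword position.
    while (1 << r) < data_bits + r + 1:
        r += 1
    # One extra overall-parity bit upgrades single-error correction
    # to single-error correction plus double-error detection.
    return r + 1


for width in (64, 128, 256):
    print(width, secded_check_bits(width))
```

For 64 data bits this gives 8 check bits, which matches the familiar (72,64) code; 128-bit and 256-bit words need 9 and 10 check bits under the same construction.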
For full details on the EDC implementation in L2 on Keystone devices, please see the C66x CorePac UG.
Keystone MSMC RAM - EDC Implementation
The MSMC has error detection and correction hardware to protect the contents of the MSMC memory storage against corruption due to transient (soft) errors. The level of protection provided and the scheme used are the same as those of the C66x CorePacs (that is, one-bit error correction and two-bit error detection, with the parity codes calculated over a 256-bit datum).
The MSMC EDC HW also provides a scrubbing engine which periodically cycles through each location of each memory bank in the MSMC, reading and correcting the data, recalculating the parity bits for the data, and storing the data and parity information. Each such “scrubbing cycle” consists of a series of read-modify-write “scrub bursts” to the memory banks.
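The structure of a scrubbing cycle can be sketched as a pair of loops: an outer pass over every bank and location, split into fixed-length read-correct-write bursts. The model below is a simplification under stated assumptions. To stay short it uses a triple-copy majority vote as a stand-in for the real SECDED code, and the names `scrub_burst` and `scrub_cycle`, the burst length, and the bank layout are all hypothetical.

```python
from typing import List, Tuple

# Toy stand-in for "data + parity": three copies of the same value.
Word = Tuple[int, int, int]


def encode(value: int) -> Word:
    return (value, value, value)


def correct(word: Word) -> int:
    """Bitwise majority vote; repairs any single corrupted copy."""
    a, b, c = word
    return (a & b) | (a & c) | (b & c)


def scrub_burst(bank: List[Word], start: int, burst_len: int) -> int:
    """One read-correct-write burst. Returns the number of locations repaired."""
    repaired = 0
    for i in range(start, min(start + burst_len, len(bank))):
        fresh = encode(correct(bank[i]))  # read, correct, recompute redundancy
        if fresh != bank[i]:
            repaired += 1
        bank[i] = fresh                   # write data and redundancy back
    return repaired


def scrub_cycle(banks: List[List[Word]], burst_len: int = 4) -> int:
    """Visit every location of every bank once, as a series of bursts."""
    total = 0
    for bank in banks:
        for start in range(0, len(bank), burst_len):
            total += scrub_burst(bank, start, burst_len)
    return total
```

Running `scrub_cycle` periodically keeps single-bit upsets from accumulating into uncorrectable multi-bit errors in rarely accessed locations; splitting the pass into bursts limits how long the scrubber occupies a bank at a time.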
Keystone DDR3 Error Correcting Code - ECC
For data integrity, the DDR3 memory controller supports ECC on the data written to or read from the ECC-protected address ranges in memory. An eight-bit ECC is calculated over each 64-bit data quantum and provides SECDED (single error correction, double error detection) for that quantum. The system must ensure that no burst access starting in an ECC-protected region crosses over into an unprotected region, and vice versa.
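The burst constraint can be checked in software before a transfer is programmed. The following sketch assumes the protected regions are known as a list of disjoint, non-adjacent half-open byte ranges; the function name `burst_is_safe` and the addresses in the usage example are hypothetical and do not correspond to any particular device memory map or EMIF register layout.

```python
from typing import Iterable, Tuple

# (start, end) byte addresses, end exclusive. Ranges are assumed to be
# disjoint and non-adjacent.
Range = Tuple[int, int]


def burst_is_safe(start: int, length: int, protected: Iterable[Range]) -> bool:
    """True if the burst lies entirely inside one protected range
    or entirely outside all of them."""
    end = start + length
    touches_protected = False
    for lo, hi in protected:
        if start >= lo and end <= hi:
            return True               # fully inside a protected range
        if start < hi and end > lo:
            touches_protected = True  # partial overlap: crosses a boundary
    return not touches_protected


# Usage with a made-up protected window:
ecc_ranges = [(0x8000_0000, 0x9000_0000)]
print(burst_is_safe(0x8000_0000, 64, ecc_ranges))  # inside the window
print(burst_is_safe(0x8FFF_FFE0, 64, ecc_ranges))  # straddles the upper boundary
```

A simple way to satisfy the constraint by construction is to align the protected range boundaries to the largest burst size any master in the system can issue.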
The ECC algorithm used in the EMIF is the industry-standard (72,64) Hamming SECDED code.
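To make the (72,64) arithmetic concrete, here is a minimal sketch of a textbook extended Hamming code: 7 Hamming check bits at the power-of-two positions of a 71-bit word, plus one overall parity bit at position 0. The bit layout, the function names `encode` and `decode`, and the status strings are assumptions for illustration; the actual check-bit assignment used by the EMIF is not given here and may differ.

```python
PARITY_POS = [1, 2, 4, 8, 16, 32, 64]
# The 64 non-power-of-two positions in 1..71 carry the data bits.
DATA_POS = [p for p in range(1, 72) if p not in PARITY_POS]


def _syndrome(code: int) -> int:
    """XOR of the positions (1..71) of all set bits; 0 for a valid codeword."""
    s = 0
    for pos in range(1, 72):
        if (code >> pos) & 1:
            s ^= pos
    return s


def encode(data: int) -> int:
    """Encode 64 data bits into a 72-bit codeword."""
    code = 0
    for i, pos in enumerate(DATA_POS):
        if (data >> i) & 1:
            code |= 1 << pos
    s = _syndrome(code)
    for p in PARITY_POS:          # set check bits so the syndrome becomes 0
        if s & p:
            code |= 1 << p
    if bin(code).count("1") & 1:  # bit 0: overall even parity
        code |= 1
    return code


def decode(code: int):
    """Return (data, status); status is 'ok', 'corrected', or 'uncorrectable'."""
    s = _syndrome(code)
    odd = bin(code).count("1") & 1
    status = "ok"
    if odd:
        if s > 71:
            return None, "uncorrectable"  # syndrome points outside the codeword
        code ^= 1 << s                    # s == 0 means bit 0 itself flipped
        status = "corrected"
    elif s != 0:
        return None, "uncorrectable"      # even parity, nonzero syndrome: 2 flips
    data = 0
    for i, pos in enumerate(DATA_POS):
        if (code >> pos) & 1:
            data |= 1 << i
    return data, status
```

The overall parity bit is what separates the two cases: a single flip makes the total parity odd and the syndrome names the flipped position, while a double flip leaves the parity even with a nonzero syndrome, so it is reported rather than miscorrected.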
ARM-A15 Error Detection and Correction (ECC) Keystone Support
|RAM||Protection||Behavior on error|
|L1 Data RAM||ECC (per 32 bits)||1-bit evict corrected line to L2, treat as L1D miss and refetch from L2; 2-bit detect|
|L2 Data RAM||ECC (per 64 bits)||1-bit inline correct to reader and evict (corrected); 2-bit detect|
|L1 Instruction Data RAM||Parity (per 16 bits)||1-bit detect, invalidate, and treat as cache miss (fetch from L2/DDR)|
|L1 Instruction Tag RAM||Parity||1-bit detect and treat as cache miss (fetch from L2/DDR)|
|L1 Instruction BTB RAM||Parity||1-bit detect and treat as branch predictor miss|
|L1 Instruction GHB RAM||None||Error looks like prefetched the wrong address, effectively a predictor miss|
|L1 Instruction Indirect Predictor RAM||None||Error looks like prefetched the wrong address, effectively a predictor miss|
|L1 Data Tag RAM||ECC||1-bit evict corrected line to L2, treat as L1D miss and refetch from L2; 2-bit detect|
|L2 TLB RAM||Parity||TLB entry invalidate, trigger page walk|
|L2 Tag RAM||ECC||1-bit correct (read-correct-write), replay lookup; 2-bit detect|
|L2 Snoop Tag RAM||ECC||1-bit correct (read-correct-write), replay lookup; 2-bit detect|
|L2 Dirty RAM||ECC||1-bit evict cache line to DDR, replay load; 2-bit detect|
|L2 Prefetch RAM||Parity||1-bit invalidate line|
For full details on the ARM A15 implementation, please see the ARM Cortex-A15 TRM documentation.