How to detect ECC errors in memory testing under the UEFI shell


I wrote an EFI binary to test physical DIMMs under the UEFI shell. The process is quite simple: first write a test pattern to a physical address, then read it back and compare it with the original pattern. However, the DIMMs might encounter correctable or uncorrectable errors. Normally, correctable ECC errors are corrected by hardware automatically and the BIOS handles them (it logs the error and clears the error registers), while uncorrectable errors typically cause the BIOS to issue an NMI, after which the system hangs.
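A minimal sketch of such a write/read/compare pass, in plain C rather than actual EDK II code. The function name `pattern_test` and its parameters are hypothetical, and the sketch assumes the caller passes a region it is allowed to overwrite:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Write `pattern` to every 64-bit word in [base, base + words), then read
 * each word back and compare it with the pattern.
 *
 * Returns the number of mismatching words. If at least one mismatch is
 * found and first_bad is not NULL, the index of the first bad word is
 * stored in *first_bad.
 *
 * Assumption: on real hardware the range would also need its caches
 * flushed or disabled so that the read-back really comes from DRAM.
 * That step is platform specific and is omitted here.
 */
size_t pattern_test(volatile uint64_t *base, size_t words,
                    uint64_t pattern, size_t *first_bad)
{
    size_t mismatches = 0;

    /* Pass 1: fill the whole region with the test pattern. */
    for (size_t i = 0; i < words; i++)
        base[i] = pattern;

    /* Pass 2: read every word back and count the differences. */
    for (size_t i = 0; i < words; i++) {
        if (base[i] != pattern) {
            if (mismatches == 0 && first_bad != NULL)
                *first_bad = i;
            mismatches++;
        }
    }
    return mismatches;
}
```

A loop like this only sees errors that survive ECC correction, which is why it cannot report correctable errors on its own.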

The problem is that my test program doesn't know an error has happened: correctable errors are masked by the BIOS firmware, and uncorrectable errors make the system hang.

Is there any way to let the test program know that an ECC error has happened? I would appreciate any advice you may have. Thanks!


There are 1 best solutions below


I believe that to do this your program will need complete control of the hardware. That means it has to take over the boot completely and shut down the EFI environment.

Once you have done that, your program can handle all of the interrupts itself and read the CPU registers that indicate ECC errors.
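Which registers those are depends on the platform. As one possible illustration, assuming an x86 CPU that reports memory errors through its Machine Check Architecture banks (the question does not say this), the sketch below classifies a value already read from an `IA32_MCi_STATUS` register. The name `classify_mc_status` and the `ecc_result` enum are hypothetical. Reading the register itself needs the privileged `rdmsr` instruction and is left out:

```c
#include <stdint.h>

/* Bit positions in IA32_MCi_STATUS (x86 Machine Check Architecture). */
#define MCI_STATUS_VAL (1ULL << 63)  /* the bank holds a valid error record */
#define MCI_STATUS_UC  (1ULL << 61)  /* the error was NOT corrected */

/*
 * Memory controller errors use the MCA error code form
 * 000F 0000 1MMM CCCC in bits 15:0. Masking with 0xEF80 ignores the
 * F, MMM and CCCC fields and checks the fixed bits only.
 */
#define MCA_MEMCTRL_MASK  0xEF80ULL
#define MCA_MEMCTRL_MATCH 0x0080ULL

enum ecc_result { ECC_NONE, ECC_CORRECTED, ECC_UNCORRECTED };

/* Classify one status value; `status` is assumed to come from rdmsr. */
enum ecc_result classify_mc_status(uint64_t status)
{
    if (!(status & MCI_STATUS_VAL))
        return ECC_NONE;            /* nothing logged in this bank */

    if ((status & MCA_MEMCTRL_MASK) != MCA_MEMCTRL_MATCH)
        return ECC_NONE;            /* an error, but not a memory controller one */

    return (status & MCI_STATUS_UC) ? ECC_UNCORRECTED : ECC_CORRECTED;
}
```

A test program built this way would poll the banks after each pass and clear them itself. Firmware running in system management mode may still consume or clear these records first, so treat this as a starting point rather than a guaranteed method.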

Once it has finished, your program would perform a soft reset, which boots the system back into EFI.