How to reset ECC error counters on a Dell server running Windows


1. Download and install Dell OpenManage for Windows. You only need the command-line tools at this point; the web-based management tools and SNMP tools can wait.

2. After installing Dell OpenManage, open a Windows command prompt and run this command to see the status of your DIMMs:

omreport chassis memory

The result should be similar to:

Memory Information

Health : Critical

Memory Redundancy

Fail Over State          : Inactive
Redundancy Configuration : Disabled

Attributes of Memory Array(s)

Attributes of Memory Array(s)
Location           : System Board or Motherboard
Use                : System Memory
Installed Capacity : 16384  MB
Maximum Capacity   : 65280  MB
Slots Available    : 8
Slots Used         : 8
Error Correction   : Multibit ECC

Total of Memory Array(s)
Total Installed Capacity                     : 16384  MB
Total Installed Capacity Available to the OS : 16046  MB
Total Maximum Capacity                       : 65280  MB

Details of Memory Array 1
Index          : 0
Status         : Critical
Connector Name : DIMM1
Type           : DDR2 FB-DIMM - Synchronous
Size           : 2048  MB

Index          : 1
Status         : Ok
Connector Name : DIMM2
Type           : DDR2 FB-DIMM - Synchronous
Size           : 2048  MB

<------------- CUT -------------->


Index          : 7
Status         : Ok
Connector Name : DIMM8
Type           : DDR2 FB-DIMM - Synchronous
Size           : 2048  MB
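
If you have many servers to check, reading this report by eye gets tedious. The following is a minimal Python sketch, not part of Dell OpenManage, that scans text in the format shown above and lists the DIMMs whose status is not Ok. The function name `find_bad_dimms` and the embedded sample are illustrative assumptions; in practice you would pass in the captured output of `omreport chassis memory`, and the exact layout may differ between OpenManage versions.

```python
def find_bad_dimms(report: str) -> list[tuple[str, str]]:
    """Return (connector, status) pairs for every DIMM whose status is not Ok.

    Assumes the "Key : Value" layout shown above, where each DIMM entry
    lists Status before Connector Name.
    """
    bad = []
    status = None
    for line in report.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip headings, blank lines and separators
        key, value = key.strip(), value.strip()
        if key == "Status":
            status = value
        elif key == "Connector Name" and status is not None:
            if status.lower() != "ok":
                bad.append((value, status))
            status = None  # reset for the next DIMM entry
    return bad


# Excerpt of the sample report from this article (hypothetical input).
SAMPLE = """\
Index          : 0
Status         : Critical
Connector Name : DIMM1
Type           : DDR2 FB-DIMM - Synchronous
Size           : 2048  MB

Index          : 1
Status         : Ok
Connector Name : DIMM2
Type           : DDR2 FB-DIMM - Synchronous
Size           : 2048  MB
"""

if __name__ == "__main__":
    for connector, status in find_bad_dimms(SAMPLE):
        print(f"{connector}: {status}")
```

For the excerpt above, the sketch reports DIMM1 as Critical and skips DIMM2.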

 

If no errors are listed here (it can take some time for them to appear), you can also check the System Event Log (SEL) by running the following command:

racadm getsel

3. Change your current directory to c:\Program Files\Dell\SysMgt\omsa\bin:

cd /d "c:\Program Files\Dell\SysMgt\omsa\bin"

4. Run the following command to reset the ECC error counters:

dcicfg32 command=clearmemfailures

It should print:

clearing failures using mask: 31

DIMM1 : ok
DIMM2 : ok
DIMM3 : ok
DIMM4 : ok
DIMM5 : ok
DIMM6 : ok
DIMM7 : ok
DIMM8 : ok
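
To confirm this result without reading every line, the check can be sketched in Python. This is only an illustration under the assumption that the output looks like the listing above, with one `DIMMn : ok` line per module. The function name `dimms_not_cleared` is made up for this example, and the `failed` value in the usage line is a hypothetical placeholder, since this article does not show what the tool prints when a counter cannot be cleared.

```python
def dimms_not_cleared(output: str) -> list[str]:
    """Return the names of DIMMs whose result line does not read 'ok'."""
    failed = []
    for line in output.splitlines():
        name, sep, result = line.partition(":")
        name, result = name.strip(), result.strip()
        # Only lines such as "DIMM1 : ok" are considered; the
        # "clearing failures using mask" line and blank lines are skipped.
        if not sep or not name.upper().startswith("DIMM"):
            continue
        if result.lower() != "ok":
            failed.append(name)
    return failed


# The output shown in this article (hypothetical input for the sketch).
CLEAR_OUTPUT = """\
clearing failures using mask: 31

DIMM1 : ok
DIMM2 : ok
DIMM3 : ok
DIMM4 : ok
DIMM5 : ok
DIMM6 : ok
DIMM7 : ok
DIMM8 : ok
"""

if __name__ == "__main__":
    remaining = dimms_not_cleared(CLEAR_OUTPUT)
    print("all counters cleared" if not remaining else f"check: {remaining}")
    # Hypothetical failure value, used only to exercise the other branch.
    print(dimms_not_cleared("DIMM1 : ok\nDIMM2 : failed"))
```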

5. Run the following command to check that everything is OK:

omreport chassis

The result should be similar to:


Main System Chassis

SEVERITY : COMPONENT
Ok       : Fans
Ok       : Intrusion
Ok       : Memory
Ok       : Power Supplies
Ok       : Processors
Ok       : Temperatures
Ok       : Voltages
Ok       : Hardware Log
Ok       : Batteries
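
The same kind of check can be applied to the chassis summary. The sketch below is a hedged example, assuming the `SEVERITY : COMPONENT` layout shown above; it returns every component whose severity is anything other than Ok. The name `components_not_ok` and the sample text are assumptions made for illustration and are not part of OpenManage.

```python
def components_not_ok(report: str) -> dict[str, str]:
    """Map component name to severity for every component that is not Ok."""
    problems = {}
    for line in report.splitlines():
        severity, sep, component = line.partition(":")
        severity, component = severity.strip(), component.strip()
        # Skip headings, blank lines and the "SEVERITY : COMPONENT" header row.
        if not sep or not component or severity.upper() == "SEVERITY":
            continue
        if severity.lower() != "ok":
            problems[component] = severity
    return problems


# Shortened copy of the summary shown in this article (hypothetical input).
CHASSIS_REPORT = """\
Main System Chassis

SEVERITY : COMPONENT
Ok       : Fans
Ok       : Intrusion
Ok       : Memory
Ok       : Power Supplies
"""

if __name__ == "__main__":
    problems = components_not_ok(CHASSIS_REPORT)
    if problems:
        for component, severity in problems.items():
            print(f"{component}: {severity}")
    else:
        print("all components Ok")
```

An empty result means every listed component reports Ok; if Memory is still listed, the counters were not reset or new errors have already been logged.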

 

6. To clear the System Event Log (SEL), run the following command:

racadm clrsel

 

That's it! The errors have been cleared, but this does not guarantee that they will not appear again. These steps only reset the error counters to their initial values; they do not fix whatever caused the errors.
