Overall Server Status is reported as "Severe Fault"

Hello team,
 
We are seeing the following error message on CIMC main page:
 
Overall Server Status is reported as "Severe Fault".
 
Fault Sensors display the following:
 
Threshold Sensors
 
Sensor Name      Status                  Reading   Units
DDR3_P1_C0_ECC   Non-Recoverable Error   253       error
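
For anyone collecting these readings from pasted CIMC output, the sensor row above can be split into its four columns. This is only a minimal sketch: it assumes the layout is always name, status words, integer reading, units, separated by whitespace, and `SensorRow` and `parse_sensor_row` are hypothetical names, not part of any Cisco tool.

```python
from dataclasses import dataclass


@dataclass
class SensorRow:
    name: str
    status: str
    reading: int
    units: str


def parse_sensor_row(line: str) -> SensorRow:
    """Split one sensor row into columns.

    Assumed layout (not confirmed by Cisco documentation):
    <name> <status words...> <reading> <units>
    """
    parts = line.split()
    if len(parts) < 4:
        raise ValueError(f"unexpected sensor row: {line!r}")
    return SensorRow(
        name=parts[0],
        status=" ".join(parts[1:-2]),  # status may span several words
        reading=int(parts[-2]),
        units=parts[-1],
    )


row = parse_sensor_row("DDR3_P1_C0_ECC Non-Recoverable Error 253 error")
print(row.name, "|", row.status, "|", row.reading, "|", row.units)
```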
 
Please refer to the attached screen captures for details.
 
Look forward to hearing from you.
 
Thank you,
John

Hi John,

Thanks for the information. Please work through the two steps below and send us the results.

1. In CIMC, click the Inventory link, then the Memory tab, and let us know the values displayed.

2. Reload the server from the router CLI with 'ucse <slot>/0 reload' and let us know whether the fault remains or disappears.

Thanks,
Brett

John,

I believe this error was fixed by a later version of the BMC/CIMC image.

Daniel, can you confirm that?

Brett, which image version did you give John? Is it the 0607 one?

Thanks,

Jin

From: Cisco Developer Community Forums <cdicuser@developer.cisco.com>
Reply-To: cdicuser@developer.cisco.com
Date: Tuesday, September 4, 2012 7:05 PM
To: cdicuser@developer.cisco.com
Subject: New Message from John Devavaram in Unified Computing System E-Series Servers (UCSE) - Technical Questions: Overall Server Status is reported as "Severe Fault"

Hi John,

Our Engineering team says that you can ignore this error message, as it is reported incorrectly. The fix is in a later release, which we'll post soon.

Thanks,

Brett

Brett, Jin and team,
 
Reloading the module stops the Severe Fault from appearing only temporarily, for perhaps a few hours.

However, the Severe Fault error reappears after some time, even after reloading the module or powering the router off and back on.
 
Currently, the UCS-E160D-M1/K9 module is running the following versions:
 
CIMC Firmware:
Running Version: 1.0(1.20120607153607)
 
BIOS Version: 4.6.4.8
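
The long number in the running CIMC version looks like a build timestamp, which would line up with the "0607" image mentioned earlier in the thread. A minimal sketch of that check follows. It assumes the parenthesised suffix ends in a YYYYMMDDhhmmss stamp, which nobody in this thread confirms, and `parse_cimc_build` is a hypothetical helper name.

```python
import re
from datetime import datetime


def parse_cimc_build(version: str) -> datetime:
    """Extract the 14-digit stamp from a string like '1.0(1.20120607153607)'.

    Assumption: the digits are a build time in YYYYMMDDhhmmss form.
    """
    match = re.search(r"\((?:\d+\.)?(\d{14})\)", version)
    if match is None:
        raise ValueError(f"no build stamp found in {version!r}")
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


build = parse_cimc_build("1.0(1.20120607153607)")
print(build.strftime("%m%d"))  # prints "0607"
```

If the assumption holds, the module is still running the June 7, 2012 build rather than a later image.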
 
 

Glad to hear that it is fixed in the most recent BMC/CIMC image.

 
Thank you for all your help.
 
Regards,
John