Tom,

Thanks for the heads-up. I’ve been searching for a custom OID/index that may contain error counters to pull directly from the IPMI. I like the idea of the edac script, but these are ESX hosts. I’ll keep poking around.
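
For hosts where nothing can run in-band (such as ESX), one option is to read the BMC's System Event Log over the network with ipmitool and filter for memory events. The following is only a rough sketch under stated assumptions, not something Observium does: it assumes ipmitool is installed on the monitoring box, that the BMC accepts lanplus sessions, and that `sel elist` prints pipe-delimited lines (id | date | time | sensor | event | state). The function names and the host/user/password parameters are hypothetical.

```python
import subprocess


def parse_sel_line(line):
    """Split one pipe-delimited `ipmitool sel elist` line into a dict, or None."""
    fields = [f.strip() for f in line.split("|")]
    if len(fields) < 5:
        return None  # blank line or an unexpected format
    return {
        "id": fields[0],
        "date": fields[1],
        "time": fields[2],
        "sensor": fields[3],
        "event": fields[4],
        "state": fields[5] if len(fields) > 5 else "",
    }


def memory_events(text):
    """Return SEL records whose sensor is a memory sensor or whose event mentions ECC."""
    events = []
    for line in text.splitlines():
        rec = parse_sel_line(line)
        if rec is None:
            continue
        if rec["sensor"].lower().startswith("memory") or "ecc" in rec["event"].lower():
            events.append(rec)
    return events


def fetch_sel(host, user, password):
    """Run ipmitool out-of-band against a BMC and return the raw SEL text."""
    cmd = ["ipmitool", "-I", "lanplus", "-H", host,
           "-U", user, "-P", password, "sel", "elist"]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

Calling `memory_events(fetch_sel(host, user, password))` from a cron job would give a list to alert on. Whether a given board actually logs correctable ECC events to the SEL depends on the BIOS/BMC settings, so that part would need checking per platform.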

 

From: observium <observium-bounces@observium.org> on behalf of Tom Laermans via observium <observium@observium.org>
Sent: Tuesday, April 2, 2019 5:24 PM
To: observium@observium.org
Cc: Tom Laermans
Subject: Re: [Observium] IPMI Errors?
 
Hi Adam,

We don't poll/scrape the BMC event log, and I don't think there are any real plans to. However, when I implemented IPMI polling I only added voltages/fan speeds/etc., not the binary values which may or may not contain error flags. Unfortunately that's still a big "TBD" ...

However, the unix-agent has an 'edac' script which, if edac-util is installed on the host, will give you graph data on the errors detected (again, provided your chipset is supported).

Yay, all zeroes for me.

(Note that you can't alert on unix-agent application values yet, but at least you'd have some visibility through Observium.)
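
For a quick look at the same kind of counters outside of Observium, the Linux kernel exposes EDAC totals in sysfs. The sketch below is not the unix-agent edac script; it is a minimal illustration that assumes a Linux host with an EDAC driver loaded (so it does not apply to ESX) and reads the `ce_count` and `ue_count` files under `/sys/devices/system/edac/mc`. The function name `read_edac_counts` is made up for this example.

```python
from pathlib import Path

# Standard Linux EDAC sysfs location; taken as a parameter so it can be overridden.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_edac_counts(root=EDAC_ROOT):
    """Return {controller: {"ce": int|None, "ue": int|None}} for each mc* directory."""
    counts = {}
    root = Path(root)
    if not root.is_dir():
        return counts  # EDAC driver not loaded or chipset unsupported
    for mc in sorted(root.glob("mc[0-9]*")):
        entry = {}
        for key, fname in (("ce", "ce_count"), ("ue", "ue_count")):
            try:
                entry[key] = int((mc / fname).read_text().strip())
            except (OSError, ValueError):
                entry[key] = None  # counter missing or unreadable
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    for name, c in read_edac_counts().items():
        print(f"{name}: corrected={c['ce']} uncorrected={c['ue']}")
```

A non-zero corrected count that keeps climbing is the case worth watching, since it usually points at a DIMM that is starting to fail.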

Tom

On 4/2/2019 8:55 PM, Adam Ward via observium wrote:

Hey Observium Team,

 

I am not sure whether IPMI on standard SuperMicro mainboards can be scraped for ECC errors and other SEL-type errors. Am I going to have to ship syslog data somewhere and act on it? We just had a DIMM throwing correctable errors, but that doesn’t really show up anywhere in IPMI/SNMP as far as I can see, even after checking the SM MIBs.

 

How do others accomplish similar monitoring?

 

 

Adam Ward

Systems Engineer

Shamrock Trading Corporation

Office Phone/Fax: (913) 310-2247

Email: award@rtsfinancial.com

Website: www.shamrocktradingcorp.com

This e-mail communication (including any attachments) is intended only for use by the individual or entity named above and is considered confidential. If you are not the intended recipient, or the employee or agent responsible to deliver it to the intended recipient, you should immediately stop reading this message and delete it from your system. Any unauthorized reading, distribution, copying or other use of this communication (or its attachments) is strictly prohibited.
_______________________________________________
observium mailing list
observium@observium.org
http://postman.memetic.org/cgi-bin/mailman/listinfo/observium

