This e-mail communication (including any attachments) is intended only for use by the individual or entity named above and is considered confidential. If you are not the intended recipient, or the employee or agent responsible to deliver it to the intended recipient, you should immediately stop reading this message and delete it from your system. Any unauthorized reading, distribution, copying or other use of this communication (or its attachments) is strictly prohibited.
Tom,
Thanks for the heads up. I’ve been searching for a custom OID/index that may contain error counters to pull directly from the IPMI. I like the idea of the edac script, but these are ESX hosts. I’ll keep poking around.
From: observium <observium-bounces@observium.org> on behalf of Tom Laermans via observium <observium@observium.org>
Sent: Tuesday, April 2, 2019 5:24 PM
To: observium@observium.org
Cc: Tom Laermans
Subject: Re: [Observium] IPMI Errors?
Hi Adam,
We don't poll/scrape the BMC event log and I don't think there are any real plans to. However, when I implemented IPMI polling I only added voltages/fan speeds/etc., but not the binary values, which may or may not contain error flags. Unfortunately that's still a big "TBD" ...
However, the unix-agent has an 'edac' script which, if edac-util is installed on the host, will give you graph data on the errors detected (again, provided your chipset is supported).
Yay, all zeroes for me.
(Note that you can't alert on unix-agent application values yet though, but at least you'd have some visibility through Observium)
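For a rough idea of the counters involved, here is a minimal Python sketch, assuming a Linux host whose kernel exposes the EDAC sysfs tree under /sys/devices/system/edac/mc. It reads the per-controller ce_count and ue_count files directly instead of going through edac-util, so it is not how the unix-agent script itself works; the function name read_edac_counts is made up for illustration.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce": int, "ue": int}} from an EDAC sysfs tree.

    Assumption: each memory controller directory (mc0, mc1, ...) contains
    ce_count (correctable) and ue_count (uncorrectable) files.
    """
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            # Skip controllers whose counters are missing or unreadable.
            continue
        counts[mc.name] = {"ce": ce, "ue": ue}
    return counts


if __name__ == "__main__":
    for name, c in read_edac_counts().items():
        print(f"{name}: correctable={c['ce']} uncorrectable={c['ue']}")
```

On a host without EDAC support the function simply returns an empty dict.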
Tom
On 4/2/2019 8:55 PM, Adam Ward via observium wrote:
Hey Observium Team,
Can IPMI on standard SuperMicro mainboards be scraped for ECC errors and other SEL-type errors, or am I going to have to ship syslog data somewhere and act on it? We just had a DIMM throwing correctable errors, but that doesn’t really show up anywhere in IPMI/SNMP as far as I can see, even after checking the SM MIBs.
How do others accomplish similar monitoring?
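To make the SEL idea concrete, here is a minimal Python sketch of filtering BMC event log text for memory-related entries. It assumes the pipe-separated layout that `ipmitool sel elist` typically prints (record ID, date, time, sensor, description, state); the sample lines, the keyword pattern, and the function name find_memory_events are all made up for illustration, and real boards may word these events differently.

```python
import re

# Assumption: memory/ECC events mention one of these words in the sensor
# or description field. Adjust the pattern to match your board's wording.
MEMORY_PATTERN = re.compile(r"\b(ECC|Memory)\b", re.IGNORECASE)


def find_memory_events(sel_text):
    """Return memory-related entries from pipe-separated SEL text."""
    events = []
    for line in sel_text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 6:
            continue  # not a complete SEL record line
        record_id, date, time, sensor, description, state = fields[:6]
        if MEMORY_PATTERN.search(sensor) or MEMORY_PATTERN.search(description):
            events.append({
                "id": record_id,
                "timestamp": f"{date} {time}",
                "sensor": sensor,
                "description": description,
                "state": state,
            })
    return events


# Invented sample records, only to show the expected shape of the input.
SAMPLE = """\
   1 | 04/02/2019 | 20:41:10 | Memory #0x53 | Correctable ECC | Asserted
   2 | 04/02/2019 | 20:45:02 | Fan #0x41 | Lower Critical going low | Asserted
"""

if __name__ == "__main__":
    for event in find_memory_events(SAMPLE):
        print(event["timestamp"], event["sensor"], event["description"])
```

The parser only takes text, so the log could come from a saved file or from running ipmitool against the BMC from another machine, which may suit hosts where nothing can be installed locally.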
Adam Ward
Systems Engineer
Shamrock Trading Corporation
Office Phone/Fax: (913) 310-2247
Email: award@rtsfinancial.com
Website: www.shamrocktradingcorp.com
_______________________________________________ observium mailing list observium@observium.org http://postman.memetic.org/cgi-bin/mailman/listinfo/observium