Tom Laermans wrote on 2013-10-07 15:48:
Hi Sweden!
These are the points from the Belgian vote:
You have issues with your monitored machines and would best fix them; your SNMP is timing out.
I would start with "we haven't changed anything...", and in this case it's true: this has happened on both XenServer and pure CentOS boxes. All of them lost the values at the same time, and some of these boxes have not been updated or rebooted in a while (> 6 months). The only thing that changed is Observium.
Then I looked again, and not all of them have lost fans etc. I still have some that are showing sensor values. They are all newer boxes.
So I started to search through commits and test a bit. At last I found the timeout value for snmpbulkwalk (I was looking at the -Cr value first), tested with -t20 and it worked. I set a timeout of 20 for the device and did a new discovery: success, fans, voltages, temperatures and storage are back!
So the older SuperMicro boxes didn't handle the short timeout.
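For anyone who wants to repeat the test by hand, here is a rough sketch of what I compared, wrapped in a small Python helper. The flags (-t for timeout in seconds, -r for retries, -Cr for max-repetitions) are the standard net-snmp ones for snmpbulkwalk; the host name, the community string and the OID in the usage part are placeholders I made up, and SNMP v2c is an assumption. This is not how Observium itself builds the command.

```python
import subprocess


def build_bulkwalk_cmd(host, community, oid,
                       timeout=None, retries=None, max_repetitions=None):
    """Build an snmpbulkwalk argument list (SNMP v2c assumed)."""
    cmd = ["snmpbulkwalk", "-v2c", "-c", community]
    if timeout is not None:
        cmd += ["-t", str(timeout)]           # per-request timeout in seconds
    if retries is not None:
        cmd += ["-r", str(retries)]           # number of retries
    if max_repetitions is not None:
        cmd.append(f"-Cr{max_repetitions}")   # bulk max-repetitions
    cmd += [host, oid]
    return cmd


def walk_succeeds(cmd):
    """Run the walk; True if it exits cleanly and returns some output."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0 and bool(result.stdout.strip())


if __name__ == "__main__":
    # Placeholder values: replace with a real device, community and OID.
    host, community, oid = "old-supermicro.example.org", "public", "1.3.6.1.4.1.2021.13.16"
    for timeout in (None, 20):
        cmd = build_bulkwalk_cmd(host, community, oid, timeout=timeout)
        print(" ".join(cmd), "->", "ok" if walk_succeeds(cmd) else "failed")
```

If the walk fails with the default timeout but works with -t 20, the box is most likely just slow to answer, which is what I saw on the older machines.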
The fact that they disappeared 3 months ago and you didn't notice probably tells me they're not very important though ;)
Well, not quite...
/niklas