Hiya,
We've noticed that Observium is reporting that devices have rebooted even though they haven't. It's happening at the 496-day mark, so we're assuming that this is related to the 32-bit sysUpTime counter limitation.
I noticed though that there is a section of code which says:
// Use snmpEngineTime (68 year rollover) to cross-reference for false
// positives in device rebooting due to sysUpTime rollover issues
so I presume this ought not be happening. I can appreciate that this is a difficult one to test. Or rather, it's easy to test but has a *really* long debug cycle :-)
We're currently running 0.13.2.3559.
Here's the output from a device which Observium recently reported as having rebooted:
SNMP-FRAMEWORK-MIB::snmpEngineTime.0 = INTEGER: 43016530
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (6689166) 18:34:51.66
If you want any more information just ask. Given the number of devices we have with uptimes around 496 days we can probably do a test every few days.
Thanks,
Mike
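The two values quoted in the message above can be cross-checked with a little arithmetic. The sketch below assumes sysUpTime is a 32-bit counter of hundredths of a second and that snmpEngineTime counts whole seconds. The variable names are illustrative, and this is not Observium's own code.

```python
# Values copied from the SNMP output quoted in the message above.
SNMP_ENGINE_TIME_S = 43016530   # snmpEngineTime.0, in seconds
SYS_UPTIME_TICKS = 6689166      # sysUpTimeInstance, in hundredths of a second

TICKS_PER_SECOND = 100
WRAP_TICKS = 2 ** 32            # assumption: sysUpTime is a 32-bit tick counter

wrap_seconds = WRAP_TICKS / TICKS_PER_SECOND
wrap_days = wrap_seconds / 86400

sys_uptime_seconds = SYS_UPTIME_TICKS / TICKS_PER_SECOND

# If sysUpTime wrapped exactly once, adding one wrap period back should
# land close to the value reported by snmpEngineTime.
reconstructed = sys_uptime_seconds + wrap_seconds
difference = abs(reconstructed - SNMP_ENGINE_TIME_S)

print(f"sysUpTime wraps every {wrap_days:.2f} days")
print(f"sysUpTime on its own: {sys_uptime_seconds / 3600:.2f} hours")
print(f"snmpEngineTime: {SNMP_ENGINE_TIME_S / 86400:.2f} days")
print(f"sysUpTime plus one wrap differs from snmpEngineTime by {difference:.0f} s")
```

On this device the two counters agree to within a minute once a single wrap is added back, which is consistent with a sysUpTime rollover rather than a real reboot.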
Hi Mike,
please update to r3621 and check that UpTime is now displayed correctly. :)
On Wed, Feb 20, 2013 at 3:19 PM, Mike Richardson mike.richardson@manchester.ac.uk wrote:
--
Mike Richardson
Networks (network@manchester.ac.uk)
IT Services, University of Manchester
*Plain text only please - attachments stripped on arrival*
_______________________________________________
observium mailing list
observium@observium.org
http://postman.memetic.org/cgi-bin/mailman/listinfo/observium
On Thu, Feb 21, 2013 at 11:49:23AM +0400, Mike Stupalov wrote:
Hi Mike,
please update to r3621 and check that UpTime is now displayed correctly. :)
Thanks. I've upgraded and the next device to pass the mark will be tomorrow morning. I'll let you know what happens.
Mike
On Thu, Feb 21, 2013 at 11:05:47AM +0000, Mike Richardson wrote:
Thanks. I've upgraded and the next device to pass the mark will be tomorrow morning. I'll let you know what happens.
Ok, looking good. No reports for this morning's device. There will be another this evening so hopefully that will be confirmation. Some devices did reboot today and they did show up so that side still works.
Thanks,
Mike
On Thu, Feb 21, 2013 at 11:49:23AM +0400, Mike Stupalov wrote:
Hi Mike,
please update to r3621 and check that UpTime is now displayed correctly. :)
Hiya,
Running 0.13.3.3677 and the problem with the SysUptime seems to have come back.
Thanks,
Mike
Hi Mike,
Please show the output of php ./poller.php -d -m unix-agent,system -h <device>
On Fri, Mar 8, 2013 at 8:08 PM, Mike Richardson <mike.richardson@manchester.ac.uk> wrote:
Hiya,
Running 0.13.3.3677 and the problem with the SysUptime seems to have come back.
Thanks,
Mike
-- Mike Stupalov
On Fri, Mar 08, 2013 at 08:51:23PM +0400, Mike Stupalov wrote:
Hi Mike,
Please show the output of php ./poller.php -d -m unix-agent,system -h <device>
Looks like the software on the 3500 doesn't support hrSystemUptime. Yet another reason to replace the 3508s with something less prehistoric.
Mike
Observium Poller v0.13.3.3677

DEBUG! Starting polling run:

SQL[SELECT `device_id` FROM `devices` WHERE `disabled` = 0 AND `hostname` LIKE 'tls-williamson.its.manchester.ac.uk' ORDER BY `device_id` ASC]
SQL[SELECT * FROM `devices` WHERE `device_id` = '691']
SQL[SELECT * FROM devices_attribs WHERE `device_id` = '691']
tls-williamson.its.manchester.ac.uk 691 ios (cisco)
DEBUG: SNMP Auth options = -v2c -c XXXXX
/usr/bin/snmpget -v2c -c XXXXX -Oqv -m SNMPv2-MIB -M /opt/observium/mibs udp:tls-williamson.its.manchester.ac.uk:161 sysObjectID.0
SNMPv2-SMI::enterprises.9.1.246

RRD[update /opt/observium/rrd/tls-williamson.its.manchester.ac.uk/status.rrd N:1]
DEBUG: SNMP Auth options = -v2c -c XXXXX
/usr/bin/snmpget -v2c -c XXXXX -OQUs -m SNMPv2-MIB -M /opt/observium/mibs udp:tls-williamson.its.manchester.ac.uk:161 sysUpTime.0 sysLocation.0 sysContact.0 sysName.0
sysUpTime.0 = 0:6:30:59.91
sysLocation.0 = Williamson - 2.64
sysContact.0 = IT Services (0161 275 6001) [U][C]
sysName.0 = tls-williamson.its.manchester.ac.uk

DEBUG: SNMP Auth options = -v2c -c XXXXX
/usr/bin/snmpget -v2c -c XXXXX -Oqv -m SNMPv2-MIB -M /opt/observium/mibs udp:tls-williamson.its.manchester.ac.uk:161 sysDescr.0
Cisco Internetwork Operating System Software
IOS (tm) C3500XL Software (C3500XL-C3H2S-M), Version 12.0(5)WC17, RELEASE SOFTWARE (fc1)
Copyright (c) 1986-2007 by cisco Systems, Inc.
Compiled Tue 13-Feb-07 15:04 by antonino

DEBUG: SNMP Auth options = -v2c -c XXXXX
/usr/bin/snmpget -v2c -c XXXXX -Oqvn -m SNMPv2-MIB -M /opt/observium/mibs udp:tls-williamson.its.manchester.ac.uk:161 sysObjectID.0
.1.3.6.1.4.1.9.1.246

DEBUG: SNMP Auth options = -v2c -c XXXXX
/usr/bin/snmpget -v2c -c XXXXX -Oqv -m HOST-RESOURCES-MIB -M /opt/observium/mibs udp:tls-williamson.its.manchester.ac.uk:161 hrSystemUptime.0
No Such Object available on this agent at this OID

DEBUG: SNMP Auth options = -v2c -c XXXXX
/usr/bin/snmpget -v2c -c XXXXX -OUqv -m SNMP-FRAMEWORK-MIB -M /opt/observium/mibs udp:tls-williamson.its.manchester.ac.uk:161 snmpEngineTime.0
23460

Using SNMP Agent snmpEngineTime (23460 seconds)
OK u:0.00 s:0.01 r:0.06
RRD[update /opt/observium/rrd/tls-williamson.its.manchester.ac.uk/uptime.rrd N:23460]
Uptime: 6h 31m
Polled in 0.306 seconds
Array
(
    [uptime] => 23460
    [last_polled] => Array
        (
            [0] => NOW()
        )

    [last_polled_timetaken] => 0.306
)
Updating tls-williamson.its.manchester.ac.uk - 1

SQL[UPDATE `devices` set `uptime` ='23460',`last_polled` =NOW(),`last_polled_timetaken` ='0.306' WHERE `device_id` = '691']
UPDATED!

SQL[INSERT INTO `perf_times` (`type`,`doing`,`start`,`duration`,`devices`) VALUES ('poll','tls-williamson.its.manchester.ac.uk','1362762404.3783','0.320','1')]
./poller.php tls-williamson.its.manchester.ac.uk March 8, 2013, 17:06 - 1 devices polled in 0.320 secs

MySQL: Cell[0/0s] Row[1/0s] Rows[1/0s] Column[0/0s] Update[1/0s] Insert[1/0s] Delete[0/0s]
OK u:0.00 s:0.01 r:0.30
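The debug output above reports both counters, so it is possible to check whether they reset together. This is a minimal sketch, assuming the sysUpTime string has the form days:hours:minutes:seconds.hundredths as printed in the output. The helper name `timeticks_to_seconds` and the 5-second tolerance are made up for illustration.

```python
def timeticks_to_seconds(text: str) -> float:
    """Convert a 'days:hours:minutes:seconds.hundredths' string to seconds."""
    days, hours, minutes, seconds = text.strip().split(":")
    return int(days) * 86400 + int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Values copied from the poller debug output above.
sys_uptime_s = timeticks_to_seconds("0:6:30:59.91")
snmp_engine_time_s = 23460

# Hypothetical tolerance: the two values were fetched by separate snmpget calls.
gap = abs(sys_uptime_s - snmp_engine_time_s)
both_reset_together = gap < 5

print(f"sysUpTime: {sys_uptime_s:.2f} s, snmpEngineTime: {snmp_engine_time_s} s")
print(f"counters agree to within {gap:.2f} s: {both_reset_together}")
```

When both counters show the same small value, snmpEngineTime cannot be used to tell a rollover from a reboot on this device.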
hrSystemUptime isn't necessary; snmpEngineTime is used.
What does the device itself show? Please run sh ver | i uptime and sh ver | i resta
On Fri, Mar 8, 2013 at 9:14 PM, Mike Richardson <mike.richardson@manchester.ac.uk> wrote:
Looks like the software on the 3500 doesn't support hrSystemUptime. Yet another reason to replace the 3508s with something less prehistoric.
Mike
On Fri, Mar 08, 2013 at 09:22:53PM +0400, Mike Stupalov wrote:
hrSystemUptime isn't necessary; snmpEngineTime is used.
What does the device itself show? Please run sh ver | i uptime and sh ver | i resta
tls-williamson uptime is 1 year, 18 weeks, 6 days, 10 hours, 35 minutes
System restarted at 08:07:34 BST Fri Oct 28 2011
Mike
This is 497 days, the wraparound time of the 32-bit counter. If the device doesn't have the 64-bit counter, it will indeed look like it was just rebooted.
Tom
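The 497-day figure can be checked against the dates in the thread. The sketch below is only arithmetic on values quoted above: the restart date from the device and the date of the poller run. It assumes the device counts a year as 365 days when printing its uptime and that sysUpTime is a 32-bit counter of hundredths of a second.

```python
from datetime import date

restarted = date(2011, 10, 28)  # "System restarted at 08:07:34 BST Fri Oct 28 2011"
polled = date(2013, 3, 8)       # date of the poller run quoted in the thread

days_between = (polled - restarted).days

# "1 year, 18 weeks, 6 days" as reported by the device,
# assuming a year is counted as 365 days.
reported_days = 365 + 18 * 7 + 6

# A 32-bit counter of hundredths of a second wraps after 2**32 ticks.
wrap_days = 2 ** 32 / 100 / 86400

print(f"days between restart and poll: {days_between}")
print(f"days reported by the device: {reported_days}")
print(f"sysUpTime wrap period: {wrap_days:.2f} days")
```

All three figures come out at about 497 days, so the uptime reset seen by the poller lines up with the counter wrap.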
On 8/03/2013 19:46, Mike Richardson wrote:
tls-williamson uptime is 1 year, 18 weeks, 6 days, 10 hours, 35 minutes
System restarted at 08:07:34 BST Fri Oct 28 2011
Mike
No, snmpEngineTime.0 (68 year rollover) is used here, but it shows incorrect information too. This is probably a bug in the old IOS version.
Do other devices (excluding the 3500) show the uptime correctly?
On Fri, Mar 8, 2013 at 11:03 PM, Tom Laermans <tom.laermans@powersource.cx> wrote:
This is 497 days, the wraparound time of the 32-bit counter. If the device doesn't have the 64-bit counter, it will indeed look like it was just rebooted.
Tom
CSCeh49492 - snmpEnginetime gets reset when sysUptime counter rolls over.
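On devices affected by this bug, a poller cannot rely on snmpEngineTime to spot the rollover. One possible fallback is to compare the new uptime with the value expected from the previous poll. The following is a hypothetical sketch of that idea, not what Observium does: the function name `looks_like_rollover`, its parameters, and the 300-second tolerance are all assumptions.

```python
WRAP_SECONDS = 2 ** 32 / 100  # 32-bit counter of hundredths of a second


def looks_like_rollover(prev_uptime_s: float, new_uptime_s: float,
                        elapsed_s: float, tolerance_s: float = 300.0) -> bool:
    """Return True when a drop in uptime matches a counter wrap, not a reboot.

    prev_uptime_s: uptime seen at the previous poll, in seconds
    new_uptime_s:  uptime seen at this poll, in seconds
    elapsed_s:     wall-clock seconds between the two polls
    """
    if new_uptime_s >= prev_uptime_s:
        return False  # uptime did not go backwards, so there is nothing to explain
    # Had the counter simply kept running and wrapped, it would now read this:
    expected_after_wrap = prev_uptime_s + elapsed_s - WRAP_SECONDS
    return abs(new_uptime_s - expected_after_wrap) <= tolerance_s


# Previous poll 100 s before the wrap, next poll 300 s later: a rollover.
print(looks_like_rollover(WRAP_SECONDS - 100, 200, 300))

# Uptime drops from about 11.5 days to 2 minutes: a real reboot.
print(looks_like_rollover(1_000_000, 120, 300))
```

A genuine reboot that happens to land within the tolerance of the wrap point would be misread as a rollover, so the tolerance is a trade-off rather than a fix.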
On Sat, Mar 9, 2013 at 1:43 AM, Mike Stupalov landy2005@gmail.com wrote:
No, snmpEngineTime.0 (68 year rollover) is used here, but it shows incorrect information too. This is probably a bug in the old IOS version.
Do other devices (excluding the 3500) show the uptime correctly?
They forgot to make it 64 bit apparently.. doh. :-)
On 8/03/2013 23:07, Mike Stupalov wrote:
CSCeh49492 - snmpEnginetime gets reset when sysUptime counter rolls over.
participants (4)
- Mike Richardson
- Mike Stupalov
- Nikolay Shopik
- Tom Laermans