Some devices take a very long time to complete the /opt/observium/poller.php process
Hello,
Some devices take a very long time to complete the /opt/observium/poller.php process.
For example, I took one of them and ran:
/opt/observium/poller-wrapper.py --host hk2net01a -d
/opt/observium/poller-wrapper.py:568: Warning: Data truncated for column 'process_name' at row 1
cursor.execute(p_query, (pid,ppid,processname,uid,command,s_time))
INFO: starting the poller at 2017/11/02 17:10:50 with 16 threads
WARNING: DEBUG enabled, each device poller store output to /tmp/observium_poller_id.debug (where id is device_id)
/usr/bin/env php /opt/observium/poller.php -d -h 320 >> /tmp/observium_poller_320.debug 2>&1
INFO: starting alerter.php for 320
INFO: finished alerter.php for 320
WARNING: worker Thread-1 finished device 320 in 405 seconds
INFO: poller-wrapper.py poller --host hk2net01a processed 1 devices in 405.53 seconds with 16 threads, load average (5min) 1.67
WARNING: the process took more than 5 minutes to finish, you need faster hardware or more threads
INFO: in sequential style processing the elapsed time would have been: 405 seconds
WARNING: device 320 is taking too long: 405 seconds
ERROR: Some devices are taking more than 300 seconds, the script cannot recommend you what to do.
Number of rows updated: 1
DEBUG: /usr/bin/rrdtool update /opt/observium/rrd/poller-wrapper.rrd N:1:405.531:16
DEBUG: /usr/bin/rrdtool update /opt/observium/rrd/poller-wrapper_count.rrd N:1
The debug file shows that it freezes for about 4 minutes on this line:
"SQL RUNTIME[0.00020313s]
o Caching Oids Used full table ifEntry/ifXEntry snmpwalk.
ifEntry
CMD[/usr/bin/snmpbulkwalk -v2c -c *** -Pu -OQUs -m IF-MIB -M /opt/observium/mibs/rfc:/opt/observium/mibs/net-snmp 'udp':'hk2net01a':'161' ifEntry]
...................."
and after almost 5 minutes it resumes. What could be the reason?
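For reference, the slow step can be timed outside the poller with something like the line below (a rough sketch: the community string is masked here and needs to be filled in, the rest is copied from the poller's own command):

time snmpbulkwalk -v2c -c COMMUNITY -Pu -OQUs -m IF-MIB -M /opt/observium/mibs/rfc:/opt/observium/mibs/net-snmp udp:hk2net01a:161 ifEntry | wc -l

That shows both how long the device takes to return the whole table and roughly how many values it returns.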
Thanks.
It means your device is slow and taking too long to return the requested data.
You've not actually included any useful information, though. What is the device?
adam.
Sorry, it's the Cisco Nexus 6000 Switch.
Maybe it's due to the distance to the device? (The round trip from the server to the device is about 275 ms.)
Yeah, that'd cause it.
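As a rough illustration (the interface count here is just an assumed example, not the actual number on your switch): ifEntry has 22 columns, so a box exposing around 1,000 interfaces returns on the order of 20,000 values in that one walk. At the default max-rep of 20 that is roughly 1,000 request/response exchanges, and at 275 ms per round trip that alone is 4-5 minutes. At max-rep 500 the same walk needs about 40 exchanges, i.e. seconds, plus whatever time the switch itself needs to build the replies.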
You can test what max-rep value this device can handle and then set that in its device settings.
Cisco stuff usually handles 1000 pretty well (the default is 20!), but of course NX-OS is trash.
Adam.
Sent from BlueMail
That's perfectly 'normal' for these devices: they create virtual interfaces for each vNIC/vFC in every blade/server in the fabric, so the number of interfaces to poll adds up very quickly.
We see the same on the 5k platform when you have a few FEXes attached; both of them average a little under a five-minute poll time.
Also, having 3500 VLANs (and thus thousands of MACs) on them, as we do, does not help.
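If you want to see how big the table actually is on your box, something like this gives a rough count (a sketch: the community string is a placeholder, the hostname is taken from the earlier output):

snmpbulkwalk -v2c -c COMMUNITY hk2net01a IF-MIB::ifDescr | wc -l

Every one of those interfaces gets walked across ifEntry/ifXEntry on each poll.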
(touchpad drawings ftw)
Kind regards
What version of NX-OS are you running? 5.x and below suck at SNMP, 6.x is OK, but 7.x actually works. /Markus
""'works"""
Adam.
Sent from BlueMail
Unfortunately the UCS builds (so the 6k platform) use a ghetto 5.0 release (yes, to this day, on their latest build of their latest platform).
Our 5k's are on 6 and our 7k's are on 7 (and yes, we see a significant improvement on the 7).
Kind regards
Thank you, guys, for the information.
What is the command (or syntax) to see that kind of number? I didn't find anything on NX-OS when it comes to SNMP processing. The NX-OS version is: System version: 7.1(0)N1(1a)
Your max-rep is set with -Cr on the command line; you can try various values and then set the one that works in the Observium settings for the device.
time snmpbulkwalk -v2c -c <community> -Cr<value> <hostname> ifEntry
You want it to not fail and to take the shortest amount of time. The default max-rep without a setting is 20; you'll probably get good results at 200 or 300 or so, if the device's SNMP stack isn't horrible.
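For example, a quick loop like this compares a few values (a sketch: <community> and <hostname> are placeholders as above, and the walk output is thrown away so only the timing is printed):

for rep in 20 100 200 300 500 1000; do
  echo "max-rep $rep:"
  time snmpbulkwalk -v2c -c <community> -Cr$rep <hostname> ifEntry > /dev/null
done

Take the largest value that still completes cleanly and set that for the device.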
adam.
participants (5)
- Adam Armstrong
- Adam Armstrong
- Edvinas K
- Markus Klock
- Stef Renders