Polling Taking Longer than 5 minutes for a few devices

Chip Pleasants

3 Mar 2014

9:25 p.m.

I have a few devices that take longer than 5 minutes to poll. In the polling information Observium provides, Last Polled reports 999.99s for two of the three and 975.37s for the third. I believe the breaks I see in the graphs for only these three devices are the result of polling taking too long.

These boxes are fully populated 6509s with about 150 VLAN interfaces each. I have currently set poller-wrapper.py to 10 and $config['snmp']['max-rep'] = 10. The 4-processor VM averages about 60% CPU utilization constantly, and the server's disk I/O averages about 110 ops/sec. Today I only have 52 of 1300 devices loaded into it.

Any suggestions to speed up polling on these few boxes, or is the only solution to add another polling server and split the devices odd/even or something of that nature? Maybe there is another issue I'm not seeing? Thanks in advance for any assistance/direction.

-Chip
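
To tell whether a device itself is slow to answer, independent of Observium, a manual walk can be timed from the poller host. The sketch below is an illustration only: it assumes Net-SNMP's snmpbulkwalk is installed, uses placeholder host and community values, and assumes that Observium's max-rep setting corresponds to Net-SNMP's -Cr (max-repetitions) option. The helper names are hypothetical.

```python
import subprocess
import time

# IF-MIB ifTable; any other subtree can be substituted.
IF_TABLE_OID = "1.3.6.1.2.1.2.2"


def build_bulkwalk_cmd(host, community, oid=IF_TABLE_OID, max_rep=10):
    """Build a Net-SNMP snmpbulkwalk command line for SNMP v2c."""
    return [
        "snmpbulkwalk",
        "-v2c",
        "-c", community,
        f"-Cr{max_rep}",  # max-repetitions per GETBULK request
        host,
        oid,
    ]


def time_command(cmd):
    """Run cmd, discard its output, and return elapsed wall-clock seconds."""
    start = time.monotonic()
    subprocess.run(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return time.monotonic() - start


# Example (placeholder host and community, not taken from the thread):
#   for rep in (10, 25, 50):
#       cmd = build_bulkwalk_cmd("switch.example.net", "public", max_rep=rep)
#       print(rep, round(time_command(cmd), 1), "s")
```

Comparing the elapsed time for a few max-repetitions values against the 300 s polling interval gives a rough idea of whether tuning max-rep can help at all on a given box.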

Nathan Phelan

3 Mar

10:57 p.m.

Hi Chip, We noticed with some devices that polling via SNMP v1 can take dramatically longer than polling via SNMP v2/3 - I'm not sure if this helps in your scenario. Cheers, Nathan

Nikolay Shopik

11:43 p.m.

IIRC the 6500 series is really slow to populate its ARP/FDB tables via SNMP, so you may want to disable these modules to speed up polling.

observium mailing list observium@observium.org http://postman.memetic.org/cgi-bin/mailman/listinfo/observium

Chip Pleasants

4 Mar

12:41 a.m.

I confirmed I'm only polling using v2c. Can I disable features per device? I'm guessing it would be under Device Details, in Poller Modules (fdb-table and arp-table)?

-Chip

Adam Armstrong

12:43 a.m.

On 2014-03-03 17:41, Chip Pleasants wrote:

I confirmed I'm only polling using v2c. Can I disable features per device? I'm guessing it would be under Device Details, in Poller Modules (fdb-table and arp-table)?

Yes. On 6500s you probably want to disable these two modules: the switch seems to query its linecards to collect that data, which isn't fast.

adam.

Tom Laermans

12:53 a.m.

You can see which modules take how long in the device's time reports; compare notes to Nikolay's email and go from there :-)

Chip Pleasants

2:05 a.m.

Thanks for the input! I didn't even know the per-module time reports were there! The downside is that the ports module took 67% (581.9524s) and the fdb-table module took 14% (122.0170s) to complete. It doesn't look like disabling the fdb-table module will do much good if the ports module is taking almost 600s. These are almost fully populated 6509s, but should they really be taking 10 minutes to poll all the interfaces? Manually polling with snmpbulkwalk shows the slow response from these devices. The odd part is that CPU isn't high on the RP, and neither the SNMP Engine nor the IP SNMP process shows high CPU. They are running pretty new code, 122-33.SXJ6.

-Chip
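
The arithmetic behind those figures can be sketched as follows. This is an illustration only: it takes the two module timings and percentages quoted above, assumes the percentages are shares of a single poll run, and treats 300 s as the 5-minute polling interval. The function names are hypothetical and are not part of Observium.

```python
# 5-minute polling interval, in seconds.
POLL_INTERVAL_S = 300.0

# Module name -> (seconds taken, share of the whole poll run),
# using the figures quoted in the message above.
MODULE_TIMES = {
    "ports": (581.9524, 0.67),
    "fdb-table": (122.0170, 0.14),
}


def estimate_total(seconds, share):
    """Estimate the whole poll run from one module's time and its share."""
    return seconds / share


def remaining_after_disabling(total, disabled):
    """Subtract the time of each disabled module from the total run time."""
    return total - sum(MODULE_TIMES[name][0] for name in disabled)


total = estimate_total(*MODULE_TIMES["ports"])
remaining = remaining_after_disabling(total, ["fdb-table"])

print(f"estimated total poll time: {total:.1f} s")
print(f"without fdb-table:         {remaining:.1f} s")
print(f"fits in one interval:      {remaining <= POLL_INTERVAL_S}")
```

The estimate is simple subtraction and cannot predict how the device responds once a module is actually disabled.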

Chip Pleasants

6 Mar

3:33 p.m.

Turns out disabling the fdb-table module did the trick on these specific devices. It looks like fdb-table uses BRIDGE-MIB to gather its information? Is BRIDGE-MIB used for anything else in Observium? Does BRIDGE-MIB correspond to a "show mac address-table" on the boxes? I'd like to know which specific OIDs it uses in BRIDGE-MIB, because I have another application gathering MAC forwarding tables and I'm wondering whether it is causing the same issue. Also, could someone let me know what functionality in Observium I'm losing by disabling fdb-table polling? I'm assuming I won't be able to see/search for MACs under the Ports/ARP/NDT Tables? Thank you in advance for any feedback and for your time answering my barrage of questions.

Thanks, Chip
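
For reference, the forwarding database in the standard BRIDGE-MIB is dot1dTpFdbTable (1.3.6.1.2.1.17.4.3), with columns dot1dTpFdbAddress (.1.1), dot1dTpFdbPort (.1.2) and dot1dTpFdbStatus (.1.3); each row is indexed by the six octets of the MAC address. Whether Observium's fdb-table module walks exactly these columns is not confirmed in this thread, so treat that as an assumption. A minimal sketch of decoding one walked OID into a MAC address (the function name and the sample OID are made up for illustration):

```python
# dot1dTpFdbPort column of BRIDGE-MIB's dot1dTpFdbTable.
DOT1D_TP_FDB_PORT = "1.3.6.1.2.1.17.4.3.1.2"


def fdb_mac_from_oid(oid):
    """Return the MAC address encoded in the index of a dot1dTpFdbPort OID."""
    oid = oid.lstrip(".")
    prefix = DOT1D_TP_FDB_PORT + "."
    if not oid.startswith(prefix):
        raise ValueError(f"not a dot1dTpFdbPort OID: {oid}")

    # The row index is the MAC address, one decimal sub-identifier per octet.
    parts = oid[len(prefix):].split(".")
    if len(parts) != 6:
        raise ValueError(f"expected a 6-octet index, got {len(parts)} parts")

    octets = [int(part) for part in parts]
    if any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"index octet out of range in: {oid}")

    return ":".join(f"{octet:02x}" for octet in octets)


# Sample OID invented for this example.
print(fdb_mac_from_oid(".1.3.6.1.2.1.17.4.3.1.2.0.28.115.10.20.30"))
```

Walking this subtree with another tool and timing it is one way to check whether that tool puts the same load on the switch.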
