Charlie-
After reading the ports poller module in Observium, I've found it to be very inefficient. I don't need to do any tcpdump of Cacti's poller to see what the problem is, although I could certainly provide it if you were truly itching for it. Instead, I'll describe what the issue with the Observium poller is below.
I tried explaining the issue via Twitter to @observium, but the individual behind the account insisted that I hadn't actually read the code and that Observium was more advanced than Cacti.
Regardless, here's why it takes so long:
Observium walks the entire ifEntry and ifXEntry subtrees (both part of IF-MIB) on every device. This fetches 11 OIDs that don't need to be polled.
They don't need to be polled because those 11 OIDs are duplicates, either within the same table or between the two tables. The only difference is that one is a 32-bit counter and the other is a 64-bit counter.
This is an expensive operation: in my tests, polling the 11 OIDs in question took about 2 minutes on average using snmpbulkwalk. My stack has grown a bit since those tests, and total polling time now exceeds five minutes.
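To make the duplication concrete, here is a minimal Python sketch of the filtering idea. It is not Observium's code. The mapping lists the 32-bit IF-MIB counters that have 64-bit (HC) counterparts in ifXTable; it is not necessarily the exact set of 11 OIDs referred to above, and the function name `drop_32bit_duplicates` is made up for illustration.

```python
# 32-bit IF-MIB counters and their 64-bit (HC) counterparts in ifXTable.
# Illustrative only: not necessarily the exact 11 OIDs discussed above.
HC_EQUIVALENT = {
    "ifInOctets": "ifHCInOctets",
    "ifInUcastPkts": "ifHCInUcastPkts",
    "ifInMulticastPkts": "ifHCInMulticastPkts",
    "ifInBroadcastPkts": "ifHCInBroadcastPkts",
    "ifOutOctets": "ifHCOutOctets",
    "ifOutUcastPkts": "ifHCOutUcastPkts",
    "ifOutMulticastPkts": "ifHCOutMulticastPkts",
    "ifOutBroadcastPkts": "ifHCOutBroadcastPkts",
}


def drop_32bit_duplicates(columns):
    """Drop every 32-bit counter whose 64-bit twin is also in the list.

    Columns without an HC twin, or whose twin is not being polled,
    are kept so that no data is lost.
    """
    wanted = set(columns)
    return [c for c in columns if HC_EQUIVALENT.get(c) not in wanted]


if __name__ == "__main__":
    cols = ["ifDescr", "ifInOctets", "ifHCInOctets", "ifOperStatus"]
    print(drop_32bit_duplicates(cols))
```

Polling the resulting column list one OID at a time, rather than walking both tables whole, is what avoids the redundant fetches.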
I also suggested that the OIDs be polled in parallel, but the individual behind @observium seemed to feel that polling a device for multiple OIDs in parallel is a bad idea. To be honest, I don't know of any basis for that. Unless you have a very bad network (over-utilized, underpowered), polling a single device in parallel via SNMP does not hurt the device or your network in any way. It doesn't hurt your server, either, unless you're trying to use old hardware, which Observium doesn't seem to work well with anyway. This is, of course, based on my own experience: previously at a national ISP operating across 46 U.S. states in both ILEC and CLEC markets, and now at my current employer, where many devices are polled simultaneously by many different diagnostic utilities for overall health management.
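As a rough illustration of what I mean by parallel polling, the Python sketch below walks each OID in its own thread. It assumes net-snmp's `snmpbulkwalk` is on the PATH and that the device speaks SNMP v2c. The function names, the hostname, the community string, and the worker count are placeholders of mine; none of this is Observium's or Cacti's actual code.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def bulkwalk(host, community, oid):
    """Walk one OID subtree with snmpbulkwalk and return its output lines."""
    cmd = ["snmpbulkwalk", "-v2c", "-c", community, host, oid]
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return done.stdout.splitlines()


def poll_parallel(host, community, oids, walk=bulkwalk, workers=4):
    """Walk every OID concurrently and return {oid: output lines}.

    `walk` is injectable so the function can be exercised without a device.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {oid: pool.submit(walk, host, community, oid) for oid in oids}
        # result() re-raises any exception from the worker thread.
        return {oid: future.result() for oid, future in futures.items()}


if __name__ == "__main__":
    # Hypothetical device and community string.
    data = poll_parallel("switch1.example.net", "public",
                         ["ifHCInOctets", "ifHCOutOctets"])
    for oid, lines in data.items():
        print(oid, len(lines))
```

Threads are enough here because each worker spends its time waiting on the network rather than on the CPU.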
If you'd like to see my test results, feel free. My system and network performance results are here: http://sprunge.us/jNBQ. They show sustained bandwidth capacity of ~800 Mbps, no packet loss, excellent cross-country latency, and almost no CPU, memory, or I/O load.
The results of the actual SNMP tests are here: http://sprunge.us/RjKT.
I will be testing the removal of the duplicate OIDs and polling only the 64-bit OIDs as I have no equipment that is so old that it does not support them. I suspect that this will significantly reduce poll time, at least to moderately acceptable levels.
On Oct 26, 2013, at 8:58 PM, Charlie Allom <charlie@evilforbeginners.com> wrote:
On Sun, Oct 13, 2013 at 06:20:50PM -0400, Tyler Christiansen <tylerc@beatsmusic.com> wrote:
Testing Observium, I've noticed that polling an EX4200/4550 with Observium takes about 200 seconds. There are around 300 ports, and I've noticed that
Hi Tyler, I've just added about 4000 ports in EX stacks and I can corroborate your findings.
With a full 7-member stack of 48-port switches, polling takes around 1500-2000 seconds.
I don't have Cacti installed. Can you take a tcpdump or debug output and share your findings on how it polls every port? We can then compare it to ./poller.php -d -r -h $foo
C.
0x8486EDA8 http://spodder.com/