Hi Milton

We use it and are quite happy with it, both for workload distribution and for security purposes (the distributed pollers are placed in firewalled networks but are allowed to report back to the motherbrain).

How many devices per poller depends on a lot of things and you will have to experiment. We have around 200 per poller and that keeps us in the 3-4 min range for a poller cycle.

Lars

From: observium <observium-bounces@observium.org> On Behalf Of Milton Ngan via observium
Sent: 18 June 2022 02:45
To: Observium <observium@observium.org>
Cc: Milton Ngan <milton@valvesoftware.com>
Subject: [Observium] Distributed Polling

Are many people using Distributed Polling? We are just playing around with it now.

How many devices can you typically poll per thread in under 5 minutes? Some of this is also a function of how much latency you have between your pollers and your devices and how "slow" your devices are. But we are having to throw a lot of threads (>200) at this to ensure our 1200+ devices get polled in under 5 minutes.

The RRD and DB backends are not the bottleneck; it is purely the latency of the SNMP polling we need to overcome. The slowest devices take between 150 and 200 s each, depending on what else is hitting them at the time (e.g. discovery + poller). So we need a fair number of threads just to get these out of the way.

I was wondering if sorting the most expensive devices first could help avoid getting two slow devices back to back.
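
As a rough illustration of the sorting idea, here is a small Python simulation. It is only a sketch, not how Observium schedules work: the `makespan` helper and the per-device durations are made up for the example, and it assumes the previous cycle's poll time for each device is available as an estimate.

```python
import heapq

def makespan(durations, threads):
    """Simulate `threads` workers taking devices in the given order.

    Each device goes to whichever worker becomes free first.
    Returns the wall-clock time at which the last device finishes.
    """
    free_at = [0.0] * threads      # min-heap: when each worker is next free
    heapq.heapify(free_at)
    finish = 0.0
    for d in durations:
        start = heapq.heappop(free_at)   # earliest available worker
        end = start + d
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish

# Hypothetical poll times in seconds: eight fast devices and one slow one.
durations = [10] * 8 + [200]

arrival_order = makespan(durations, threads=2)
slowest_first = makespan(sorted(durations, reverse=True), threads=2)
```

With these made-up numbers the slow device starts last in arrival order and pushes the cycle to 240 s, while starting it first lets the fast devices run alongside it and the cycle ends at 200 s.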

Also, it would be nice to have the system use a queue to schedule the work rather than a hard partition. This avoids the problem of losing 1/Nth of your devices when you lose a poller. With a queue you would just lose 1/Nth of the threads. If you had enough spare capacity you could ride through the downtime of a single poller.
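
To make the difference concrete, here is a toy Python comparison of the two approaches when one of four pollers is down. It is a sketch under assumptions: the function names, the modulo assignment, and the round-robin draining of the queue are illustrative only and do not describe Observium's actual partitioning.

```python
from collections import deque

def hard_partition(devices, pollers, alive):
    """Pin device i to poller i % pollers; devices on dead pollers are not polled."""
    return [d for i, d in enumerate(devices) if (i % pollers) in alive]

def shared_queue(devices, alive):
    """Surviving pollers take turns pulling from one common queue.

    Round-robin is a simplification; real workers would pull whenever
    they finish their current device. Returns {poller: [devices]}.
    """
    queue = deque(devices)
    work = {p: [] for p in sorted(alive)}
    while queue and work:
        for p in work:
            if not queue:
                break
            work[p].append(queue.popleft())
    return work

devices = [f"dev{i}" for i in range(12)]   # hypothetical device names
alive = {0, 1, 3}                          # poller 2 of 4 is down

polled_partition = hard_partition(devices, 4, alive)
polled_queue = shared_queue(devices, alive)
```

In this example the hard partition polls only 9 of the 12 devices, while the shared queue still polls all 12, with each surviving poller taking 4 instead of 3.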

Cheers

Milton