Hi,

One solution for a large network like yours could be a distributed architecture.

Here is some feedback from our own usage:

The current architecture in our organisation graphs 2000 devices and 70k ports, and the target is 6000 devices.

We have built a farm of Observium pollers (6 physical servers), plus 1 more server for MySQL (and Apache for the GUI).

Our 6 poller servers are 70% CPU idle and the disks are OK for now. We are progressively adding new devices.

We made a small patch to the MySQL schema to add a "poller_id" column to the devices table.
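
For illustration, a schema change of this kind could look roughly like the sketch below. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL so that it runs anywhere. The simplified devices table, the INTEGER column type and the default of 0 are assumptions for the example, not the exact patch.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database (assumption for the demo).
conn = sqlite3.connect(":memory:")

# Heavily simplified stand-in for the Observium devices table.
conn.execute("CREATE TABLE devices (device_id INTEGER PRIMARY KEY, hostname TEXT)")

# The patch itself: one extra column recording which poller owns the device.
# The INTEGER type and the default of 0 are hypothetical choices.
conn.execute("ALTER TABLE devices ADD COLUMN poller_id INTEGER NOT NULL DEFAULT 0")

# List the column names to confirm that the new column exists.
columns = [row[1] for row in conn.execute("PRAGMA table_info(devices)")]
print(columns)
```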

Each time a new device is added, the "add_device.php" script chooses the least-used poller (based on the number of devices per poller server).
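
As a rough sketch of that selection logic (written in Python for brevity, not the actual add_device.php code), picking the least-used poller comes down to taking the minimum over per-poller device counts. The function name, the tie-breaking rule and the counts below are hypothetical.

```python
def choose_poller(device_counts):
    """Return the poller_id with the fewest devices.

    device_counts maps poller_id -> number of devices already assigned.
    Ties are broken in favour of the lowest poller_id (an assumption).
    """
    if not device_counts:
        raise ValueError("no pollers configured")
    # Iterating over sorted ids makes min() return the lowest id on a tie.
    return min(sorted(device_counts), key=lambda pid: device_counts[pid])


# Hypothetical load across six pollers.
counts = {1: 340, 2: 335, 3: 338, 4: 335, 5: 341, 6: 339}

new_poller = choose_poller(counts)
counts[new_poller] += 1  # the new device now counts against that poller
print(new_poller)
```

Counting devices is a coarse measure of load: a poller holding a few very large switches may be busier than one holding many small devices, so a count of ports could be substituted as the key if that matters.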

Each poller asks the MySQL server for the devices that have its poller_id.
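
The per-poller lookup is just a filter on that column. A minimal sketch, again using an in-memory sqlite3 database in place of MySQL, with made-up hostnames and a hypothetical MY_POLLER_ID setting:

```python
import sqlite3

MY_POLLER_ID = 3  # hypothetical id configured on this poller server

# In-memory SQLite stands in for the central MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE devices ("
    "device_id INTEGER PRIMARY KEY, hostname TEXT, poller_id INTEGER)"
)
conn.executemany(
    "INSERT INTO devices (hostname, poller_id) VALUES (?, ?)",
    [("sw-a", 1), ("sw-b", 3), ("rtr-c", 3), ("sw-d", 2)],  # made-up devices
)


def devices_for_poller(conn, poller_id):
    """Return the hostnames assigned to the given poller, in insertion order."""
    rows = conn.execute(
        "SELECT hostname FROM devices WHERE poller_id = ? ORDER BY device_id",
        (poller_id,),
    )
    return [hostname for (hostname,) in rows]


my_devices = devices_for_poller(conn, MY_POLLER_ID)
print(my_devices)
```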

The GUI server can read all the RRD files stored on the pollers via NFS.

The hardware used for each poller is an HP DL360 Gen7 / 4 GB RAM / 6 SAS 10k disks in RAID 1+0 / 2 x E5620 2.40 GHz (4 cores each).

Here is a diagram: http://imagizer.imageshack.us/a/img538/3225/iMXiYb.png

By the way, would it be a good idea to add this distributed feature for monitoring large platforms (adding the poller_id field to the devices table)?

Cheers,

Antoine Desir