Hi,
One solution for a large network like yours could be a distributed architecture.
Here is some feedback from our own usage:
The current architecture in our organisation graphs 2000 devices and 70k ports, and the target is 6000 devices.
We have built a farm of Observium pollers (6 physical servers), plus one more server for MySQL (and Apache for the GUI).
Our 6 poller servers are 70% CPU idle and the disks are OK for now. We are progressively adding new devices.
We made a small patch to the MySQL schema to add a "poller_id" column to the devices table.
Each time a new device is added, the "add_device.php" script chooses the least-used poller (based on the number of devices per poller server).
Each poller asks the SQL server only for the devices that carry its own poller_id.
The GUI server can read, via NFS, all the RRD files stored on the pollers.
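
To make the assignment logic concrete, here is a rough sketch of the idea in Python. It is not our actual patch (that lives in the PHP script and in MySQL); it uses an in-memory SQLite database as a stand-in, and the column names other than poller_id, the function names and the list of six poller ids are all assumptions made for the illustration.

```python
import sqlite3

# Assumption: six pollers, numbered 1..6.
POLLER_IDS = [1, 2, 3, 4, 5, 6]


def make_db():
    # In-memory SQLite stands in for the central MySQL server.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE devices ("
        " device_id INTEGER PRIMARY KEY,"
        " hostname TEXT NOT NULL,"
        " poller_id INTEGER NOT NULL)"  # the extra column added by the patch
    )
    return db


def least_used_poller(db):
    # Count devices per poller, including pollers that have none yet.
    counts = {pid: 0 for pid in POLLER_IDS}
    for pid, n in db.execute(
        "SELECT poller_id, COUNT(*) FROM devices GROUP BY poller_id"
    ):
        counts[pid] = n
    # Pick the poller with the fewest devices; ties go to the lowest id.
    return min(POLLER_IDS, key=lambda pid: (counts[pid], pid))


def add_device(db, hostname):
    # Hypothetical equivalent of the selection step in add_device.php.
    pid = least_used_poller(db)
    db.execute(
        "INSERT INTO devices (hostname, poller_id) VALUES (?, ?)",
        (hostname, pid),
    )
    return pid


def devices_for_poller(db, poller_id):
    # What each poller would ask the SQL server: only its own devices.
    rows = db.execute(
        "SELECT hostname FROM devices WHERE poller_id = ? ORDER BY device_id",
        (poller_id,),
    )
    return [hostname for (hostname,) in rows]
```

With an empty table, successive calls to add_device() spread new devices across pollers 1 to 6 in turn, and devices_for_poller(db, 1) returns only the hostnames assigned to poller 1. Balancing on device count alone is simple; it ignores the number of ports per device, which may or may not matter in your case.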
The hardware used for each poller is an HP DL360 Gen7 / 4 GB RAM / 6 SAS 10k disks in RAID 1+0 / 2 x E5620 2.40 GHz (4 cores each).
Here is a diagram: http://imagizer.imageshack.us/a/img538/3225/iMXiYb.png
By the way, would it be a good idea to add this distributed feature for monitoring large platforms (by adding the poller_id field to the devices table)?
Cheers,
Antoine Desir