Don't use a SAN. Observium is close to a worst-case workload for a SAN: it makes lots of tiny writes all over the disk, and it will eat up your SAN's performance far more quickly than the SAN's sticker price might suggest. You're far better off with a few SSDs, or even a RAM disk if the RRDs will fit in it.
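
To put a rough number on "lots of tiny writes", here is a back-of-the-envelope sketch. It assumes every RRD file is updated once per polling interval, that the interval is 300 seconds, and that each port has at least one RRD; `rrds_per_port` is a made-up knob for illustration, not an Observium setting. The port count is the one quoted further down the thread.

```python
def rrd_updates_per_second(ports: int, rrds_per_port: float = 1.0,
                           poll_interval_s: int = 300) -> float:
    """Average RRD updates per second, assuming each RRD is written once per poll."""
    if poll_interval_s <= 0:
        raise ValueError("poll_interval_s must be positive")
    return ports * rrds_per_port / poll_interval_s


if __name__ == "__main__":
    # 100,000 ports comes from the thread; rrds_per_port=1.0 is a guess
    # and ignores sensors, processors, storage and other non-port RRDs.
    rate = rrd_updates_per_second(ports=100_000, rrds_per_port=1.0)
    print(f"roughly {rate:.0f} small scattered writes per second")
```

Treat the result as a floor rather than a prediction: the real figure depends on how many RRDs each device actually produces.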

The ports page doesn't use as much RAM as it once did, so that requirement isn't there anymore. Mostly what you need is enough I/O throughput and CPU throughput to run enough parallel poller threads to get through all of your devices quickly enough.

I would aim to run without rrdcached, and only look at using it if you need to. It adds CPU overhead and latency to the equation, which is not usually what you want.

One of the major problems of modern servers, IMO, is that single-core clock speeds are relatively slow. For web UI performance, you want the fastest single-core speed you can get. For poller performance, you want as many cores as you can efficiently spread your poller load over. 4,000 devices might require more than 12 cores, especially if they're only 2GHz cores.
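
The poller side of that trade-off can be sketched as simple arithmetic. This is only an estimate under stated assumptions: a 300-second polling interval, a single average per-device poll time, and perfectly even distribution of devices across threads. The values for `avg_poll_s` and `threads_per_core` below are invented for illustration; measure your own.

```python
import math


def poller_threads_needed(devices: int, avg_poll_s: float,
                          poll_interval_s: int = 300) -> int:
    """Fewest parallel threads that can poll every device within one interval."""
    if devices < 0 or avg_poll_s < 0 or poll_interval_s <= 0:
        raise ValueError("inputs must be non-negative and the interval positive")
    return math.ceil(devices * avg_poll_s / poll_interval_s)


def cores_needed(threads: int, threads_per_core: int) -> int:
    """Cores required if each core can carry threads_per_core poller threads."""
    if threads_per_core <= 0:
        raise ValueError("threads_per_core must be positive")
    return math.ceil(threads / threads_per_core)


if __name__ == "__main__":
    # 4,000 devices is from the thread; 10 s per device and
    # 8 threads per core are hypothetical inputs.
    threads = poller_threads_needed(devices=4000, avg_poll_s=10)
    print(threads, cores_needed(threads, threads_per_core=8))
```

With those made-up inputs it comes out at 134 threads and 17 cores. A handful of slow devices will skew the average badly, so leave headroom.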

Don't try to run Observium on a VM. The VM I/O overhead is a pain, and you'll ruin the host system for any other application. You want a high-core, high-memory, high-I/O dedicated server.

Something like:

http://www.ebay.co.uk/itm/Refurbished-HP-ProLiant-DL585-G2-Web-Server-4-x-Quad-Core-16-Core-128GB-RAM-/121427534352

Put a couple of SSDs in that and it /should/ suffice. Though you might want faster cores, and you might want 256GB of RAM so you can keep the RRDs in RAM.
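
Whether the RRDs would actually fit in RAM is easy to estimate once you know how many files you have and how big they are on average. A minimal sketch follows; the 500 KiB average file size and the 32 GiB reserved for everything else are placeholders, not measured Observium figures.

```python
def rrd_footprint_gib(rrd_files: int, avg_rrd_kib: float) -> float:
    """Total size of all RRD files in GiB, given an average file size in KiB."""
    return rrd_files * avg_rrd_kib / (1024 * 1024)


def fits_in_ram(rrd_files: int, avg_rrd_kib: float,
                ram_gib: float, reserve_gib: float = 0.0) -> bool:
    """True if the RRDs fit in RAM after setting aside reserve_gib for other use."""
    return rrd_footprint_gib(rrd_files, avg_rrd_kib) <= ram_gib - reserve_gib


if __name__ == "__main__":
    # 100,000 files (one per port, from the thread) at an assumed 500 KiB each.
    size = rrd_footprint_gib(rrd_files=100_000, avg_rrd_kib=500)
    print(f"{size:.1f} GiB", fits_in_ram(100_000, 500, ram_gib=256, reserve_gib=32))
```

On an existing install, running du over the RRD directory gives the real total and is a better input than any guess.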

It's difficult to gauge performance requirements on that scale because it depends upon how the devices behave and what's monitor(able/ed) on them.

Oh, and split MySQL off onto a separate server with fewer, faster cores. It's not worth doing this with the web gui because of the latency involved in dealing with RRDs over the network, but it's definitely worth doing with MySQL.

------ Original Message ------

From: "Morten Guldager" <morten.guldager@gmail.com>

To: "Observium Network Observation System" <observium@observium.org>

Sent: 11/11/2014 2:11:05 PM

Subject: Re: [Observium] Performance

Yeah, I read that page too. But I'm uncertain how linearly Observium scales. 10 cores will be doable, but how much RAM will it take then? I guess I will have to use rrdcached to keep the disk I/O at a manageable level. The server guys will probably complain if I suck every available I/O operation out of their SAN.

/Morten

On Tue, Nov 11, 2014 at 4:22 PM, Spencer Gaw <spencerg@frii.net> wrote:

I'm not sure how current this information is but it may answer some of your questions: http://www.observium.org/wiki/Hardware_Scaling

Regards,

SG

On 11/11/2014 5:55 AM, Morten Guldager wrote:

Aloha!

I'm in the process of evaluating Observium for my organisation's needs. We have some instances running already, but my task is to do the evaluation in a more structured way. I have some questions, which I will put in separate posts to keep the threads clean.

Performance:

We are looking at a network currently consisting of 4,000 devices with close to 100,000 ports. These devices are well known to Observium, and a subset of them got auto-discovered just fine. But how about performance? Will it require vast amounts of computing power? Also, our network grows quite rapidly; it might be 50% bigger 12 months from now.

I found an old thread from July 2013 where Joe Hoh describes a complex multi-server setup to scale Observium to approximately 2-3 times my current needs. Will I have to go through similar "struggles" to get it working? Or has Observium changed enough that it scales differently than it did 1.5 years ago?

Pointers to information regarding scaling Observium are most welcome.