I haven't tested it myself, but this article suggests that tuning the kernel so RRD reads and writes stay in memory until flushed can improve performance:
http://code.google.com/p/epicnms/wiki/Scaling
It would be interesting to know which part of your existing system is the bottleneck. Perhaps moving MySQL to its own node would shift some of the load?
From: Paul Gear <observium@gear.dyndns.org>
Reply-To: Observium <observium@observium.org>
Date: Friday, 9 August 2013 10:27 AM
To: Observium <observium@observium.org>
Subject: [Observium] Performance optimisation (incl. RAM disk vs. SSD)
Hi all,
We've got some money in the budget to upgrade our struggling monitoring server, and I'm trying to optimise Observium. Here are our Observium installation stats:
          Total   Up    Down   Ignored   Disabled/Shutdown
Devices   153     138   2      1         12 disabled
Ports     8896    887   118    927       6837 shutdown
We're currently running 4 cores, 12 GB RAM, 6 x 15K RPM 3.5" drives in RAID 10. We have about 20 GB of RRDs. Poller-wrapper.py runs with 32 workers and regularly goes over the 500-second mark. The web interface is quite sluggish despite php-xcache being installed.
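As a rough sanity check on those poller numbers, here is a minimal Python sketch. It assumes the 138 "up" devices are the ones being polled and a 300-second polling interval (an assumption, not stated above), and it treats all 32 workers as busy for the whole run, so the per-device figure is only an upper bound on the average.

```python
WORKERS = 32          # poller-wrapper.py worker count, from the message
RUN_SECONDS = 500     # typical wall-clock time of one poller run, from the message
DEVICES_UP = 138      # devices currently up, from the stats table
POLL_INTERVAL = 300   # ASSUMPTION: 5-minute polling cycle

def poll_budget(workers, run_seconds, devices, interval):
    """Return (upper bound on average seconds per device, run length / interval)."""
    worker_seconds = workers * run_seconds   # total worker time available in one run
    avg_per_device = worker_seconds / devices
    overrun = run_seconds / interval         # > 1.0 means runs overlap the next cycle
    return avg_per_device, overrun

avg, overrun = poll_budget(WORKERS, RUN_SECONDS, DEVICES_UP, POLL_INTERVAL)
print(f"upper bound on average poll time per device: {avg:.0f} s")
print(f"run length relative to the polling interval: {overrun:.2f}x")
```

If the real per-device times are much lower than that bound, a handful of slow devices is probably dominating the run rather than disk I/O across the board.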
I'm wondering if anyone has tested which approach gives better RRD storage performance: RAM disk or SSD. RAM seems likely to offer better IOPS, but managing the RAM disk is obviously more overhead, and I'm concerned that syncing the RAM disk elsewhere will cause too much of a performance hit while it happens (especially during reboot cycles). RAM is $25 per GB, whereas SSD is $3 per GB (or $25 per GB if you use "write intensive" SSDs; presumably these are rated for a larger number of lifetime writes?).
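To put those prices next to the 20 GB of RRDs mentioned earlier, here is a small Python sketch of the arithmetic. The `headroom` multiplier for growth is a hypothetical parameter, not a figure from this message.

```python
# USD per GB as quoted in the message.
PRICE_PER_GB = {"RAM": 25, "SSD": 3, "write-intensive SSD": 25}
RRD_SIZE_GB = 20  # current RRD footprint, from the message

def storage_cost(size_gb, headroom=2.0):
    """Cost per medium for size_gb of RRDs; headroom is an ASSUMED growth multiplier."""
    return {medium: price * size_gb * headroom
            for medium, price in PRICE_PER_GB.items()}

for medium, cost in storage_cost(RRD_SIZE_GB).items():
    print(f"{medium}: ${cost:.0f}")
```

This only compares the cost of capacity; it says nothing about IOPS or the overhead of syncing a RAM disk.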
My guess at a config for the new box: 12 cores, 64 GB RAM, 4 x 10K RPM 2.5" drives in RAID 10 for OS, 2 x SSD for /opt/observium/rrd. Any other recommendations about which hardware to put our money towards? Do we need to consider other issues like which type of file system to use for /opt/observium/rrd?
Thanks in advance, Paul