Performance optimisation (incl. RAM disk vs. SSD)
![](https://secure.gravatar.com/avatar/1c685a39a957c5e4dd2544f4cdc48c02.jpg?s=120&d=mm&r=g)
Hi all,
We've got some money in the budget to upgrade our struggling monitoring server, and I'm trying to optimise Observium. Here are our Observium installation stats:
| | Total | Up | Down | Ignored | Disabled (devices) / Shutdown (ports) |
|---|---|---|---|---|---|
| Devices (http://observium.buq.org.au/devices/) | 153 | 138 | 2 | 1 | 12 |
| Ports (http://observium.buq.org.au/ports/) | 8896 | 887 | 118 | 927 | 6837 |
We're currently running 4 cores, 12 GB RAM, 6 x 15K RPM 3.5" drives in RAID 10. We have about 20 GB of RRDs. Poller-wrapper.py runs with 32 workers and regularly goes over the 500-second mark. The web interface is quite sluggish despite php-xcache being installed.
I'm wondering if anyone has tested which is the better approach for RRD storage performance: RAM disk or SSD. RAM seems likely to offer better IOPS, but managing a RAM disk is obviously more overhead, and I'm concerned that syncing the RAM disk to persistent storage will cause too much of a performance hit while it happens (especially during reboot cycles). RAM is $25 per GB, whereas SSD is $3 per GB (or $25 per GB for "write intensive" SSDs; presumably these are rated for a larger number of lifetime writes?).
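To put the prices above next to the 20 GB RRD set, here is a rough sketch of the arithmetic. The per-GB prices and the 20 GB figure are from this post; the 2x growth headroom and the mirrored SSD pair are assumptions made only for illustration.

```python
# Rough cost comparison for holding the RRD set on RAM vs. SSD.
# Prices and the 20 GB size come from the post; headroom and mirroring are assumptions.

RRD_SIZE_GB = 20

PRICE_PER_GB = {
    "RAM": 25.0,
    "SSD": 3.0,
    "write-intensive SSD": 25.0,
}


def storage_cost(size_gb, price_per_gb, headroom=1.0, copies=1):
    """Cost of storing size_gb, scaled by a growth headroom factor and a number of copies."""
    return size_gb * headroom * copies * price_per_gb


if __name__ == "__main__":
    for medium, price in PRICE_PER_GB.items():
        bare = storage_cost(RRD_SIZE_GB, price)
        # Hypothetical sizing: 2x headroom for growth, mirrored pair for the SSD options.
        copies = 1 if medium == "RAM" else 2
        padded = storage_cost(RRD_SIZE_GB, price, headroom=2.0, copies=copies)
        print(f"{medium:>20}: ${bare:8.2f} bare, ${padded:8.2f} with headroom")
```

At this data size the absolute difference is small either way, so the management overhead of a RAM disk probably matters more than the purchase price.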
My guess at a config for the new box: 12 cores, 64 GB RAM, 4 x 10K RPM 2.5" drives in RAID 10 for OS, 2 x SSD for /opt/observium/rrd. Any other recommendations about which hardware to put our money towards? Do we need to consider other issues like which type of file system to use for /opt/observium/rrd?
Thanks in advance, Paul
![](https://secure.gravatar.com/avatar/b3a546cd599e8024ed2790e548f4c63b.jpg?s=120&d=mm&r=g)
I haven't done any testing with it, but this article seems to imply some improvement from kernel adjustments that keep RRD reads and writes in memory until they are flushed.
http://code.google.com/p/epicnms/wiki/Scaling
It would be interesting to know which part of your existing system is the bottleneck. Perhaps moving MySQL to its own node would shift some of the load?
From: Paul Gear <observium@gear.dyndns.org>
Reply-To: Observium <observium@observium.org>
Date: Friday, 9 August 2013 10:27 AM
To: Observium <observium@observium.org>
Subject: [Observium] Performance optimisation (incl. RAM disk vs. SSD)
![](https://secure.gravatar.com/avatar/1c685a39a957c5e4dd2544f4cdc48c02.jpg?s=120&d=mm&r=g)
Hi Peter,
I'm reasonably confident that the poor performance isn't MySQL-related, but I could be wrong. I have slow query logging turned on, and the only place it seems to be an issue is in the billing code. I managed to improve this by adding indexes to bill_data(timestamp) and bill_data(bill_id). Strangely enough, the index already on bill_data(bill_id) as the primary key doesn't seem to be enough. I'm not sure why, but the slow query log stopped logging anything as soon as I added those indexes.
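For anyone wanting to see the effect of such indexes in isolation, here is a minimal sketch. It uses an in-memory SQLite database as a stand-in for MySQL, so the planner output will not match MySQL's `EXPLAIN`. Only the table name and the `bill_id` and `timestamp` columns come from the post; the `delta` column, the query, and the index names are hypothetical.

```python
import sqlite3

# Hypothetical billing-style query; only bill_id and timestamp are named in the post.
QUERY = "SELECT SUM(delta) FROM bill_data WHERE bill_id = ? AND timestamp >= ?"


def query_plan(conn, sql, params):
    """Return the planner's description of how it would run sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(str(row[-1]) for row in rows)


def build_demo():
    """Create an empty stand-in bill_data table without any indexes."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bill_data (bill_id INTEGER, timestamp TEXT, delta INTEGER)"
    )
    return conn


def add_indexes(conn):
    """Add one index per column, mirroring the two indexes described above."""
    conn.execute("CREATE INDEX idx_bill_data_timestamp ON bill_data (timestamp)")
    conn.execute("CREATE INDEX idx_bill_data_bill_id ON bill_data (bill_id)")


if __name__ == "__main__":
    conn = build_demo()
    params = (1, "2013-08-01 00:00:00")
    print("before:", query_plan(conn, QUERY, params))
    add_indexes(conn)
    print("after: ", query_plan(conn, QUERY, params))
```

On the real server, running `EXPLAIN` on the statements that show up in the slow query log would be the equivalent check.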
I like their idea in that link about reducing the write-back time of dirty buffers - that makes use of the RAM without any extra management overhead. I'll do a bit of playing around with that on my current system.
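As a starting point for that experiment, here is a small sketch that reads the current dirty-page write-back settings on a Linux box. The four `vm.dirty_*` sysctls are standard Linux knobs exposed under `/proc/sys/vm`; which values are sensible for an RRD workload is not something this sketch decides, and the linked article's own recommendations are not reproduced here.

```python
from pathlib import Path

# Linux VM sysctls that control how long dirty pages may stay in the page cache.
DIRTY_KNOBS = (
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_dirty_settings(base="/proc/sys/vm"):
    """Return {knob: int value} for each knob that is readable under base."""
    settings = {}
    for knob in DIRTY_KNOBS:
        path = Path(base) / knob
        try:
            settings[knob] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Not Linux, or the knob is missing or unparseable: skip it.
            continue
    return settings


def describe(settings):
    """Render the settings, converting centisecond values to seconds."""
    lines = []
    for knob, value in sorted(settings.items()):
        if knob.endswith("_centisecs"):
            lines.append(f"vm.{knob} = {value} ({value / 100:.1f} s)")
        else:
            lines.append(f"vm.{knob} = {value} (percent)")
    return lines


if __name__ == "__main__":
    for line in describe(read_dirty_settings()):
        print(line)
```

Recording these values before and after any change makes it easier to tell whether a shift in poller run time is actually related to the tuning.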
Any other hints gratefully accepted.
Regards, Paul
On 08/09/2013 11:08 AM, Peter Childs wrote:
observium mailing list observium@observium.org http://postman.memetic.org/cgi-bin/mailman/listinfo/observium
![](https://secure.gravatar.com/avatar/b9a662d07b64a1778d5d170d5cd2b36b.jpg?s=120&d=mm&r=g)
Hey,
We aimed high when we bought our machine for this purpose. We bought the following:
- 2x Intel Xeon E5-2620, 6-core, 2 GHz
- 4x Samsung 16 GB DDR3L ECC REG 1600 MHz
- 8x Intel SSD 520 120 GB SATA 6 Gb/s (RAID 6)
With 32 poller threads and about 300 monitored machines, the whole cron job finishes within 2 minutes.
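For a rough sense of scale, the sketch below compares the two poller runs mentioned in this thread as devices polled per second. The device counts and run times are from the posts; the 300-second polling interval is an assumption (the usual 5-minute cron schedule), and the comparison ignores how many ports each device has.

```python
# Assumed polling interval: a 5-minute cron schedule (not stated in the thread).
POLL_INTERVAL_SECONDS = 300

# (devices, approximate wall-clock seconds) as reported in the thread.
RUNS = {
    "4 cores, 15K RPM RAID 10": (153, 500),
    "2x Xeon E5-2620, SSD RAID 6": (300, 120),
}


def devices_per_second(devices, wall_seconds):
    """Average polling throughput over one run."""
    return devices / wall_seconds


def fits_interval(wall_seconds, interval_seconds=POLL_INTERVAL_SECONDS):
    """True if a run finishes before the next one is due to start."""
    return wall_seconds <= interval_seconds


if __name__ == "__main__":
    for label, (devices, seconds) in RUNS.items():
        rate = devices_per_second(devices, seconds)
        status = "fits" if fits_interval(seconds) else "overruns"
        print(f"{label}: {rate:.2f} devices/s, {status} the {POLL_INTERVAL_SECONDS} s interval")
```

The two installations differ in CPU, storage, and port counts, so this only shows the size of the gap, not which component is responsible for it.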
/Peter
![](https://secure.gravatar.com/avatar/defdef53b588cb6b5f6b09e33764723a.jpg?s=120&d=mm&r=g)
RAID SSD seems to work best (for us).
We have no issues with close to 100 devices and > 100k ports on a RAID 5 setup of SSDs (about 35 GB of RRDs).
We also use Job's poller-wrapper.py. This has massively improved performance. Also, keep in mind that little tricks such as 'noatime', the XFS filesystem, and mounting your SSD directly on /opt/observium/rrd instead of using a symlink will help as well.
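The three tricks above can be checked on a running box with something like the following sketch. It parses `/proc/mounts` (Linux only) and reports whether the RRD directory is a dedicated mount, uses XFS, and has `noatime` set. The function names are made up for this example, and mount points containing spaces (escaped in `/proc/mounts`) are not handled.

```python
import os

RRD_PATH = "/opt/observium/rrd"


def find_mount(path, mounts_text):
    """Return (mount_point, fs_type, options) for the deepest mount containing path."""
    best = None
    target = path.rstrip("/") + "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        mount_point, fs_type, options = fields[1], fields[2], fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        # Prefer the longest matching mount point; on a tie, the later entry wins.
        if target.startswith(prefix) and (best is None or len(mount_point) >= len(best[0])):
            best = (mount_point, fs_type, options)
    return best


def storage_findings(path, mounts_text, is_symlink):
    """List deviations from the suggested setup (dedicated XFS mount with noatime)."""
    findings = []
    if is_symlink:
        findings.append("path is a symlink rather than a direct mount")
    mount = find_mount(path, mounts_text)
    if mount is None:
        findings.append("no mount found for path")
        return findings
    mount_point, fs_type, options = mount
    if mount_point != path:
        findings.append(f"not a dedicated mount (lives on {mount_point})")
    if fs_type != "xfs":
        findings.append(f"filesystem is {fs_type}, not xfs")
    if "noatime" not in options:
        findings.append("noatime is not set")
    return findings


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as handle:
            mounts = handle.read()
    except OSError:
        mounts = ""
    issues = storage_findings(RRD_PATH, mounts, os.path.islink(RRD_PATH))
    print("\n".join(issues) if issues else "RRD storage matches the suggested setup")
```

An empty result only means the mount matches these three suggestions; it says nothing about whether the underlying disks are fast enough.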
A lot of cores won't really help you with your GUI.
Our server: dual quad-core 2.13 GHz, 6 GB RAM, 120 GB RAID 5 SSD setup (I believe Intel SSDs).
Maarten
From: Peter Persson <peter.persson@bredband2.se>
Reply-To: Observium Network Observation System <observium@observium.org>
Date: Friday, August 9, 2013 10:23 AM
To: Observium Network Observation System <observium@observium.org>
Subject: Re: [Observium] Performance optimisation (incl. RAM disk vs. SSD)
participants (4)
- Moerman, Maarten
- Paul Gear
- Peter Childs
- Peter Persson