SSDs are fairly resilient these days.
For the price of 96GB of RAM you could buy a lot of SSD at current prices. A RAID1 of two 512GB enterprise SSDs would be a good solution.
Using tmpfs was really a workaround for not being able to scale magnetic disks (and for the unreliability of early SSDs). I think it's of less use these days, and a bit of unnecessary complexity if you can achieve the I/O throughput conventionally.
I personally don't see rrdcached as really worth the effort in most situations. If you can afford the I/O capacity, don't complicate things with rrdcached.
Adam.
On 11 Apr 2018, at 20:49, todevnull@free.fr wrote:
The thing is that an SSD won't appreciate such updates every 5 min :-/ (its lifetime would be impacted)
One sync per week (RAM to disk) would be acceptable for me. In the worst case I would lose one week of stats... but the server has two power supplies and is located in a DC, so the risk seems mitigated.
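A weekly RAM-to-disk sync like the one described above could be sketched roughly as follows. This is only an illustration, not something Observium ships: the function name `sync_ram_to_disk` and both directory arguments are hypothetical, and it assumes the RRD files live under a tmpfs mount as plain `.rrd` files.

```python
import shutil
from pathlib import Path


def sync_ram_to_disk(ram_dir: str, disk_dir: str) -> int:
    """Copy every .rrd file under ram_dir to the same relative path under disk_dir.

    Returns the number of files copied. Both paths are supplied by the caller;
    nothing here is specific to Observium.
    """
    src = Path(ram_dir)
    dst = Path(disk_dir)
    copied = 0
    for rrd in src.rglob("*.rrd"):
        target = dst / rrd.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        # Copy under a temporary name, then rename, so an interrupted run
        # never leaves a half-written file under the final name.
        tmp = target.with_name(target.name + ".tmp")
        shutil.copy2(rrd, tmp)
        tmp.replace(target)
        copied += 1
    return copied
```

Such a function could be run once a week from cron. A file copied while the poller is updating it may be inconsistent, so a real setup would probably want to run the sync between polling cycles.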
First I would like to test the benefit of rrdcached. Do we know of any downsides that come with this tool?
Thanks in advance.
Tarik.
On 11 April 2018 at 20:58, "Adam Armstrong" <adama@memetic.org> wrote:
Observium scales primarily on port count rather than device count, especially when it comes to I/O requirements. Our port RRDs are pretty big, so they have the biggest individual contribution to I/O.
96GB is a lot of RAM, you might be better off with an SSD, since dumping all of that data out of RAM disk frequently for backup will be painful (too slow to disk, and pretty punishing to an SSD). :)
adam.
------ Original Message ------
From: todevnull@free.fr
To: "Observium" <observium@observium.org>
Sent: 2018-04-11 12:14:01
Subject: Re: [Observium] Performance issue
Hi Markus,
Thank you for your answer.
I played with the config, carefully read the performance tuning guide on the website, and here are my conclusions:
If I poll all 844 devices (routers and switches), it takes 28 min with the default settings...
After tuning to 48 threads, it takes 17 min...
If I disable polling on all the switches (less critical than routers for me), it takes less than 60 sec for the 350 routers!!!
I assume that switches need to write data to more RRD files...
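The figures above can be turned into a rough per-device cost. This is only back-of-the-envelope arithmetic on the numbers quoted in this message; it assumes the routers-only run also used 48 threads and that the full run is simply the router time plus the switch time, neither of which is stated explicitly.

```python
# Figures quoted in this message.
TOTAL_DEVICES = 844
ROUTERS = 350
FULL_RUN_S = 17 * 60   # all devices, 48 threads
ROUTER_RUN_S = 60      # routers only; reported as "less than 60 sec", so an upper bound
TARGET_S = 5 * 60      # a polling run should fit in the 5-minute interval


def per_device_seconds(run_seconds: float, devices: int) -> float:
    """Average wall-clock seconds of a polling run attributable to each device."""
    return run_seconds / devices


switches = TOTAL_DEVICES - ROUTERS
# Assumption: whatever the routers do not account for is spent on switches.
switch_run_s = FULL_RUN_S - ROUTER_RUN_S

router_cost = per_device_seconds(ROUTER_RUN_S, ROUTERS)
switch_cost = per_device_seconds(switch_run_s, switches)
ratio = switch_cost / router_cost
overrun = FULL_RUN_S / TARGET_S

print(f"{switches} switches, {switch_cost:.2f} s each vs {router_cost:.2f} s per router")
print(f"switch/router cost ratio: {ratio:.1f}, full run is {overrun:.1f}x the 5 min target")
```

Because the router time is an upper bound, the ratio is a lower bound: on these numbers a switch costs at least about 11 times as much wall-clock time as a router, which fits the idea that port count, rather than device count, drives the I/O load.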
So I'm now tuning the config, and I will ask my boss to provide me with 96GB of RAM to implement a RAM drive.
I will keep you informed about the result.
Tarik.
On 11 April 2018 at 07:02, "Markus Klock" <markus@best-practice.se> wrote:
Yeah, you should really use SSD storage when polling 800+ devices.
Other than that, how many poller-wrapper threads have you configured? You can probably bump them up to 48 with that many CPU cores in the server.
/Markus
On Tue, 10 Apr 2018 at 22:23, <todevnull@free.fr> wrote:
With pictures in attached files...
On 10 April 2018 at 22:18, todevnull@free.fr wrote:
Hello all,
After a long period of testing, I'm putting this great software into production in my company.
The server is generally quite good in terms of performance, except for the disks...
Given that a polling run shouldn't exceed 5 min, I'm in some trouble... :-/
A polling session (not the first one) needs 25 min to complete :_(
Apart from optimizing my RRD storage, are there any particular settings for such an installation?
Thanks in advance for your help.
Tarik.
2x Intel Xeon X5650 (24 threads in total), 24GB of RAM, RAID 5 logical drive with 3x 15k disks
observium mailing list
observium@observium.org
http://postman.memetic.org/cgi-bin/mailman/listinfo/observium