list,
I have been trying everything I can think of to get my Observium CE instance to scale, but I seem to be failing. By failing I mean the classic gaps in graphs, on all my devices. The instance runs on ESXi with 16 cores @ 2.40GHz, 16G of RAM, and SSD SAS RAID storage, and I continue to have gaps. Most recently I moved the SQL database to its own instance in hopes of working around potential IO issues, but that didn't help.
What I do see is that my RAM use is very high, I am guessing from the discovery process that runs twice a day. When my RAM use is lower throughout the day, I have no gaps. I am monitoring 6000 ports on 190 devices. Should I be using north of 16G of RAM in my environment? I am happy to add more, but all my reading indicates that Observium is not RAM intensive. My last attempt will be to move the RRDs to a RAM disk, but my RRDs are almost 16G at this time, which I am guessing is part of the problem. Maybe it is time to clean house.
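For the RAM disk idea, here is a rough sanity check I sketched in Python. It is only an illustration: the function names and the 2G headroom figure are my own assumptions, not anything from Observium, and it only compares the size of the RRDs against the MemAvailable value from /proc/meminfo on Linux.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of values in kB."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def fits_in_ram_disk(meminfo, rrd_kb, headroom_kb=2 * 1024 * 1024):
    """Return True if rrd_kb fits in available memory with headroom left.

    headroom_kb is an assumed safety margin (2G by default), not a
    recommendation from any documentation.
    """
    available_kb = meminfo.get("MemAvailable", 0)
    return available_kb - rrd_kb >= headroom_kb


# Hypothetical numbers resembling my setup: 16G total, about 4G available.
sample = "MemTotal: 16777216 kB\nMemAvailable: 4000000 kB\n"
rrd_size_kb = 16 * 1024 * 1024  # RRDs of almost 16G, rounded up
result = fits_in_ram_disk(parse_meminfo(sample), rrd_size_kb)
```

On a real host the text would come from reading /proc/meminfo instead of the sample string. With the sample numbers above the check comes out negative, which matches my suspicion that 16G of RRDs will not fit in a RAM disk on a 16G machine.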
Any insight would be appreciated.
db