Adam,
On Wed, Nov 7, 2012 at 11:03 AM, Adam Armstrong <adama@memetic.org> wrote:
I've never seen anything like this happen, and I've seen some pretty fucked up installations!
On 07/11/2012 11:39, Chris Stone wrote:
On a few occasions graphs would have no data for some devices. Nuking all of the RRD files and starting that device over for data would 'fix' it, but I lose all historical data. Now I have a dozen or so Linux servers that I am getting blank graphs for. Running poller.php for the devices manually shows data being received and I see no errors. Nuking the RRD files does not fix it this time though. I might be able to dig in and, maybe, fix it, but I just don't have the time to f*@!k with it - I need something I can rely on. I used to use Cacti and it was great, but a pain when setting up new devices. Observium is great and makes it very easy to set up new devices, but this problem keeps popping up, and a fix that costs me historical data - or, as now, does not even fix it - is not something I can mess with anymore. I do like Observium overall and don't like having to switch away from it, but I need the data and something reliable for ourselves and our clients....
What exactly happens? Do devices spontaneously lose all of the existing data, do they stop recording data after a while? Is it /every/ graph, or just some?
It appears to pull data - 'poller.php -d' shows data - lots of valid counters - see no problem there - e.g.:
RRD[update /opt/observium/rrd/hydra.mydomain.com/ucd_load.rrd N:5:2:0]
including: includes/polling/ipSystemStats.inc.php
Polling IP-MIB ipSystemStats
DEBUG: SNMP Auth options = -v2c -c k3fJzJAFeAtCyYKVFNxX
/usr/bin/snmpbulkwalk -v2c -c k3fJzJAFeAtCyYKVFNxX -OQUs -m IP-MIB -M /opt/observium/mibs udp:hydra.mydomain.com:161 ipSystemStats
ipSystemStatsInReceives.ipv4 = 2124263890
ipSystemStatsInReceives.ipv6 = 19818236
ipSystemStatsHCInReceives.ipv4 = 6419231186
ipSystemStatsHCInReceives.ipv6 = 19818236
ipSystemStatsInOctets.ipv6 = 3931117553
ipSystemStatsHCInOctets.ipv6 = 3931117553
ipSystemStatsInHdrErrors.ipv4 = 0
ipSystemStatsInHdrErrors.ipv6 = 0
ipSystemStatsInNoRoutes.ipv4 = 0
ipSystemStatsInNoRoutes.ipv6 = 0
ipSystemStatsInAddrErrors.ipv4 = 99390
ipSystemStatsInAddrErrors.ipv6 = 0
ipSystemStatsInUnknownProtos.ipv4 = 0
ipSystemStatsInUnknownProtos.ipv6 = 0
ipSystemStatsInTruncatedPkts.ipv4 = 0
ipSystemStatsInTruncatedPkts.ipv6 = 0
ipSystemStatsInForwDatagrams.ipv4 = 0
ipSystemStatsInForwDatagrams.ipv6 = 0
ipSystemStatsHCInForwDatagrams.ipv4 = 0
ipSystemStatsHCInForwDatagrams.ipv6 = 0
....
and the rrd file appears to get updated:
-rw-r--r-- 1 root root 371152 Oct 17 16:31 /opt/observium/rrd/hydra.mydomain.com/ucd_load.rrd
But the graphs show no data. They used to, and then they stopped. Some systems still show data; only a few are affected, and it seems to be only some Linux servers, running differing versions of CentOS.
And I am not even seeing any of my historical data - all graphs show 'nan' for the data including old data that it was showing at one point....
Note too that it seems to be just something with the RRDs, since on the device overview page the graph is blank, but the 'bars' on the right (Processors, Memory Pools and Storage) do show information correctly.
The only thing I can think of immediately to cause this would be if Observium somehow got a broken date far into the future, which caused rrd to refuse to accept any new data until that point. That would only break code which sends a specific date though, like the ports code.
Don't know - how would I check that, and how would I recover from it without losing historical data and keep it from happening again?
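One possible way to test the future-date theory, as a rough sketch only: `rrdtool last <file>` prints the UNIX timestamp of the most recent update stored in an RRD, so comparing that value against the current time shows whether the file believes its last update lies in the future. The function names below are made up for illustration, and the example path is simply the one from the debug output above. This only detects the condition; it does not repair anything.

```python
import subprocess
import time


def rrd_last_update(path):
    """Return the last-update UNIX timestamp reported by `rrdtool last`."""
    result = subprocess.run(
        ["rrdtool", "last", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())


def seconds_in_future(last_update, now=None):
    """Return how far last_update lies ahead of now, or 0 if it does not."""
    if now is None:
        now = int(time.time())
    return max(0, last_update - now)


def describe(path):
    """Hypothetical helper: build a one-line report for a single RRD file."""
    ahead = seconds_in_future(rrd_last_update(path))
    if ahead > 0:
        return f"{path}: last update is {ahead}s in the future"
    return f"{path}: last update is not in the future"
```

Calling `describe("/opt/observium/rrd/hydra.mydomain.com/ucd_load.rrd")` on the poller host would report on that one file. If a file does turn out to be ahead of the clock, that would at least be consistent with the theory that rrd is refusing new updates until that time.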
I'd really prefer to stay with Observium, but my time for this is very limited and .....
Chris
--
Chris Stone
AxisInternet, Inc.
www.axint.net
_______________________________________________
observium mailing list
observium@observium.org
http://postman.memetic.org/cgi-bin/mailman/listinfo/observium