We’ve been running Observium on Ubuntu 12.04 for a little
over a year now, and after applying the latest community update,
we seem to have devices periodically reporting as down.
The error I’m getting is bouncing between UnpingableMySQL
and UnpingableRRD.
./poller.php -h itups03-02.net.internal
Observium v0.14.4.5229
Poller
Starting polling run:
SQL[SELECT `device_id` FROM `devices` WHERE `disabled` = 0 AND `hostname` LIKE 'itups03-02.net.internal' ORDER BY `device_id` ASC]
SQL[SELECT * FROM `devices` WHERE `device_id` = '40']
SQL[SELECT * FROM devices_attribs WHERE `device_id` = '40']
itups03-02.net.internal 40 apc
UnpingableRRD[cmd[update
/opt/observium/rrd/itups03-02.net.internal/status.rrd N:0]
stdout[OK u:0.00 s:0.00 r:0.05]
stderr[]]
RRD[cmd[update
/opt/observium/rrd/itups03-02.net.internal/ping.rrd N:U]
stdout[OK u:0.00 s:0.00 r:0.06]
stderr[]]
RRD[cmd[update
/opt/observium/rrd/itups03-02.net.internal/ping_snmp.rrd
N:U]
stdout[OK u:0.00 s:0.00 r:0.06]
stderr[]]
SQL[INSERT INTO `perf_times` (`type`,`doing`,`start`,`duration`,`devices`) VALUES ('poll','itups03-02.net.internal','1401246542.664','0.066','1')]
./poller.php itups03-02.net.internal May 27, 2014, 20:09 - 1 devices polled in 0.066 secs
MySQL: Cell[0/0s] Row[1/0s] Rows[1/0s] Column[0/0s] Update[0/0s] Insert[1/0s] Delete[0/0s]
Any advice on where we can look? So far the only way we've
found to get the devices back up is to keep re-running the poller
for the affected host until a poll succeeds.
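
For reference, the workaround amounts to roughly the sketch below. It is
only an illustration, not something shipped with Observium: the poller
path is assumed from the rrd path in the output above, the function names
are made up, the retry count and delay are arbitrary, and it treats the
string "Unpingable" in the poller output as the failure marker.

```python
import subprocess
import time

# Assumed install path, inferred from /opt/observium/rrd in the output above.
POLLER = "/opt/observium/poller.php"


def run_poller(host):
    """Run the poller once for a single host and return its combined output."""
    result = subprocess.run(
        [POLLER, "-h", host],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


def poll_until_pingable(host, run=run_poller, attempts=10, delay=5.0):
    """Re-run the poller until its output no longer contains 'Unpingable'.

    Returns the 1-based number of the attempt that succeeded, or None if
    every attempt still reported the host as unpingable.
    """
    for attempt in range(1, attempts + 1):
        if "Unpingable" not in run(host):
            return attempt
        if attempt < attempts:
            time.sleep(delay)  # wait a little before trying again
    return None
```

Calling poll_until_pingable("itups03-02.net.internal") would then mirror
what we have been doing by hand.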
Thanks!
Andrew Davis
IT Systems
J. David Gladstone Institutes