Hi Tom,
Well, it cannot really be that the device is marked as down by the alert.
Test conditions are as follows:
device_status_type ping
device_status equals 0
If I understand it correctly, even if snmp is not reachable the device should NOT be marked as down, because it is pingable. Therefore I do not understand why it is still marked as down.
Message: 5
Date: Wed, 2 Nov 2016 20:36:19 +0100
From: Tom Laermans tom.laermans@powersource.cx
To: Observium Network Observation System observium@observium.org
Subject: Re: [Observium] Device Down False positive and snmpget
Message-ID: eff1c872-a32b-fe5f-ecb5-eafe35712ff4@powersource.cx
Content-Type: text/plain; charset="utf-8"; Format="flowed"

Sebastian,

The device was likely down due to ping, first... then, as it became pingable but not snmp-able, there was no change logged in device status (down = down).

As to why snmpget is telling you 2 is > 128, I'm not sure...

Tom

On 02/11/2016 20:05, Sebastian Klute wrote:
Hello folks,
Today I ran into a rather strange problem. I have a server that is up and running fine, but it is still shown as down: "Device status changed to Down (ping)". After some debugging and digging into the polling and checking workflow, I tried to redo the different checks that could lead to a device being marked down. First of all: is it pingable?
root@om:/tmp# /usr/bin/fping -t1000 -t 500 -c 20 -q 84.200.41.226
84.2xx.xxx.xxx : xmt/rcv/%loss = 20/20/0%, min/avg/max = 0.20/0.30/0.43
That seems good to me, so I have no idea why it should be down by ping. Is the hostname resolvable? Yes, it is. A standard ping using the hostname works fine.
So I dug further into the debug logs and found this:
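For anyone repeating this check from a script, the fping summary line above can be parsed along the lines of the following minimal sketch. The names parse_fping_summary and is_pingable, and the regular expression itself, are illustrative assumptions modelled only on the output shown here; they are not part of Observium.

```python
import re

# Pattern for the per-host summary line printed by `fping -c N -q`,
# modelled only on the sample output quoted in this thread.
SUMMARY_RE = re.compile(
    r"^(?P<host>\S+)\s*:\s*xmt/rcv/%loss = "
    r"(?P<xmt>\d+)/(?P<rcv>\d+)/(?P<loss>\d+(?:\.\d+)?)%"
    r"(?:, min/avg/max = (?P<min>[\d.]+)/(?P<avg>[\d.]+)/(?P<max>[\d.]+))?$"
)


def parse_fping_summary(line):
    """Return a dict of counters from one fping summary line, or None."""
    match = SUMMARY_RE.match(line.strip())
    if match is None:
        return None
    result = {
        "host": match["host"],
        "sent": int(match["xmt"]),
        "received": int(match["rcv"]),
        "loss_percent": float(match["loss"]),
    }
    # The min/avg/max part may be missing, e.g. when no reply came back.
    if match["avg"] is not None:
        result["avg_ms"] = float(match["avg"])
    return result


def is_pingable(summary):
    """Hypothetical rule: treat the host as up if any reply was received."""
    return summary is not None and summary["received"] > 0
```

Feeding it the summary line from the run above yields 20 packets sent, 20 received and 0% loss, so by this (assumed) rule the host counts as pingable.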
CMD[/usr/bin/snmpget -v2c -c *** -Pu -OQUst -m SNMPv2-MIB -M /opt/observium/mibs/rfc:/opt/observium/mibs/net-snmp 'udp':'hostname.example':'161' sysObjectID.0 sysUpTime.0]
CMD EXITCODE[1]
CMD RUNTIME[6.0283s]
STDOUT[
]
STDERR[
Timeout: No Response from udp:hostname.example:161.
]
SNMP STATUS[FALSE]
I wondered why and ran it manually. The error is as follows:
Too many object identifiers specified. Only 128 allowed in one request.
The question now is: what can I do to solve this issue? And what about the false positive "device down by ping" alert?
Additional Info:
> $data - /opt/observium/poller.php:170 /opt/observium/includes/alerts.inc.php:61
=========================================
array(
  [device_status] => string(1) "0"
  [device_status_type] => string(4) "snmp"
  [device_ping] => string(4) "7.34"
  [device_snmp] => int(0)
)
Thanks for the help and best regards,
Sebastian Klute
-- Accelerated IT Services GmbH Kruppstraße 105 - 60388 Frankfurt - Germany sk@accelerated.de -http://www.accelerated.de/ Phone: +49 69 - 900 180 41 - Fax: +49 69 - 900 180 90
HRB: 60665 - Amtsgericht Ludwigshafen - VatID: DE253684415 Managing Directors: Nicolaj Kamensek & Ole Krieger
observium mailing list observium@observium.org http://postman.memetic.org/cgi-bin/mailman/listinfo/observium
Sebastian,
What do you mean by "marked down"? A device is always marked down by Observium if it either does not ping, or does not respond to SNMP.
This is separate from any alerting rules you specify; using the below configuration, if the device does not respond to SNMP, it will be marked down, and the configured checker will not alert.
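To make the distinction concrete, here is a minimal Python sketch of the two separate steps: deriving the device status from the ping and SNMP checks, then evaluating the alert conditions against the result. This is not Observium's actual code. The function names are hypothetical, the "ok" status type for a healthy device is a placeholder, and reading the first test condition as "device_status_type equals ping" is an assumption. Only the field names and the "ping"/"snmp" values come from the $data dump quoted earlier in the thread.

```python
def derive_status(ping_ok, snmp_ok):
    """Device counts as up only if it answers both ping and SNMP."""
    if not ping_ok:
        # Ping is checked first, so a ping failure decides the status type.
        return {"device_status": 0, "device_status_type": "ping"}
    if not snmp_ok:
        return {"device_status": 0, "device_status_type": "snmp"}
    # "ok" is a placeholder value chosen for this sketch.
    return {"device_status": 1, "device_status_type": "ok"}


def alert_matches(data):
    """Assumed reading of the test conditions: both must hold at once."""
    return (
        data["device_status_type"] == "ping"
        and int(data["device_status"]) == 0
    )


# Situation from the thread: the device answers ping but not SNMP.
data = derive_status(ping_ok=True, snmp_ok=False)
print(data)                 # status 0 with status type "snmp"
print(alert_matches(data))  # False: device is down, but the checker stays quiet
```

Under these assumptions the device is still marked down (status 0), but because the status type is "snmp" rather than "ping", the configured conditions do not match and no alert fires.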
Tom
On 11/03/2016 02:48 PM, Sebastian Klute wrote: