There is no time frame to report; the device simply didn't return a ping (in time) at the moment the message was logged.
These are the default ping config settings:

#$config['ping']['retries'] = 3; // How many times to retry ping (1 - 10)
#$config['ping']['timeout'] = 500; // Timeout in milliseconds (50 - 2000)

So it would seem to take 3 pings that are either missed or slower than 500 ms.
There are currently no device-specific ping settings; this is on my to-do list somewhere.
You can try to enable this:

$config['ping']['debug'] = TRUE; // If TRUE store ping errors into logs/debug.log file

and check the debug log.
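
Put together, a minimal sketch of what the overrides could look like, assuming they go into your config.php. The retries and timeout values here are only illustrative picks from within the ranges noted in the comments above, not recommendations:

```php
// Illustrative overrides (assumed to live in config.php); tune to your network.
$config['ping']['retries'] = 5;    // How many times to retry ping (1 - 10); default is 3
$config['ping']['timeout'] = 1000; // Timeout in milliseconds (50 - 2000); default is 500
$config['ping']['debug']   = TRUE; // If TRUE store ping errors into logs/debug.log file
```

With debug enabled, any ping errors should end up in logs/debug.log, which you can then compare against the timestamps in your packet capture.
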
Tom
On 22/12/2014 23:55, John Brown wrote:
Yes, I see the "Device Status changed to Down (PING)" in the log.
The problem I have with this is that it doesn't provide any more detailed information: how many pings failed, over what time frame, etc.
I am running tcpdump on a monitor/span port that the ONOS is connected to, and I see ICMP packets going out to devices and their reply packets coming back.
Over a 15-minute period a host will be reported as DOWN, yet the ICMP packet flow shows echo_request / echo_reply pairs without undue delay.
Other machines on the same LAN subnet as the ONOS host also show no dropped ICMP packets.
Hence my question about additional debugging tools within ONOS.
Thanks
On Mon, Dec 22, 2014 at 3:39 PM, Tom Laermans <tom.laermans@powersource.cx> wrote:
Observium... Bonitoring(?) does tell you why it's down. It doesn't receive a reply either over ICMP echo or over SNMP; this is noted in the event log when the host goes down.

Tom

On 22/12/2014 23:05, John Brown wrote:
Hi,

I'm trying to troubleshoot the many false positives we are receiving from OB. The system will report a host as down, yet our legacy Nagios and out-of-band Pingdom do not show the host as down.

It doesn't appear that OB records in the log what specifically is making OB think the host is down.

I've increased the SNMP timeout value to 3 seconds (which seems very long) and that has helped with some hosts, mostly Mikrotiks. But I doubt that our Juniper MX480s (which are lightly loaded) should need such a long time frame to respond.

How can I get OB to report what the actual trigger is for its "Host Down" alerts? Are there tweaks for performance monitoring / testing?

Thank you in advance.

_______________________________________________
observium mailing list
observium@observium.org
http://postman.memetic.org/cgi-bin/mailman/listinfo/observium