Device status goes up and down (ping check) without having actual problems
Hi,
Many of my devices go Down/Up in Observium without having any actual problems. I monitor the same devices with Nagios (on the same server running Observium), so there are no connectivity problems between Observium and the monitored devices.
Is this a general problem with the ping monitoring, or just something that happens to me? And is there any way to disable ping and rely on SNMP for the up/down status of devices?
I have attached a screenshot which shows some devices going down/up.
On Wed, 27 Mar 2013 13:54:21 +0100, Mark Nellemann mark.nellemann@gmail.com wrote:
Hi,
Many of my devices goes Down/Up in Observium, while not having any actual problems.
You mean except for the problem of Observium not getting a ping reply?
adam.
Mark,
We use fping to ping. If a device is set as "down", it means fping didn't get a reply within its timeout setting. Maybe the timeout needs to be raised, or there is another issue?
Tom
observium mailing list observium@observium.org http://postman.memetic.org/cgi-bin/mailman/listinfo/observium
Hi Tom,
Thank you for your suggestion regarding timeout.
I have added the following to my config.php, as I don't know of any other way to change the timeout:
$config['fping'] = '/usr/bin/fping -t1000';
Time will tell if this solves my problem :) I will let you know.
Hi,
Just a me too. My install used to not report many up/down messages until recently. Not sure if this started with the change in isPingable not so long ago.
I'm planning to test with the old isPingable in the very near future to verify if this change is the cause.
GRTNX, RobJE
I'm planning to test with the old isPingable in the very near future to verify if this change is the cause.
I changed isPingable back to how it was implemented before r3728 and it has been quiet since ... So quiet that I verified isPingable really did what it was supposed to do, which it did. 1 host down/up in 1.5 hours instead of many.
Any hints or help on how to debug this?
My Observium install has 262 devices and 3281 ports and runs on a VM with Debian 6, 4 cores and 1.5 GB RAM; all software is recent.
The new function sets retries to 1 if it's not set in the config; the fping default is 3 retries.
This is probably the cause.
adam.
On Wed, Mar 27, 2013 at 3:21 PM, Adam Armstrong adama@memetic.org wrote:
New function sets retries to 1 if it's not set in the config, the fping default is 3 retries.
This is probably the cause.
Will revert and let you know.
GRTNX, RobJE
On Wed, Mar 27, 2013 at 4:25 PM, Rob J. Epping rob.epping@gmail.com wrote:
Will revert and let you know.
The patch (defaulting to 3) works for me.
Thanks!
On 03/27/2013 07:16 PM, Rob J. Epping wrote:
I'm planning to test with the old isPingable in the very near future to verify if this change is the cause.
I changed isPingable back to how it was implemented before r3728 and it has been quiet since ... So quiet that I verified isPingable really did what it was supposed to do, which it did. 1 host down/up in 1.5 hours instead of many.
Any hints or help on how to debug this?
A single ping commonly fails, so it is recommended to use 3 pings. Use these config options:
// PING Settings - Retries/Timeouts
#$config['ping_retries'] = 3; // How many times to retry ping
#$config['ping_timeout'] = 500; // Timeout in milliseconds
To debug, uncomment line 421 in includes/functions.php (look for the line "Uncomment this line for DEBUG isPingable") and then check for errors in /tmp/pings_debug.log.
The difference is that isPingable() before r3728 did not report the response time, and there are now options for ping retries and timeout.
If you still see frequent status changes, I would appreciate it if you sent me your debug output. Please also tell me whether the situation improves if you set $config['ping_timeout'] = 1000 (or more).
On 28/03/2013 9:11, Mike Stupalov wrote:
One ping commonly fails, it is recommended to use 3 ping. Use config options:
// PING Settings - Retries/Timeouts
#$config['ping_retries'] = 3; // How many times to retry ping
#$config['ping_timeout'] = 500; // Timeout in milliseconds
By the way, because I nagged on IRC, the settings were renamed to $config['ping']['retries'] and $config['ping']['timeout'] :-)
Tom
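For reference, a minimal config.php sketch using the renamed settings Tom mentions above. The values 3 and 500 are simply the ones from Mike's example earlier in the thread. Whether a given revision still accepts the old flat names is not confirmed anywhere in this thread, so treat the fragment as an assumption to check against your own defaults.inc.php.

```php
// config.php fragment (a sketch, not copied from the Observium source).
// It belongs inside the existing <?php block of config.php.
// Setting names: the renamed ones reported in this thread.
// Values: the ones quoted in Mike's example.
$config['ping']['retries'] = 3;   // how many times to retry ping
$config['ping']['timeout'] = 500; // timeout in milliseconds
```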
On Fri, 29 Mar 2013 00:48:13 +0100, Tom Laermans tom.laermans@powersource.cx wrote:
By the way, because I nagged on IRC, the settings were renamed to $config['ping']['retries'] and $config['ping']['timeout'] :-)
are they in defaults.inc.php now?
adam.
Yes they are. Always were :)
On 29/03/2013 0:04, Adam Armstrong wrote:
are they in defaults.inc.php now?
You sure? They weren't here, I had to change the default in the function from 1 to 3.
On Fri, 29 Mar 2013 07:40:01 +0100, Tom Laermans tom.laermans@powersource.cx wrote:
Yes they are. Always were :)
Then I didn't understand the question.
On Fri, 2013-03-29 at 07:15 +0000, Adam Armstrong wrote:
You sure? They weren't here, I had to change the default in the function from 1 to 3.
Mike Stupalov <landy2005@...> writes:
For debug uncomment line 421 in file includes/functions.php (look for line "Uncomment this line for DEBUG isPingable") and then see errors in the file /tmp/pings_debug.log
I'd set my ping time to 5 seconds (lots of cellular devices that need time to wake up) and couldn't figure out why I was still getting so many host up and down alerts.
I found in includes/functions a minimum and maximum on the ping timeout of 50 ms and 2000 ms, with a default of 500 ms.
It'd be good to mention that min/max here: http://www.observium.org/wiki/Configuration_Options#Ping_Settings
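A sketch of what this finding implies for config.php. It assumes the 50 ms and 2000 ms bounds reported above apply to the renamed $config['ping']['timeout'] setting on your revision. The thread confirms neither that nor what happens to an out-of-range value, so check includes/functions yourself before relying on it.

```php
// config.php fragment (a sketch based on the bounds reported above).
// It belongs inside the existing <?php block of config.php.

// A 5-second timeout lies outside the reported 50..2000 ms range, so it
// would reportedly not take effect as written:
// $config['ping']['timeout'] = 5000;

// Largest value inside the reported range:
$config['ping']['timeout'] = 2000; // timeout in milliseconds
```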
participants (6)
- Adam Armstrong
- kiwibrew
- Mark Nellemann
- Mike Stupalov
- Rob J. Epping
- Tom Laermans