I believe this can be marked as resolved.
All of our devices were listed as offline this morning, which was caused by the following: http://www.observium.org/wiki/FAQs#All_my_hosts_seem_down_to_Observium_.2F_S...
I followed the guide and removed .index from the noted directory.
As soon as I did that, everything started polling again. At around the same time, I also found this: http://www.observium.org/wiki/Poller-wrapper.py
Our poller was set to use 3 threads, so I bumped it up to 12. After review, I found that my alert checker was now listing devices as checking in and passing the offline test. I faked an outage on a test device, which triggered the e-mail alert.
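As a rough illustration of why the thread count can matter at this scale, here is a back-of-the-envelope sketch. It assumes a 300-second polling interval and an average per-device poll time of 5 seconds; that per-device figure is purely hypothetical and was not measured on this install. The function names are made up for this example and are not part of Observium. The device count is taken from the 8 + 464 devices mentioned elsewhere in this thread.

```python
import math

def min_threads(device_count, avg_poll_seconds, interval_seconds=300):
    """Smallest thread count that can finish one polling pass within the interval."""
    total_work = device_count * avg_poll_seconds
    return math.ceil(total_work / interval_seconds)

def pass_duration(device_count, avg_poll_seconds, threads):
    """Rough wall-clock seconds for one pass, assuming work splits evenly across threads."""
    return device_count * avg_poll_seconds / threads

devices = 8 + 464      # device counts reported in this thread
avg_seconds = 5.0      # hypothetical average poll time per device

print("minimum threads:", min_threads(devices, avg_seconds))
print("pass with 3 threads:", pass_duration(devices, avg_seconds, 3))
print("pass with 12 threads:", pass_duration(devices, avg_seconds, 12))
```

Under these assumed numbers, a pass with 3 threads would overrun the interval while a pass with 12 threads would fit inside it. The real figures depend on how long each device actually takes to poll.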
Thanks again for the assistance on this. I'll have to dig into the SNMP bug a little bit, but everything is now working as it should.
Regards,
Nate Mellendorf | Network Analyst | Netgain
720 West Saint Germain Street | St. Cloud | MN | 56301
Phone: 320.251.4700 x183 | 877.797.4700 x183
www.NetgainHosting.com
The information contained in this email message is for the designated recipient only and may be privileged, confidential, and protected from disclosure. If you have received this message in error, please notify the sender immediately and delete the original. Any dissemination, distribution, copying or other use of this message or any information contained within is strictly prohibited.
From: Nate Mellendorf
Sent: Saturday, December 20, 2014 7:46 PM
To: observium@observium.org
Subject: RE: observium Digest, Vol 53, Issue 102
Tom,
Thank you very much for looking at my configuration. I failed to note earlier that I receive the same results when I change the delay of this alert back down to 0.
For the sake of my own sanity though, I went back and changed it just now. I triggered a false device outage, which failed to trigger the alert.
Something odd I am seeing is that Observium is now listing 8 devices as having passed the alert check, with 464 devices still listed as never checked.
As you can see from my earlier screenshots, the alert checker never listed any devices as having been successfully checked. They were all greyed out.
As you've confirmed that the alert checker is configured correctly, I'm somewhat stumped.
I'll keep digging, as this is a feature I'd really like to get working.
Side Note:
Our CPU and RAM are not hitting 100% utilization on our host, so now I'm suspicious that we're just pushing the Observium platform a little too hard.
There are occasions when loading Observium takes longer than expected. When I create a new alert and regenerate them, the page seems to time out.
(I've included the graphs of our disk I/O, as it's been suggested before as something to keep an eye on.)
Any other thoughts or insight on this would be greatly appreciated.
Thanks again for your time on both the project and on this thread.
(image: disk I/O graphs)
Regards,
Nate Mellendorf | NETWORK ANALYST | Netgain
720 West Saint Germain Street | St. Cloud | MN | 56301
Phone: 320.251.4700 x183 | 877.797.4700 x183
www.NetgainHosting.com
-----Original Message-----
From: observium [mailto:observium-bounces@observium.org] On Behalf Of observium-request@observium.org
Sent: Saturday, December 20, 2014 7:06 PM
To: observium@observium.org
Subject: observium Digest, Vol 53, Issue 102
Today's Topics:
1. Re: FW: Observium mailing list (Tom Laermans)
----------------------------------------------------------------------
Message: 1
Date: Sun, 21 Dec 2014 02:06:09 +0100
From: Tom Laermans <tom.laermans@powersource.cx>
To: Observium Network Observation System <observium@observium.org>
Subject: Re: [Observium] FW: Observium mailing list
Message-ID: <54961D01.3070002@powersource.cx>
Content-Type: text/plain; charset="windows-1252"; Format="flowed"
Hi Nate,
You set the "delay" to 1 which means Observium will wait 1 extra poll to see if the alert condition restores itself (which it seems to do judging by your alert log - hence no alert).
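The delay behaviour described above can be modelled with a small sketch. This is only a toy model of the description in this message, not Observium's actual implementation. The function name and the list-of-booleans input are assumptions made for illustration.

```python
def first_notification_poll(check_results, delay):
    """Return the index of the poll at which a notification would go out, or None.

    check_results: list of booleans, True when the alert condition is met
                   on that poll (for example, the device looks down).
    delay: number of extra consecutive failing polls to wait before notifying.
    """
    consecutive_failures = 0
    for poll_index, failed in enumerate(check_results):
        if failed:
            consecutive_failures += 1
            # Notify only once the failure has outlasted the configured delay.
            if consecutive_failures > delay:
                return poll_index
        else:
            # The condition restored itself, so the wait starts over.
            consecutive_failures = 0
    return None

# A failure that recovers on the next poll never notifies when delay is 1.
print(first_notification_poll([False, True, False], delay=1))
# The same failure notifies straight away when delay is 0.
print(first_notification_poll([False, True, False], delay=0))
# A failure that persists for two polls notifies on the second one when delay is 1.
print(first_notification_poll([False, True, True], delay=1))
```

In this model, a short test outage that clears within one poll would show up as "failed, but delayed" and never send an e-mail while the delay is 1.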
Tom
On 18/12/2014 04:09, Nate Mellendorf wrote:
Good evening all,
After much trial and error, I've been unsuccessful in implementing a
simple up/down device alert within Observium.
As I searched around the web for a solution, it seemed as though this
hasn't been much of an issue for most users.
Here's some additional documentation I followed to try and fix it myself:
http://www.maartenmoerman.nl/?p=612 (Nice work on this one)
https://www.mail-archive.com/observium@observium.org/msg05915.html
I'm starting to think that my configuration file (config.php) could have
something to do with it. I've included it, along with multiple
snippets to try and assist you in seeing my work.
If anyone has any suggestions on this, they would be greatly
appreciated. I've been working on this on and off for the past few
days now, and I've had no success in generating an offline alert. The
alert log shows the test as failed, but delayed. It never actually
sends out an e-mail. My apologies in advance if this turns out to be
user error.
Thanks in advance for any time put into this. If there is anything
else I can provide to further assist in troubleshooting, please let me
know.
I'd be more than happy to provide it.
*Omitted config.php:*
*...*
// Enable alerter
$config['poller-wrapper']['alerter'] = TRUE;
// Refresh the page every 300 seconds
$config['page_refresh'] = "300";
// Set up a default alerter (email to a single address)
$config['email']['default'] = "nate.mellendorf@netgainhosting.com";
$config['alerts']['email']['from'] = "observium@netgainhosting.com";
$config['email']['default_only'] = TRUE;
$config['email']['enable'] = TRUE; // Disables OLD alerting system
$config['alerts']['interval'] = 310;
*...*
/Sensitive information has been omitted./
*The alert I've configured:* (image)
*Here is what I see when I review the alert:* (image)
*Here's the alert log that's generated:* (image)
(As you can see, the alert is triggered but delayed.)
*Finally, here is what's found when I review the logs of an offline
device:* (image)
Thanks again everyone,
Nate Mellendorf | Network Analyst | Netgain
720 West Saint Germain Street | St. Cloud | MN | 56301
Phone: 320.251.4700 x183 | 877.797.4700 x183
www.NetgainHosting.com
observium mailing list
observium@observium.org
http://postman.memetic.org/cgi-bin/mailman/listinfo/observium