In our case, it’s usually due to link flapping, so the state changes frequently. It’s up now; in 2 minutes, or 2 hours, it’s down again for 2 minutes. It’s very unpredictable. If it’s our systems, we can deal with it, but most of the time it’s a WAN link from our upstream terrestrial ISPs. Sometimes it is on us, due to weather conditions or solar battery failures.

On Dec 30, 2021, at 10:03 , David Melczer <dmelczer@greenbaumlaw.com> wrote:

What about changing the alert re-send interval?  This is in settings…alerting.
 
I have mine set to 2 hours so that I get the alert that something is down immediately, I get a recovery alert immediately, but my mail is not overwhelmed by constant alerting…every 2 hours I’ll get a reminder that things are still broken.  No need to turn off/on.
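
To make the behaviour described above concrete, here is a rough sketch of the logic in Python. It is only an illustration of the idea (immediate alert, immediate recovery, reminders at a fixed interval), not how Observium implements it; the class name `ReminderSchedule` and its method are made up for this example.

```python
class ReminderSchedule:
    """Hypothetical helper: decides which notification, if any, a check result triggers."""

    def __init__(self, resend_seconds=2 * 3600):
        self.resend_seconds = resend_seconds  # assumed 2-hour re-send interval
        self._last_sent = None                # time of the last alert sent; None while up

    def on_check(self, is_down, now):
        """Return 'alert', 'reminder', 'recovery', or None for one check at time `now`."""
        if is_down:
            if self._last_sent is None:
                self._last_sent = now
                return "alert"        # first failure: notify immediately
            if now - self._last_sent >= self.resend_seconds:
                self._last_sent = now
                return "reminder"     # still down after the interval: remind
            return None               # still down, but too soon to repeat
        if self._last_sent is not None:
            self._last_sent = None
            return "recovery"         # first passing check: notify immediately
        return None                   # up and was already up


schedule = ReminderSchedule()
events = [schedule.on_check(down, t) for down, t in
          [(True, 0), (True, 300), (True, 7200), (False, 7500), (False, 7800)]]
```

Note that with a flapping link this still produces an alert and a recovery on every flap, since each recovery resets the reminder timer.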
 
I hope this helps.
 
-Dave
 
David Z. Melczer  | Director of Information Technology
 
Greenbaum, Rowe, Smith & Davis LLP
Delivery: 99 Wood Avenue South | Iselin, NJ | 08830
Mailing: P.O. Box 5600 | Woodbridge, NJ | 07095
T: 732.476.3284  |  F: 732.476.3285
 
 
From: observium <observium-bounces@observium.org> On Behalf Of Joey Stanford via observium
Sent: Thursday, December 30, 2021 11:13 AM
To: Observium <observium@observium.org>
Cc: Joey Stanford <nv0n@rmham.org>
Subject: [Observium] Alerting feature request
 

From my colleagues: 
 
 
What we would REALLY like is the ability to turn off alerts on a particular device for a specified amount of time.  For example, when Winter Park goes flaky but it is really Comcast or CenturyLink that needs to fix the problem, I don't want alerts every 10 minutes to tell me the link still sucks.

There are several possible solutions to this.  One is a manual "suppress alerts for X hours".  It is important that the alerts be turned back on automatically after the specified time, because otherwise we tend to forget to turn them back on.
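
As a rough sketch of what "suppress alerts for X hours" with automatic re-enable could look like, assuming a simple in-memory store: the class `AlertMuter` and its methods are hypothetical names for illustration, not an existing Observium feature or API.

```python
import time


class AlertMuter:
    """Hypothetical helper: per-device mute windows that expire on their own."""

    def __init__(self, clock=time.time):
        self._clock = clock       # injectable clock, handy for testing
        self._muted_until = {}    # device name -> epoch seconds when the mute ends

    def suppress(self, device, hours):
        """Mute alerts for `device` for the next `hours` hours."""
        self._muted_until[device] = self._clock() + hours * 3600

    def is_suppressed(self, device):
        """Return True while the mute window is open; forget it once it has passed."""
        until = self._muted_until.get(device)
        if until is None:
            return False
        if self._clock() >= until:
            del self._muted_until[device]  # mute expired, alerts resume by themselves
            return False
        return True
```

The point of storing an end time rather than an on/off flag is that nobody has to remember to switch alerts back on: once the time passes, `is_suppressed` returns False again.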

Another option is a rate limit on alerts by device.  After 5 alerts I have probably figured out that the link sucks.
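
A per-device rate limit might be sketched like this. The limit of 5 comes from the text above; the 24-hour window, the class name `AlertRateLimiter` and the `allow` method are assumptions made for the example.

```python
import time
from collections import defaultdict, deque


class AlertRateLimiter:
    """Hypothetical helper: allow at most `max_alerts` per device per time window."""

    def __init__(self, max_alerts=5, window_seconds=24 * 3600, clock=time.time):
        self._max = max_alerts
        self._window = window_seconds
        self._clock = clock
        self._sent = defaultdict(deque)  # device name -> send times inside the window

    def allow(self, device):
        """Return True and record the send if `device` is still under its limit."""
        now = self._clock()
        sent = self._sent[device]
        # Forget alerts that have aged out of the window.
        while sent and now - sent[0] >= self._window:
            sent.popleft()
        if len(sent) >= self._max:
            return False                 # limit reached: drop this alert
        sent.append(now)
        return True
```

Each device has its own counter, so a flapping link at one site does not use up the alert budget of any other device.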

A third option is a "delay on recover".  Right now we get a recovery alert as soon as one check passes.  If the link is flaky, we may want more checks to pass before declaring it recovered.  This would probably be something we want to set per device, because for most devices I want to know as soon as they are back.
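
The "delay on recover" idea could be sketched as a small per-device state machine that only reports a recovery after several consecutive passing checks. The name `RecoveryDebouncer` and the default of 3 passes are assumptions for illustration.

```python
class RecoveryDebouncer:
    """Hypothetical helper: require N consecutive passing checks before 'recovered'."""

    def __init__(self, passes_required=3):
        self.passes_required = passes_required  # set to 1 for "tell me right away"
        self.state = "up"
        self._streak = 0                        # consecutive passes seen while down

    def update(self, check_ok):
        """Feed one check result; return 'down', 'recovered', or None (no notification)."""
        if not check_ok:
            self._streak = 0                    # any failure resets the pass streak
            if self.state == "up":
                self.state = "down"
                return "down"
            return None
        if self.state == "down":
            self._streak += 1
            if self._streak >= self.passes_required:
                self.state = "up"
                self._streak = 0
                return "recovered"
        return None


flaky = RecoveryDebouncer(passes_required=3)
checks = [True, False, True, False, True, True, True]
notifications = [flaky.update(ok) for ok in checks]
```

In this run the single passing check between two failures produces no recovery notification; only the final three passes in a row do. Creating the object with `passes_required=1` gives the current behaviour for devices where an immediate recovery notice is wanted.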
 

