From my colleagues:
What we would REALLY like is the ability to turn off alerts on a particular device for a specified amount of time. For example, when Winter Park goes flaky but it is really Comcast or CenturyLink that needs to fix the problem, I don't want alerts every 10 minutes to tell me the link still sucks.
There are several possible solutions to this. One is a manual "suppress alerts for X hours". It is important that the alerts be turned back on automatically after the specified time, because otherwise we tend to forget to turn them back on.
Another option is a rate limit on alerts by device. After 5 alerts I have probably figured out that the link sucks.
A third option is a "delay on recover". Right now we get a recovery as soon as one check passes. If the link is flaky, we may want more checks to pass before declaring it recovered. This would probably be something we want to set per device, because for most devices I want to know as soon as they are back.
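To make the first two options concrete, here is a minimal Python sketch of a timed mute that expires on its own, combined with a per-device cap on alerts. It is only an illustration of the request, not Observium code: the `AlertGate` class, its method names, and the device name are all hypothetical.

```python
import time


class AlertGate:
    """Hypothetical per-device alert gate; not part of Observium."""

    def __init__(self, max_alerts=5, clock=time.time):
        self.max_alerts = max_alerts    # cap on alerts per outage
        self.clock = clock              # injectable so the logic can be tested
        self.suppressed_until = {}      # device -> timestamp when the mute ends
        self.sent = {}                  # device -> alerts sent during this outage

    def suppress(self, device, hours):
        """Mute a device for a fixed time; nobody has to remember to unmute it."""
        self.suppressed_until[device] = self.clock() + hours * 3600

    def should_send(self, device):
        """Return True if an alert for this device should go out now."""
        if self.clock() < self.suppressed_until.get(device, 0):
            return False                # still inside the mute window
        count = self.sent.get(device, 0)
        if count >= self.max_alerts:
            return False                # rate limit reached for this outage
        self.sent[device] = count + 1
        return True

    def recovered(self, device):
        """Reset the alert counter once the device recovers."""
        self.sent.pop(device, None)
```

With something like this, `gate.suppress("winter-park", 4)` would silence one site for four hours while every other device keeps alerting normally.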
What about changing the alert re-send interval? This is in settings…alerting.
I have mine set to 2 hours, so I get the alert that something is down immediately and I get a recovery alert immediately, but my mail is not overwhelmed by constant alerting…every 2 hours I’ll get a reminder that things are still broken. No need to turn anything off/on.
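The behavior described above can be sketched roughly as follows. This is an illustration of the idea only, not how Observium implements its re-send interval; the function name `should_notify` and its parameters are assumptions made for the example.

```python
def should_notify(state, previous_state, last_sent, now, resend_interval=2 * 3600):
    """Decide whether a notification goes out on this check (hypothetical helper).

    state and previous_state are "up" or "down"; last_sent is the timestamp
    of the last notification (or None); times are in seconds.
    """
    if state != previous_state:
        return True     # went down or recovered: notify immediately
    if state == "down" and last_sent is not None and now - last_sent >= resend_interval:
        return True     # still down: periodic reminder
    return False        # nothing changed and no reminder is due
```

A longer interval reduces the reminders during a long outage, but every state change still produces a message, so a flapping link would still generate mail.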
I hope this helps.
-Dave
David Z. Melczer | Director of Information Technology
Greenbaum, Rowe, Smith & Davis LLP Delivery: 99 Wood Avenue South | Iselin, NJ | 08830 Mailing: P.O. Box 5600 | Woodbridge, NJ | 07095 T: 732.476.3284 | F: 732.476.3285 | vCard: http://www.greenbaumlaw.com/vcard-1999.vcf
From: observium observium-bounces@observium.org On Behalf Of Joey Stanford via observium Sent: Thursday, December 30, 2021 11:13 AM To: Observium observium@observium.org Cc: Joey Stanford nv0n@rmham.org Subject: [Observium] Alerting feature request
In our case, it’s usually due to link flapping so the state changes frequently. It’s up now. In 2 minutes, or 2 hours, it’s down again for 2 minutes… it’s very unpredictable. If it’s our systems, we can deal with it but most of the time it’s a WAN link from our upstream terrestrial ISPs. Sometimes it is on us due to weather conditions or solar battery failures.
I would agree with this request…most alerting programs have the concept of “maintenance windows” so I can proactively disable alerts when I am patching or doing some other planned maintenance.
That would be nice.
Tony
I think the third option seems like a good idea. Similar to the idea that you need X bad poll periods before alerting, you also want X good polls before it is considered ok. So basically it will hold the alert state until things are stable.
That will deal with the flappy alert problem pretty well.
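As a rough sketch of that idea, the following Python holds the alert state until enough consecutive polls disagree with it. It is not Observium code; the `FlapDamper` class and its thresholds are hypothetical, and one instance per device would allow the per-device tuning asked for in the original request.

```python
class FlapDamper:
    """Hypothetical helper: require N consecutive polls before changing state."""

    def __init__(self, bad_needed=1, good_needed=3):
        self.bad_needed = bad_needed      # bad polls needed before alerting
        self.good_needed = good_needed    # good polls needed before recovering
        self.alerting = False
        self.streak = 0                   # consecutive polls contradicting the state

    def update(self, poll_ok):
        """Feed one poll result; return "alert", "recover", or None."""
        contradicts = poll_ok if self.alerting else not poll_ok
        if not contradicts:
            self.streak = 0               # poll agrees with the state: reset
            return None
        self.streak += 1
        needed = self.good_needed if self.alerting else self.bad_needed
        if self.streak < needed:
            return None                   # not stable yet: hold the current state
        self.alerting = not self.alerting
        self.streak = 0
        return "alert" if self.alerting else "recover"
```

With `good_needed=3`, a link that comes back for one or two polls and then drops again never triggers a recovery message, so the alert simply stays open until the link is stable.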
participants (4)
- David Melczer
- Joey Stanford
- Milton Ngan
- Tony Guadagno