Hi,

 

Sorry I may have come across poorly. I was not saying the group method is terrible. I have set it up, and it works.

 

And we definitely do not want to stop polling sensors just because an interface is disabled :)

 

I just meant that some ‘states’ are associative and implied; for example, a shutdown port should never be able to generate alerts, as it is admin-down. It was more of a philosophical question: it is confusing for operators to have to add logic to check things that should be implied by the state.

 

But under the hood, I can appreciate that the wild west that is the OID trees makes this hard.

 

“Things like these generally require a bit of thinking about and design, quick solutions just create horrible unsupportable situations we can never roll back” – Adam, I have never agreed with you more :D haha

 

“Note that the root of all of this is that for most devices, sensors are not directly linked to ports in SNMP. In many cases we're having to do text-based matching to try to guess what port a sensor belongs to. This is relatively recent, too, and all of the alerting and other sensors infrastructure was created before we could match sensors to ports.” – Great, thank you, makes perfect sense.

 

Thanks chaps. Appreciate your thoughts.

Andy.

 

From: Adam Armstrong via observium <observium@observium.org>
Sent: 28 March 2019 14:55
To: Luis Balbinot via observium <observium@observium.org>
Cc: Adam Armstrong <adama@memetic.org>
Subject: Re: [Observium] Low dbm sensor alerts for shutdown interfaces

 

This is something of a sledgehammer solution to the problem. Masking from a group is a better method for the time being, IMO.

 

I'm not sure what the ideal solution would be. We do not want to remove sensors just because a port is disabled. Historical data is still useful in this scenario.

 

Things like these generally require a bit of thinking about and design, quick solutions just create horrible unsupportable situations we can never roll back, which result in people being confused about why things happen or why things disappear and reappear.

 

I think a possible solution would be to add the ability to expose the parent entity's status to the sensor entity, perhaps setting the state to something other than up/down or by having an extra attribute/metric. I'm not sure how best to do that without thinking about it though. 
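As a rough illustration of the idea above, here is a minimal Python sketch of a sensor evaluation that takes the parent entity's status as an extra input. This is not Observium code: the `SensorState` enum, the `PARENT_DISABLED` state and the `evaluate_sensor` function are hypothetical names invented for the example, and the threshold values are borrowed from the Cisco output later in this thread.

```python
from enum import Enum
from typing import Optional


class SensorState(Enum):
    OK = "ok"
    ALERT = "alert"
    PARENT_DISABLED = "parent_disabled"  # hypothetical extra state


def evaluate_sensor(value: float, limit_low: float, limit_high: float,
                    parent_admin_up: Optional[bool]) -> SensorState:
    """Classify a reading; parent_admin_up is None when there is no parent port."""
    if parent_admin_up is False:
        # The reading would still be polled and stored; only its state changes.
        return SensorState.PARENT_DISABLED
    if value < limit_low or value > limit_high:
        return SensorState.ALERT
    return SensorState.OK


# DOM sensor on an admin-down port: no alert state.
print(evaluate_sensor(-40.0, -13.9, 1.9, parent_admin_up=False))
# Same reading on an enabled port: alert.
print(evaluate_sensor(-40.0, -13.9, 1.9, parent_admin_up=True))
# Sensor with no parent port (e.g. a CPU temperature): unaffected.
print(evaluate_sensor(55.0, 0.0, 70.0, parent_admin_up=None))
```

Passing `None` for sensors without a port keeps the behaviour unchanged for other measured entities, which is the concern raised elsewhere in this thread.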

 

Certainly we do not recommend patching things with quick fixes when there are workarounds that can be done using the group system (which is sort of why I created it, to make more complex alerting things possible).

 

Note that the root of all of this is that for most devices, sensors are not directly linked to ports in SNMP. In many cases we're having to do text-based matching to try to guess what port a sensor belongs to. This is relatively recent, too, and all of the alerting and other sensors infrastructure was created before we could match sensors to ports.
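To give a feel for what text-based matching involves, here is a minimal Python sketch. It is only an illustration under assumptions, not how Observium actually does it: the `guess_port` function and the sample sensor descriptions are hypothetical.

```python
import re
from typing import Iterable, Optional


def guess_port(sensor_descr: str, port_names: Iterable[str]) -> Optional[str]:
    """Guess the port a sensor belongs to from its description text.

    Returns the longest port name that appears as a whole token,
    or None when no port name is found.
    """
    best: Optional[str] = None
    for name in port_names:
        # Reject matches embedded in a longer token, so that
        # "Te1/1/2" does not match inside "Te1/1/25".
        pattern = r"(?<![\w/])" + re.escape(name) + r"(?![\w/])"
        if re.search(pattern, sensor_descr, flags=re.IGNORECASE):
            if best is None or len(name) > len(best):
                best = name
    return best


ports = ["Te1/1/2", "Te1/1/25", "Te2/1/1"]
print(guess_port("Te1/1/25 Receive Power Sensor", ports))  # Te1/1/25
print(guess_port("Chassis Temperature Sensor", ports))     # None
```

Even this small example needs a guard against partial matches, which hints at why guessing the relationship from text is fragile across vendors.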

 

And as Mike says, anything we do in this area would also need to take into account that a sensor's measured entity may also be a storage device, or a CPU, or something else.

 

adam.

On 2019-03-28 14:34:09, Luis Balbinot via observium <observium@observium.org> wrote:

I made it clear it was a quick and dirty solution, didn't I? It's not meant to be included in the official code.

 

Your solution works but I'd rather have a global toggle to ignore sensors for shutdown interfaces.

 

Luis

 

On Thu, Mar 28, 2019 at 11:20 AM Mike Stupalov via observium <observium@observium.org> wrote:

Andrew,

 (I'm not Adam, but)
I do not really understand what your problem is now.
I have added a fully working solution for excluding shutdown ports for these sensors.

Ignoring sensors _by default_ based on the status of the associated entity (port)
is a completely incorrect approach.

You are all mostly using DOM sensors (with a port as the measured entity),
but we have other measured entities (and can add more in the future)
where this logic is neither required nor correct for alerting.

Andrew Lemin wrote on 28/03/2019 15:55:

Hi guys,

 

Adam, what is your take on this based on current code quality?

 

I appreciate it may be challenging to extrapolate the foreign keys/relationships between entities, but that is never a good reason not to do something that is appropriate and reasonable.

Fudging checkers with complex logic is dangerous in general and prone to errors, increasing the chance of missed alerts.

 

Will try this workaround for now.

 

Thanks for your time,

Kind regards Andy.

 

From: Mike Stupalov via observium <observium@observium.org>
Sent: 24 March 2019 08:29
To: Observium <observium@observium.org>
Cc: Mike Stupalov <mike@observium.org>
Subject: Re: [Observium] Low dbm sensor alerts for shutdown interfaces

 

Hi,

 If you use the latest pro release, that feature has already been added there:

1. You can create any port group, for example one that excludes all shutdown ports:



2. Now you can add this group to the sensor/status entity alert group as "Sensor Measured Port Group":
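The logic of the two steps above can be sketched in Python roughly as follows. This only models the idea (a port group that excludes shutdown ports, then a sensor alert scoped to sensors whose measured port is in that group); it is not Observium's implementation, and the `Port` and `Sensor` classes, the function names and the port `Te1/1/26` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Port:
    name: str
    admin_up: bool


@dataclass(frozen=True)
class Sensor:
    descr: str
    value: float
    limit_low: float
    limit_high: float
    measured_port: Optional[str] = None  # None for non-port sensors


def non_shutdown_port_group(ports: Iterable[Port]) -> Set[str]:
    """Step 1: a port group containing every port that is not admin-down."""
    return {p.name for p in ports if p.admin_up}


def alerting_sensors(sensors: Iterable[Sensor], port_group: Set[str]) -> List[Sensor]:
    """Step 2: check thresholds only for sensors measured on a port in the group."""
    return [
        s for s in sensors
        if s.measured_port in port_group
        and (s.value < s.limit_low or s.value > s.limit_high)
    ]


ports = [Port("Te1/1/25", admin_up=False), Port("Te1/1/26", admin_up=True)]
sensors = [
    Sensor("Te1/1/25 Rx Power", -40.0, -13.9, 1.9, "Te1/1/25"),
    Sensor("Te1/1/26 Rx Power", -20.0, -13.9, 1.9, "Te1/1/26"),
]
group = non_shutdown_port_group(ports)
print([s.descr for s in alerting_sensors(sensors, group)])  # ['Te1/1/26 Rx Power']
```

In this sketch, sensors with no measured port fall outside the scoped alert entirely, so they would need their own checker.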





Adam Ward via observium wrote on 22/03/2019 19:17:

+1 for please fixing this

 

I really like the ability to create alert checkers that cover everything; creating more exceptions makes alerting more difficult to maintain.

 

Also, Adam, I just purchased an enterprise license, we’re moving forward with observium for our prod environment. Great tool, good name, Adams must think alike 😝

 

 

Adam Ward

Systems Engineer

Shamrock Trading Corporation

Office Phone/Fax: (913) 310-2247

Email: award@rtsfinancial.com

Website: www.shamrocktradingcorp.com

 

From: Richard Savage <richard@zananet.com>
Organization: zanaNET Ltd
Date: Friday, March 22, 2019 at 11:15 AM
To: Observium <observium@observium.org>
Cc: Adam Ward <award@shamrocktradingcorp.com>
Subject: Re: [Observium] Low dbm sensor alerts for shutdown interfaces

 

Hi All

Yes, I agree this is an issue too.  I was trying to get around it by creating a port group of all ports that were in a non-admin-down state and setting the alerts to use that, but I can't seem to find a way to add a port group to the alert sensor.

Can this please be fixed?

Thanks

Richard

 

On 22/03/2019 16:05, Adam Ward via observium wrote:

Andrew-

 

I noticed this too, it was like if the SFP was inserted, it’d still throw an error because the transceiver tx/rx limits were technically still out of spec. There needs to be logic to check for the interface status first, then disable the thresholds/alerts if down.

 

I had just considered removing optics for ports that were shut, but kinda annoying for sure.

 

 

Adam Ward


 

From: observium <observium-bounces@observium.org> on behalf of Andrew Lemin via observium <observium@observium.org>
Reply-To: Observium <observium@observium.org>
Date: Friday, March 22, 2019 at 10:46 AM
To: Andrew Lemin via observium <observium@observium.org>
Cc: Andrew Lemin <AndrewL@4d-dc.com>
Subject: [Observium] Low dbm sensor alerts for shutdown interfaces

 

Hi,

 

Has anyone else seen the issue where after enabling an alert checker for transceiver optics with sensor_value greater @sensor_limit and sensor_value less @sensor_limit_low, alerts are still received for shutdown interfaces?

 

For example, on a Cisco switch, when running ‘show interfaces transceiver detail’, we can see:

           Optical            High Alarm  High Warn  Low Warn   Low Alarm
           Receive Power      Threshold   Threshold  Threshold  Threshold
Port       (dBm)              (dBm)       (dBm)      (dBm)      (dBm)
Te1/1/25   -40.0                 1.9        -1.0        -9.9      -13.9
Te2/1/1    -40.0                 1.9        -1.0        -9.9      -13.9
Etc..

 

Te1/1/25                       admin down     down
Te2/1/1                        admin down     down
Etc
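To make the behaviour concrete, here is a small Python sketch that applies the checker conditions to the readings shown above. It assumes the checker fires when either condition matches; the `OpticalReading` class and `out_of_range` function are hypothetical names for illustration, not Observium code.

```python
from dataclasses import dataclass


@dataclass
class OpticalReading:
    port: str
    rx_dbm: float
    high_alarm: float  # plays the role of @sensor_limit
    low_alarm: float   # plays the role of @sensor_limit_low


def out_of_range(r: OpticalReading) -> bool:
    """True when the value is above the high limit or below the low limit."""
    return r.rx_dbm > r.high_alarm or r.rx_dbm < r.low_alarm


readings = [
    OpticalReading("Te1/1/25", -40.0, 1.9, -13.9),
    OpticalReading("Te2/1/1", -40.0, 1.9, -13.9),
]
alerting = [r.port for r in readings if out_of_range(r)]
print(alerting)  # ['Te1/1/25', 'Te2/1/1']
```

Both ports alert because -40.0 dBm is below the -13.9 dBm low alarm threshold, and nothing in the check looks at the admin state of the port.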

 

The sensors page shows the correct thresholds etc., but shows the interface in red (and hence it is generating alerts through the checker) rather than grey (the port is shut down and has no fibre connected, hence -40 dBm).


 

The same port on the Ports page does show the interface as being grey/shutdown


 

I have no idea how to resolve this and stop getting transceiver alerts for shutdown interfaces. Could this be a bug?

 

Thanks, Andy.

This e-mail communication (including any attachments) is intended only for use by the individual or entity named above and is considered confidential. If you are not the intended recipient, or the employee or agent responsible to deliver it to the intended recipient, you should immediately stop reading this message and delete it from your system. Any unauthorized reading, distribution, copying or other use of this communication (or its attachments) is strictly prohibited.

_______________________________________________
observium mailing list
observium@observium.org
http://postman.memetic.org/cgi-bin/mailman/listinfo/observium




 

--
Mike Stupalov
Observium Limited, http://observium.org

 
