I see the same thing, with interface attributes being set to NULL, on some of my HP 1910 switches.

On Mar 24, 2015 10:02 PM, "Christopher Pole" <chris@apexn.com.au> wrote:

Poor choice of terms on my part regarding discovery; let me try again.

Every now and then Observium will go through and mark every interface attribute on this device as NULL and then shortly thereafter re-learn the attributes, for example:

Time: 2015-03-25 00:56:26
Port: lag2
Message: Interface changed: [ifName] LAG2 -> NULL; [ifAlias] Trunk: sw01 -> NULL; [ifAdminStatus] up -> NULL; [ifOperStatus] up -> NULL; [ifMtu] 9216 -> NULL; [ifSpeed] 2000000000 -> NULL; [ifHighSpeed] 2000 -> NULL; [ifType] ieee8023adLag -> NULL; [ifPhysAddress] 001bed9fa841 -> NULL; [ifPromiscuousMode] true -> NULL; [ifConnectorPresent] false -> NULL

At this point the alert triggers for this interface, as anything over 0bps triggers the >85% util alarm.
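
To illustrate why a NULL or zero speed trips the threshold, here is a minimal sketch of a utilisation check that treats an unknown speed as "no data" rather than as a breach. This is not Observium's alert code; the function names and the 85% default are assumptions for illustration only.

```python
def utilisation_percent(rate_bps, speed_bps):
    """Return utilisation as a percentage, or None when the speed is unknown."""
    # A speed of None (NULL) or 0 means the port speed was not learned,
    # so no meaningful percentage can be computed.
    if not speed_bps:
        return None
    return 100.0 * rate_bps / speed_bps


def should_alert(rate_bps, speed_bps, threshold=85.0):
    """Alert only when a real utilisation figure exceeds the threshold."""
    util = utilisation_percent(rate_bps, speed_bps)
    return util is not None and util > threshold
```

With a guard like this, a port whose speed has briefly gone to NULL would be skipped rather than reported as over 85%.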

Shortly thereafter Observium relearns the interface attributes:

Time: 2015-03-25 01:00:16
Port: lag2
Message: Interface changed: [ifName] -> LAG2; [ifAlias] -> Trunk: sw01; [ifAdminStatus] -> up; [ifOperStatus] -> up; [ifMtu] -> 9216; [ifSpeed] -> 2000000000; [ifHighSpeed] -> 2000; [ifType] -> ieee8023adLag; [ifPhysAddress] -> 001bed9fa841; [ifPromiscuousMode] -> true; [ifConnectorPresent] -> false

At this point the alert sends a recovery message.

The nullification and re-learning of interface attributes happens periodically but with no fixed schedule. Is this possibly something that can be caused by a bulk-walk failure part way through? Or an SNMP timeout?
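
If an incomplete walk is the cause, one possible mitigation is to refuse to overwrite stored attributes when a poll comes back with required fields missing. The sketch below assumes each port's attributes are held as a plain dict; `merge_poll` and the `REQUIRED` tuple are hypothetical names, not part of Observium.

```python
# Attributes that must be present for a poll to count as complete.
# (Hypothetical selection, taken from the event log messages above.)
REQUIRED = ("ifName", "ifOperStatus", "ifSpeed", "ifHighSpeed")


def merge_poll(previous, polled, required=REQUIRED):
    """Return the attributes to store for one port after a poll.

    If any required attribute is missing from the polled data, the walk is
    treated as incomplete and the previous values are kept unchanged.
    """
    if any(polled.get(name) is None for name in required):
        return dict(previous)
    merged = dict(previous)
    merged.update(polled)
    return merged
```

The trade-off is that a port which has genuinely disappeared would keep its stale attributes until a complete poll succeeds.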

Thanks,
Chris

On Wed, Mar 25, 2015 at 10:41 AM, Adam Armstrong <adama@memetic.org> wrote:
Nothing gets rediscovered at any frequency other than the frequency with which you run discovery.php.

Almost everything collected by the poller is collected by snmpbulkwalk.

It's possible that bulkwalk is failing part way through for this device, causing the ifHighSpeed to be set to zero (or set to the value of ifSpeed, which will be just as useless on ports >1Gb).
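
For reference, ifSpeed is a 32-bit gauge in bits per second, so it cannot represent speeds above roughly 4.29 Gb/s, while ifHighSpeed is reported in units of 1,000,000 bits per second. A rough sketch of choosing a usable speed from the two values follows; `effective_speed_bps` is a hypothetical helper, not Observium's actual logic.

```python
GAUGE32_MAX = 4294967295  # largest value a Gauge32 such as ifSpeed can hold


def effective_speed_bps(if_speed, if_high_speed):
    """Pick a port speed in bits per second, or None if neither value is usable."""
    # Prefer ifHighSpeed (Mbps) when it was collected and is non-zero.
    if if_high_speed:
        return if_high_speed * 1_000_000
    # Fall back to ifSpeed only when it is present and not pinned at the
    # Gauge32 maximum, which indicates the real speed is higher.
    if if_speed and if_speed < GAUGE32_MAX:
        return if_speed
    return None
```

For the lag2 port above, both values agree: ifHighSpeed 2000 and ifSpeed 2000000000 each give 2 Gb/s.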

adam.

On 24/03/2015 22:16:43, Christopher Pole <chris@apexn.com.au> wrote:
Hi folks,

We have an alert that triggers when certain ports hit >85% utilisation. It is being triggered by a single device on a regular basis, as that device seems to get rewalked quite often. Each rewalk briefly sets the speed of the monitored ports to 0, which then triggers the alert, as any utilisation over 0 is >85% :).

The device is a Brocade CES, but it is the only one in our fleet that suffers from this rediscover > trigger alert issue, though the others also seem to be rediscovered more often than is necessary.

Any thoughts on how I might figure out what is going on so I can fix it?

Thanks,
Chris

_______________________________________________ observium mailing list observium@observium.org http://postman.memetic.org/cgi-bin/mailman/listinfo/observium
