Hi again,

Please check whether the issue is fixed as of r8033.

Note that the device(s) must be rediscovered after the update:
svn up
./discovery.php -h <device>
(or wait for the next full rediscovery of all devices by cron)

I am not sure this covers all online UPSes. If some UPSes still show an (incorrect) alert event, please send the debug output of:
./discovery.php -d -m sensors -h <device>


On Thu, Aug 4, 2016 at 1:06 PM, Robert Williams <Robert@custodiandc.com> wrote:

Hi,

 

Ok, I get the poor discovery timing issue, understood. So, two ideas then:

 

1.       Provide a ‘knob’ for adjusting this per device (similar to the temperature and speed sensors) so you can set a value for OK, everything else is then considered ‘Failed’.

2.       During discovery, check that all the other sensors are OK. You can then be confident that the current inverter value is in its OK state.

 

Based on the output below, it would be a case of checking that none of the sensors are bad, then, you can set whatever the inverter status is to be OK.

 

 

Since the current method provides a 100% failure for all our units (and sketchy detection for everyone else) – I reckon that could well outweigh the risks with the discovery-based setting method.
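
Idea 2 above could be sketched roughly as follows. This is only an illustration in Python, not Observium code: the function names, the event strings and the shape of `other_sensor_events` are all assumptions made for the example.

```python
from typing import Mapping, Optional


def baseline_ok_value(inverter_value: int,
                      other_sensor_events: Mapping[str, str]) -> Optional[int]:
    """Return the inverter value to treat as 'ok', or None if it is unsafe to decide.

    other_sensor_events is a hypothetical mapping of sensor name -> event
    ('ok', 'warning', 'alert', ...) collected during the same discovery run.
    """
    # Only trust the current reading when there is at least one other sensor
    # and every one of them reports 'ok'.
    if other_sensor_events and all(e == "ok" for e in other_sensor_events.values()):
        return inverter_value
    return None


def classify(inverter_value: int, ok_value: Optional[int]) -> str:
    """Classify a polled value against the baseline chosen at discovery."""
    if ok_value is None:
        return "ignore"  # no trustworthy baseline was recorded
    return "ok" if inverter_value == ok_value else "alert"
```

For example, `baseline_ok_value(1, {"battery": "ok", "input": "ok"})` returns `1`, while any non-OK sensor makes it return `None` so that no baseline is recorded during a fault.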

 

However, a switch (with the same parameters as all the other sensors) such that you can set a ‘Custom Limit’ max/min for it, would be guaranteed to work in 100% of cases. Currently it doesn’t show on the list of sensors, because I guess it isn’t a ‘sensor’:

 

 

Adding to there would be a solution for everyone, but, of course, I have no idea how painful a solution that may be for you?!
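
A per-device override of this kind might look like the sketch below. It is a hypothetical Python illustration, not how Observium stores limits: `StatusOverride` and `event_for` are invented names, and the default mapping simply mirrors what the debug output quoted later in this thread shows (1 'Yes' gives alert, 2 'No' gives ok).

```python
from dataclasses import dataclass
from typing import Optional

# Default mapping as seen in the quoted debug output; assumed to apply
# when the operator has not configured anything for the device.
DEFAULT_EVENTS = {1: "alert", 2: "ok"}


@dataclass
class StatusOverride:
    """Hypothetical per-device setting: the value the operator declares 'ok'."""
    ok_value: int


def event_for(value: int, override: Optional[StatusOverride] = None) -> str:
    """Return the event for a polled status value, honouring an override."""
    if override is not None:
        # Everything other than the declared OK value is treated as failed.
        return "ok" if value == override.ok_value else "alert"
    return DEFAULT_EVENTS.get(value, "ignore")
```

With `StatusOverride(ok_value=1)` set on a line-interactive unit, a polled value of 1 would be reported as ok instead of alert, and other devices would keep the default behaviour.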

 

Cheers :)

 

 

Robert Williams
Custodian Data Centre
Email: Robert@CustodianDC.com
http://www.CustodianDC.com

From: observium [mailto:observium-bounces@observium.org] On Behalf Of Tom Laermans
Sent: 04 August 2016 10:51
To: Observium Network Observation System <observium@observium.org>
Subject: Re: [Observium] Eaton 5130 'Inverter Off' Flipped

 

This is an annoying issue... :-) 'Yes' is good on some models but not on others; 'No' is good on some but not on others.

Those chances are actually not that small. Not small enough to consider doing it that way.

- 05:57 Power goes out
- 06:00 Discovery runs, sees your inverter status, decides this should be the good version
- 07:15 UPS battery runs out, nobody was alerted because the status would be OK
- 08:12 Power comes back
- 08:15 You get an alert e-mail because your inverter is off
- 12:00 Discovery runs, sees your inverter status, decides this should be the good version
- 12:05 You get a recovery email
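
The timeline above can be replayed as a tiny simulation. This is a simplified Python model, not Observium's logic: the observed state is reduced to a single boolean (mains present or not), discovery is assumed to adopt whatever it sees as the 'good' state, and the initial baseline is assumed to be "mains present".

```python
# (time, action, mains_present), with times taken from the timeline above.
# 'change' only marks a state change, 'discovery' re-learns the baseline,
# 'poll' compares the current state against the baseline.
TIMELINE = [
    ("05:57", "change", False),     # power goes out
    ("06:00", "discovery", False),  # discovery adopts the outage as 'good'
    ("07:15", "poll", False),       # battery runs out, still no mains
    ("08:12", "change", True),      # power comes back
    ("08:15", "poll", True),
    ("12:00", "discovery", True),   # discovery adopts normal state as 'good'
    ("12:05", "poll", True),
]


def simulate(timeline, baseline=True):
    """Return (time, event) for every poll, given discovery-learned baselines."""
    results = []
    for time, action, mains_present in timeline:
        if action == "discovery":
            baseline = mains_present
        elif action == "poll":
            results.append((time, "ok" if mains_present == baseline else "alert"))
    return results


print(simulate(TIMELINE))
# [('07:15', 'ok'), ('08:15', 'alert'), ('12:05', 'ok')]
```

The poll at 07:15 reports ok during the outage and the poll at 08:15 alerts after power has returned, which is exactly the inversion described in the timeline.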

Our system is not designed to deal with this at all. Even if we differentiate out the models, they'll still be using the same MIB, and it's the MIB definition that says if something is good/bad/warn/ignore. :-/

Tom

On 04/08/2016 10:32, Robert Williams wrote:

Ahhh… yes that makes perfect sense thanks!

 

So yes the question now becomes – is there a way to set this in Observium?

 

Doing it by model would be painful. Maybe you could set the state seen during a discovery to be the ‘good’ state? The chances of a power failure being present at the exact moment of discovery are relatively small…

 

Robert Williams
Custodian Data Centre
Email: Robert@CustodianDC.com
http://www.CustodianDC.com

From: observium [mailto:observium-bounces@observium.org] On Behalf Of Gumilar Siegfried
Sent: 04 August 2016 09:21
To: Observium Network Observation System <observium@observium.org>
Subject: Re: [Observium] Eaton 5130 'Inverter Off' Flipped

 

Hi,

 

We are using some MGE UPSes that are similar to your Eaton UPS, and we have the same issue.

 

I recently talked to an Eaton engineer, who explained that our UPSes are line-interactive and that the inverter therefore has to be off in normal operation.

For the bigger “online” UPSes, on the other hand, the inverter has to be on.

 

It seems that Observium is configured to treat all Eaton and MGE UPSes as online UPSes.

Maybe there is a way for Observium to detect whether a UPS is online or line-interactive and handle the inverter alert accordingly.
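
A topology-aware mapping of that kind could be sketched as below. This is an illustrative Python example only: how the topology would actually be detected is left open, so the caller has to supply it, and the `Topology` enum and `inverter_off_event` function are invented for the example. The value names (1 'Yes', 2 'No') are taken from the debug output quoted later in this thread; treating "inverter on" as an alert on a line-interactive unit is an assumption.

```python
from enum import Enum


class Topology(Enum):
    ONLINE = "online"
    LINE_INTERACTIVE = "line-interactive"


# Names for the 'Inverter Off' status values, as shown in the debug output.
INVERTER_OFF_NAMES = {1: "Yes", 2: "No"}


def inverter_off_event(value: int, topology: Topology) -> str:
    """Map an 'Inverter Off' value to an event, depending on UPS topology."""
    name = INVERTER_OFF_NAMES.get(value)
    if name is None:
        return "ignore"  # unknown value, do not guess
    inverter_is_off = name == "Yes"
    if topology is Topology.ONLINE:
        # Online UPS: the inverter has to be on in normal operation.
        return "alert" if inverter_is_off else "ok"
    # Line-interactive UPS: the inverter has to be off in normal operation.
    return "ok" if inverter_is_off else "alert"
```

Under these assumptions `inverter_off_event(1, Topology.LINE_INTERACTIVE)` gives `"ok"`, whereas the same value on an online UPS gives `"alert"`.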

 

Regards,

Siegfried

 

From: observium [mailto:observium-bounces@observium.org] On Behalf Of Robert Williams
Sent: Thursday, 04 August 2016 09:58
To: Observium Network Observation System (observium@observium.org)
Subject: [Observium] Eaton 5130 'Inverter Off' Flipped

 

Hi,

 

We’ve got several Eaton 5130 RT 3000 (with extended battery packs) being deployed to some remote locations. One of my guys has been setting them up overnight but has noticed that Observium reports one of the metrics incorrectly for all of the new units.

 

Ironically the inverter status is, well, inverted.

 

 

The platform is detected as:

 

Hardware                5130 RT 3000

Operating system        MGE UPS 6006AC (Firmware: JI)

 

A debug of the status module shows:

 

SQL[UPDATE `status-state` set `status_value` ='2',`status_name` ='No',`status_event` ='ok',`status_last_change` ='1470278633',`status_polled` ='1470295701' WHERE `status_id` = '1289']

SQL RUNTIME[0.00013590s]

Checking (snmp) Inverter Off

 

CMD[/usr/bin/snmpget -v1 -c *** -Pu -OUqnv -m SNMPv2-MIB -M /opt/observium/mibs/rfc:/opt/observium/mibs/net-snmp 'udp':'ups-41.thn.oob.local':'161' .1.3.6.1.4.1.705.1.7.9.0]

 

CMD EXITCODE[0]

CMD RUNTIME[0.0414s]

STDOUT[

1

]

SNMP STATUS[TRUE]

 

RRD CMD[update /opt/observium/rrd/ups-41.thn.oob.local/status-mge-status-state-upsmgOutputInverterOff.1.0.rrd N:1 --daemon unix:/var/run/rrdcached.sock]

RRD RUNTIME[0.0011s]

RRD STDOUT[OK u:0.00 s:0.00 r:0.65]

RRD_STATUS[TRUE]

 

- /opt/observium/includes/polling/status.inc.php:45

/opt/observium/includes/alerts.inc.php:61

================================================

array(

  [status_value]       => int(1)

  [status_name]        => string(3) "Yes"

  [status_name_uptime] => int(17068)

  [status_event]       => string(5) "alert"

))

 

 

Reading the standards for that OID, that does seem to be the wrong way around.

 

After a simulated power failure, you can see it goes to ‘2’ when the power is failed and the inverter is (actually) off.

 

 

As a side note, I noticed that Observium is polling the OIDs for a Merlin Gerin unit (.1.3.6.1.4.1.705.x) when the unit is actually an Eaton.

 

Should it maybe be looking here instead?

 

iso.3.6.1.4.1.534.1.1.1.0 = STRING: "EATON"

iso.3.6.1.4.1.534.1.1.2.0 = STRING: "5130 RT 3000"

iso.3.6.1.4.1.534.1.1.3.0 = STRING: "INV: 6006AC"

iso.3.6.1.4.1.534.1.1.4.0 = INTEGER: 0

iso.3.6.1.4.1.534.1.2.1.0 = INTEGER: 350

iso.3.6.1.4.1.534.1.2.2.0 = INTEGER: 78

iso.3.6.1.4.1.534.1.2.3.0 = INTEGER: 0

iso.3.6.1.4.1.534.1.2.4.0 = INTEGER: 23

iso.3.6.1.4.1.534.1.2.5.0 = INTEGER: 1

iso.3.6.1.4.1.534.1.2.6.0 = ""

iso.3.6.1.4.1.534.1.3.1.0 = INTEGER: 490

iso.3.6.1.4.1.534.1.3.2.0 = Counter32: 0

iso.3.6.1.4.1.534.1.3.3.0 = INTEGER: 1

iso.3.6.1.4.1.534.1.3.4.1.1.1 = INTEGER: 1

iso.3.6.1.4.1.534.1.3.4.1.2.1 = INTEGER: 226

iso.3.6.1.4.1.534.1.3.4.1.3.1 = INTEGER: 0

iso.3.6.1.4.1.534.1.3.4.1.4.1 = INTEGER: 0

iso.3.6.1.4.1.534.1.3.4.1.5.1 = INTEGER: 1

iso.3.6.1.4.1.534.1.3.5.0 = INTEGER: 3

iso.3.6.1.4.1.534.1.4.1.0 = INTEGER: 67

iso.3.6.1.4.1.534.1.4.2.0 = INTEGER: 490

iso.3.6.1.4.1.534.1.4.3.0 = INTEGER: 1

iso.3.6.1.4.1.534.1.4.4.1.1.1 = INTEGER: 1

iso.3.6.1.4.1.534.1.4.4.1.2.1 = INTEGER: 226

iso.3.6.1.4.1.534.1.4.4.1.3.1 = INTEGER: 8

iso.3.6.1.4.1.534.1.4.4.1.4.1 = INTEGER: 1809

iso.3.6.1.4.1.534.1.4.4.1.5.1 = INTEGER: 1

iso.3.6.1.4.1.534.1.4.5.0 = INTEGER: 3

iso.3.6.1.4.1.534.1.4.6.0 = INTEGER: 0

iso.3.6.1.4.1.534.1.4.7.0 = INTEGER: 0

iso.3.6.1.4.1.534.1.4.8.0 = Counter32: 0

iso.3.6.1.4.1.534.1.5.1.0 = INTEGER: 0

iso.3.6.1.4.1.534.1.5.2.0 = INTEGER: 0

iso.3.6.1.4.1.534.1.6.1.0 = INTEGER: 0

iso.3.6.1.4.1.534.1.6.2.0 = INTEGER: 0

iso.3.6.1.4.1.534.1.6.3.0 = INTEGER: 40

iso.3.6.1.4.1.534.1.6.7.0 = INTEGER: 0

iso.3.6.1.4.1.534.1.7.1.0 = Gauge32: 0

iso.3.6.1.4.1.534.1.7.18.0 = Gauge32: 0

iso.3.6.1.4.1.534.1.8.1.0 = INTEGER: 1

iso.3.6.1.4.1.534.1.8.2.0 = INTEGER: 2

iso.3.6.1.4.1.534.1.9.1.0 = INTEGER: 0

iso.3.6.1.4.1.534.1.9.2.0 = INTEGER: 0

iso.3.6.1.4.1.534.1.9.3.0 = INTEGER: 0

iso.3.6.1.4.1.534.1.9.4.0 = INTEGER: 0

iso.3.6.1.4.1.534.1.9.5.0 = INTEGER: 0

iso.3.6.1.4.1.534.1.9.6.0 = INTEGER: 0

iso.3.6.1.4.1.534.1.9.7.0 = INTEGER: 0

iso.3.6.1.4.1.534.1.10.1.0 = INTEGER: 230

iso.3.6.1.4.1.534.1.10.2.0 = INTEGER: 230

iso.3.6.1.4.1.534.1.10.3.0 = INTEGER: 2700

iso.3.6.1.4.1.534.1.10.4.0 = INTEGER: 500

iso.3.6.1.4.1.534.1.10.5.0 = STRING: "08/04/2016 08:34:38"

iso.3.6.1.4.1.534.1.10.6.0 = INTEGER: 160

iso.3.6.1.4.1.534.1.10.7.0 = INTEGER: 294

iso.3.6.1.4.1.534.1.10.8.0 = ""

iso.3.6.1.4.1.534.1.11.1.0 = INTEGER: 5

iso.3.6.1.4.1.534.1.11.2.0 = INTEGER: 2

iso.3.6.1.4.1.534.1.11.3.0 = ""

iso.3.6.1.4.1.534.1.12.1.0 = INTEGER: 2

iso.3.6.1.4.1.534.1.12.2.1.1.1 = INTEGER: 1

iso.3.6.1.4.1.534.1.12.2.1.1.2 = INTEGER: 2

iso.3.6.1.4.1.534.1.12.2.1.2.1 = INTEGER: 1

iso.3.6.1.4.1.534.1.12.2.1.2.2 = INTEGER: 1

iso.3.6.1.4.1.534.1.12.2.1.3.1 = INTEGER: -1

iso.3.6.1.4.1.534.1.12.2.1.3.2 = INTEGER: -1

iso.3.6.1.4.1.534.1.12.2.1.4.1 = INTEGER: -1

iso.3.6.1.4.1.534.1.12.2.1.4.2 = INTEGER: -1

iso.3.6.1.4.1.534.1.12.2.1.5.1 = INTEGER: -1

iso.3.6.1.4.1.534.1.12.2.1.5.2 = INTEGER: -1

iso.3.6.1.4.1.534.1.12.2.1.6.1 = INTEGER: 3

iso.3.6.1.4.1.534.1.12.2.1.6.2 = INTEGER: 6

iso.3.6.1.4.1.534.1.12.2.1.7.1 = INTEGER: 0

iso.3.6.1.4.1.534.1.12.2.1.7.2 = INTEGER: 0

iso.3.6.1.4.1.534.1.12.2.1.8.1 = INTEGER: 0

iso.3.6.1.4.1.534.1.12.2.1.8.2 = INTEGER: 0

iso.3.6.1.4.1.534.1.12.2.1.9.1 = INTEGER: 0

iso.3.6.1.4.1.534.1.12.2.1.9.2 = INTEGER: 0

iso.3.6.1.4.1.534.1.12.2.1.10.1 = Counter32: 0

iso.3.6.1.4.1.534.1.12.2.1.10.2 = Counter32: 0

iso.3.6.1.4.1.534.1.13.1.0 = INTEGER: 32

iso.3.6.1.4.1.534.1.13.2.0 = INTEGER: 0

iso.3.6.1.4.1.534.1.13.3.0 = INTEGER: 0

 

I could be very wrong, maybe there is a valid reason to use the Merlin Gerin OIDs instead (I know they merged, but I thought I’d mention it in light of the odd issue).
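
To see which vendor subtrees a unit actually answers on, the walk output can be grouped by enterprise number. The snippet below is a small stand-alone Python sketch, not part of Observium: the function name and the vendor table are assumptions for illustration, with 534 and 705 labelled as they are identified in this message.

```python
import re
from collections import Counter

# Matches OIDs under the private enterprises arc (1.3.6.1.4.1), written
# either as 'iso.3.6.1.4.1.N...' or as '.1.3.6.1.4.1.N...'.
ENTERPRISE_RE = re.compile(r"^(?:iso|\.?1)\.3\.6\.1\.4\.1\.(\d+)\.")

# Labels as identified in this thread.
VENDORS = {534: "Eaton", 705: "Merlin Gerin (MGE)"}


def enterprises_in_walk(walk_text: str) -> Counter:
    """Count how many OIDs in snmpwalk output fall under each enterprise number."""
    counts = Counter()
    for line in walk_text.splitlines():
        match = ENTERPRISE_RE.match(line.strip())
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    sample = (
        'iso.3.6.1.4.1.534.1.1.1.0 = STRING: "EATON"\n'
        'iso.3.6.1.4.1.534.1.1.2.0 = STRING: "5130 RT 3000"\n'
        ".1.3.6.1.4.1.705.1.7.9.0 = INTEGER: 1\n"
    )
    for number, count in sorted(enterprises_in_walk(sample).items()):
        print(number, VENDORS.get(number, "unknown"), count)
```

Blank lines and lines outside the enterprises arc are skipped, so double-spaced output such as the listing above can be pasted in as-is.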

 

Any ideas anyone? Cheers!

 

 

Robert Williams

Custodian Data Centre

 

 

 

 




_______________________________________________
observium mailing list
observium@observium.org
http://postman.memetic.org/cgi-bin/mailman/listinfo/observium

 






--
Mike Stupalov
http://observium.org/