APC Smart-UPS SRT 6000 Alerting
I've been working on getting this configured and getting my head around these alerts. I now have it alerting me when the UPS goes on battery and when the batteries are marked as bad.
This is specifically tested with an APC Smart-UPS SRT 6000, SKU: SRT6KXLT.
I have three checks:
1: UPS on battery
   Test Condition: status_value eq 3
   Association: Status Type in powernet-upsbasicoutput-state; Device in $device
2: UPS battery needs replacing
   Test Condition: status_value ne 1
   Association: Status Type in powernet-upsbatteryreplace-state; Device in $device
3: UPS battery status
   Test Condition: status_value ne 2
   Association: Status Type in powernet-upsbattery-state; Device in $device
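For anyone who wants to reason about the three checks outside the web UI, here is a minimal sketch of the same logic in Python. It is not Observium's alert syntax: the function name `ups_alerts` and the dictionary layout are hypothetical, and the value meanings in the comments are my reading of the APC PowerNet-MIB enumerations, so verify them against the MIB for your firmware.

```python
def ups_alerts(status):
    """Return the names of the checks that would fire for one poll.

    `status` maps a status type to its integer status_value
    (hypothetical layout, for illustration only).
    """
    fired = []
    # Check 1: output state 3 is assumed to mean onBattery (2 = onLine).
    if status["powernet-upsbasicoutput-state"] == 3:
        fired.append("UPS on battery")
    # Check 2: replace indicator 1 is assumed to mean "no battery needs replacing".
    if status["powernet-upsbatteryreplace-state"] != 1:
        fired.append("UPS battery needs replacing")
    # Check 3: battery state 2 is assumed to mean batteryNormal.
    if status["powernet-upsbattery-state"] != 2:
        fired.append("UPS battery status")
    return fired


healthy = {
    "powernet-upsbasicoutput-state": 2,
    "powernet-upsbatteryreplace-state": 1,
    "powernet-upsbattery-state": 2,
}
on_battery = {**healthy, "powernet-upsbasicoutput-state": 3}

print(ups_alerts(healthy))
print(ups_alerts(on_battery))
```

Note that check 1 tests for one specific bad value, while checks 2 and 3 test for anything other than the one known-good value, so they also fire on states I have not seen yet.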
I hope this will be helpful as I didn't see much come up in my searching before.
I don't have any APC units to test with (we've long ago given up on them and moved to Eaton), but instead of looking for specific values we let the device decide if something is good or bad.
We have three generic alerts that cover all of our devices:
status_event equals alert
sensor_value greater @sensor_limit AND @sensor_limit ne NULL
sensor_value less @sensor_limit_low AND @sensor_limit_low ne NULL
This will throw an alert if any value is out of range high/low, or if a sensor itself is reporting not normal. On every UPS we've seen, the output status and battery status go not-normal when input power is lost. It might be worth playing around with this to see if you can simplify/genericize your alerting.
We do also have "sensor_value lt 20" with the association of "sensor.sensor_descr equals Battery Runtime Remaining" to alert us to potentially overloaded UPSes, or batteries going bad that haven't tripped the UPS's internal failure threshold.
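A rough sketch of how those generic rules combine, written in Python rather than Observium's own syntax. The function name `generic_alerts`, the dict layout, and the alert labels are hypothetical; only the field names and the threshold of 20 come from the rules above.

```python
def generic_alerts(entity, runtime_floor=20):
    """Return the generic alerts that would fire for one status or sensor.

    `entity` is a plain dict standing in for one polled entity
    (hypothetical layout, for illustration only).
    """
    fired = []
    # Alert 1: the device itself flags the status as not normal.
    if entity.get("status_event") == "alert":
        fired.append("status in alert")
    value = entity.get("sensor_value")
    if value is None:
        return fired
    high = entity.get("sensor_limit")
    low = entity.get("sensor_limit_low")
    # Alerts 2 and 3: only compare when a limit is actually set.
    if high is not None and value > high:
        fired.append("above high limit")
    if low is not None and value < low:
        fired.append("below low limit")
    # Extra check: fixed floor on remaining runtime, independent of device limits.
    if entity.get("sensor_descr") == "Battery Runtime Remaining" and value < runtime_floor:
        fired.append("runtime below floor")
    return fired


print(generic_alerts({"status_event": "alert"}))
print(generic_alerts({"sensor_value": 50, "sensor_limit": 40}))
print(generic_alerts({"sensor_value": 12, "sensor_descr": "Battery Runtime Remaining"}))
```

The NULL guards matter: a sensor with no limit set is simply never compared, so it cannot raise a false alert.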
Spencer J. Ryan | Manager, Technology and Infrastructure
Miller Canfield
T +1.313.496.7979 | F +1.313.496.7500
-----Original Message-----
From: Bryan Fields via observium observium@lists.observium.org
Sent: Thursday, August 24, 2023 3:51 PM
To: observium@lists.observium.org
Cc: Bryan Fields Bryan@bryanfields.net
Subject: [Observium] APC Smart-UPS SRT 6000 Alerting
On 8/24/23 4:18 PM, Ryan, Spencer J. wrote:
> I don't have any APC units to test with (we've long ago given up on them and moved to Eaton), but instead of looking for specific values we let the device decide if something is good or bad.
> We have three generic alerts that cover all of our devices:
> status_event equals alert
> sensor_value greater @sensor_limit AND @sensor_limit ne NULL
> sensor_value less @sensor_limit_low AND @sensor_limit_low ne NULL
+1. This will certainly help others running these UPSes.
> This will throw an alert if any value is out of range high/low, or if a sensor itself is reporting not normal. On every UPS we've seen, the output status and battery status go not-normal when input power is lost. It might be worth playing around with this to see if you can simplify/genericize your alerting.
I was trying something like this, but was unable to get it to work. The other issue I have is that the UPS has had its internal batteries replaced with a new external pack I made from sixteen 12 V 50 Ah batteries. Unfortunately, this causes an alarm about being unable to calculate runtime when in float mode, so just a normal alarm won't work.
> We do also have "sensor_value lt 20" with the association of "sensor.sensor_descr equals Battery Runtime Remaining" to alert us to potentially overloaded UPSes, or batteries going bad that haven't tripped the UPS's internal failure threshold.
Same issue as above. The newer APC UPSes have some serial interface between the batteries, and there is no longer a manual setting in the UPS for the number of packs. What's worse, there is no intelligence in the charger and no equalizer across the batteries in series, so it just cooks them, and at 3 years you have to replace them. Then the connectors they use require you to buy the replacements from the manufacturer at about 3x the normal price. It's a racket.
I've tried the Tripp-lite/Eaton units before, but they do the same thing, and what's worse, if the batteries drain completely and the UPS shuts down, it requires a person to power it on when power is restored. They will not start up automatically.
It's really a shame no one makes a UPS that is designed for external batteries :(
Personally I'd suggest using both styles of alerts (at least for the devices where it makes sense), mainly for the (albeit rare) cases where a device itself stops reporting sane limits. I have seen this a few times, though to be fair not on Eaton UPSes.
One (non-UPS) example of this:
An Aruba 2530-8-PoEP switch, reporting HP-ICF-POE-MIB-hpicfPoePethPsePortCurrent ( .1.3.6.1.4.1.11.2.14.11.1.9.1.1.1.1.X.X ). It reports 0.12A in use against a warning threshold of 0.11A (really this should be 0.6A for PoE 802.3at, or 0.35A for 802.3af).
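To show why the two styles can disagree, here is a small sketch in Python using the numbers from the switch above. The function name `check_both` and the returned keys are hypothetical, and picking 0.6A as the fixed limit assumes the port is 802.3at.

```python
def check_both(value, device_limit, fixed_limit):
    """Evaluate one reading against the device-reported limit and a fixed limit.

    Returns which style of check would fire (hypothetical helper, for illustration).
    """
    return {
        # Device-decides style: skipped when the device reports no limit.
        "device_limit": device_limit is not None and value > device_limit,
        # Specific-value style: a limit we chose ourselves.
        "fixed_limit": value > fixed_limit,
    }


# 0.12A drawn, device claims a 0.11A warning threshold, fixed limit of 0.6A.
result = check_both(value=0.12, device_limit=0.11, fixed_limit=0.6)
print(result)
```

Here the device-limit check fires on a perfectly healthy port while the fixed check stays quiet, which is the kind of mismatch that running both styles makes visible.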
Regards,
James Tandy
TandyUK Servers Limited
Tel: 01903 247 011
Www: http://www.tandyukservers.co.uk
Email: support@tandyukservers.co.uk
TandyUK Servers Limited
Registered in England and Wales, Company number 8314911
VAT Registered in the UK, number 182 0661 19
Registered Office: Amelia House, Crescent Road, Worthing, BN11 1QR
On 24/08/2023 22:19, Bryan Fields via observium wrote:
participants (3)
- Bryan Fields
- James Tandy
- Ryan, Spencer J.