I have Juniper SRX1500 firewalls for which Observium is sending CPU alerts (example included below) that, according to feedback from Juniper, are false positives.
More specifically, I was told that the OIDs Observium uses to collect CPU information are not the correct ones for these devices.
Here are the OIDs that I was told should be used to monitor the data plane CPU usage (instead of the standard OIDs that poll the kernel):
.1.3.6.1.4.1.2636.3.39.1.12.1.1.1
Or, more specifically, for forwarding plane flow (SPU) only:
.1.3.6.1.4.1.2636.3.39.1.12.1.1.1.3
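For anyone who wants to compare what these OIDs return against what Observium graphs, here is a minimal sketch in Python. It assumes the net-snmp `snmpwalk` tool is installed and SNMP v2c is enabled on the firewall. The function names and the community string are placeholders of mine and not part of Observium.

```python
import re
import subprocess

# OID as quoted by Juniper above (data plane / SPU monitoring subtree).
SPU_TABLE_OID = ".1.3.6.1.4.1.2636.3.39.1.12.1.1.1"

# Matches numeric snmpwalk output lines such as ".1.3.6... = Gauge32: 7".
LINE_RE = re.compile(r"^(\.[\d.]+) = ([\w-]+): (.*)$")


def build_walk_command(host, community, oid=SPU_TABLE_OID):
    """Return the snmpwalk argument list; -On prints numeric OIDs."""
    return ["snmpwalk", "-v2c", "-c", community, "-On", host, oid]


def parse_walk(output):
    """Turn snmpwalk text output into a {oid: value} dict of strings."""
    rows = {}
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            oid, _type, value = match.groups()
            rows[oid] = value.strip()
    return rows


def walk(host, community, oid=SPU_TABLE_OID):
    """Run snmpwalk against the device and return the parsed rows."""
    result = subprocess.run(
        build_walk_command(host, community, oid),
        capture_output=True, text=True, check=True,
    )
    return parse_walk(result.stdout)
```

Calling `walk("host.mine", "public")` (with the real community string in place of `public`) should list every value under the subtree, which can then be checked against the processor_usage figure in the alert.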
Here is an Observium alert example:
Alert | Juniper Firewall CPU Usage is over 40%!
Entity | FPC: FEB @0/*/*
Conditions | processor_usage gt 40 (49)
Metrics | processor_usage = 49
Duration | 4m 19s (2017-10-22 17:22:53)

Device
Device | host.mine
Hardware | SRX1500
Operating System | Juniper JunOS 15.1X49-D50.3 Internet Router
Location | lab
Uptime | 27 days, 12h 3m 4s

The above alert fires several times a day per device. I'm currently running Observium version 17.10.8921 (Linux 2.6.32-642.13.1.el6.x86_64 [amd64]), but I have seen this, with varying frequency, on every other version I have run.
I would appreciate any information on a potential fix or workaround, or feedback on the Juniper response.
Thanks,
Al.