We have had a persistent issue where the PERC controller status is flagged as failed on almost all of our PowerEdge Rx20, Rx30, Rx40, Rx50 and Rx60 servers. We finally got to the bottom of it today.
Please can the following be corrected.

Root Cause: perccli64 requires an open stdin to function correctly. When it runs with stdin closed (<&-), it returns Status = Failure instead of querying the controller. The Observium agent script closes stdin as a security measure (exec <&- 2>/dev/null) before running all local scripts, so every time the agent ran our hdarray script via the socket, perccli silently failed.

The Fix: Add </dev/null to all four perccli calls in the hdarray script so the binary gets a valid (empty) stdin rather than a closed file descriptor:

    $CLI show nolog </dev/null 2>/dev/null
    $CLI /c${ctrl} show nolog > $RAID 2>&1 </dev/null
    $CLI /c${ctrl}/vall show nolog > $RAID 2>&1 </dev/null
    $CLI /c${ctrl}/eall/sall show nolog > $RAID 2>&1 </dev/null

What made this so hard to diagnose: The script worked perfectly in every direct test (setsid, bash -x, env -i, running as root via ssh) because all those methods leave stdin open. It only failed via the agent socket where stdin was closed - exactly replicating production conditions was key.

Many thanks,
Chris
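
P.S. For anyone who wants to see the difference between a closed stdin and an empty one without a PERC controller to hand, here is a minimal bash sketch. The stdin_state helper is hypothetical (it is not part of the Observium agent or the hdarray script); it only reports whether file descriptor 0 is open, which is the condition perccli64 appeared to be sensitive to in our case.

```shell
# Hypothetical helper: report whether stdin (fd 0) is open.
# Duplicating fd 0 onto a spare descriptor fails with
# "Bad file descriptor" when fd 0 has been closed.
stdin_state() {
    if { true 3<&0; } 2>/dev/null; then
        echo open
    else
        echo closed
    fi
}

# Closed stdin, the state the agent leaves behind after "exec <&-":
stdin_state <&-          # prints: closed

# Valid but empty stdin, which is what the fix supplies:
stdin_state </dev/null   # prints: open
```

Running a local script by hand with <&- appended should be a quick way to approximate what the agent socket does before changing anything in production.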