check_cciss - HP and Compaq Smart Array Hardware status

Rating: 24 votes
Favoured: 4
Current Version: 1.11
Last Release Date: 2012-07-16
Compatible With:
  • Nagios 1.x
  • Nagios 2.x
  • Nagios 3.x
License: GPL
Hits: 71954
Files:
File                Description
check_cciss-1.8     check_cciss 2008/10/06 (v.1.8)
check_cciss-1.9     check_cciss 2012/03/06 (v.1.9)
check_cciss-1.10    check_cciss 2012/04/04 (v.1.10)
check_cciss-1.11    check_cciss 2012/07/16 (v.1.11)
HP Smart Array Hardware status plugin for Nagios 1.x/2.x/3.x
This plugin checks the hardware status of Smart Array controllers (array, controller, cache, disk, battery, etc.) using the HP Array Configuration Utility CLI (hpacucli).

Examples:

./check_cciss -v
RAID OK: Smart Array 6i in Slot 0 array A logicaldrive 1 (67.8 GB, RAID 1+0, OK) (Controller Status: OK Cache Status: OK Battery Status: OK)

./check_cciss -v -p
RAID OK: Smart Array 6i in Slot 0 (Embedded) array A logicaldrive 1 (33.9 GB, RAID 1, OK)
physicaldrive 2:0 (port 2:id 0 , Parallel SCSI, 36.4 GB, OK)
physicaldrive 2:1 (port 2:id 1 , Parallel SCSI, 36.4 GB, OK)
physicaldrive 1:5 (port 1:id 5 , Parallel SCSI, 72.8 GB, OK, spare)
[Controller Status: OK Cache Status: OK Battery/Capacitor Status: OK]

./check_cciss
RAID OK

More examples:

RAID CRITICAL - HP Smart Array Failed: Smart Array 6i in Slot 0 array A (failed) logicaldrive 1 (67.8 GB, 1+0, Interim Recovery Mode)

RAID WARNING - HP Smart Array Rebuilding: Smart Array 6i in Slot 0 array A logicaldrive 1 (67.8 GB, 1+0, Rebuilding)
Reviews (14)
by leprasmurf, December 1, 2011
1 of 1 people found this review helpful
Has worked well for my purposes. However, if there's a firmware upgrade, the check fails with "RAID UNKNOWN - /usr/sbin/hpacucli did not execute properly : Error: The controller identified by "chassisname=a" was not detected."

The firmware update text is falsely matching the egrep's regex. I made the following change to lines 215 and 217:

original
... | egrep -v "Slot" | ...

modified
... | egrep -v -e "Slot" -e "scenario" | ...
Owner's reply

Fixed! Thanks

by jisse44, June 29, 2011
1 of 1 people found this review helpful
Hi, very good plugin.

I just added lines to report which physical drive is down or rebuilding, after line 210 of v1.8:

check2c=`sudo -u root $hpacucli controller slot=$slot physicaldrive all show 2>&1 | grep '\(Failed\|Rebuilding\)' | awk '{print $1, $2}'`
status=$?
if test ${status} -ne 0; then
    echo "RAID UNKNOWN - $hpacucli did not execute properly : "${check2c}
    exit $STATE_UNKNOWN
fi
check2="$check2$check2b -> /!\ $check2c"
Owner's reply

Fixed! Thanks
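
One caveat with the snippet above: in a plain POSIX shell, `$?` after a pipeline holds the exit status of the last command in the pipeline (here `awk`), not of `hpacucli`. A minimal sketch of one way around that is to capture the output first and filter it afterwards. The names `fake_tool` and `run_and_filter` are hypothetical and used only for illustration; `fake_tool` stands in for the real hpacucli call, and the lines it prints are made up.

```shell
#!/bin/sh
# Hypothetical stand-in for "$hpacucli controller slot=$slot physicaldrive all show".
# The lines it prints are illustrative only, not real hpacucli output.
fake_tool() {
    if [ "$1" = "broken" ]; then
        echo "Error: tool failed"
        return 1
    fi
    echo "physicaldrive 2:0 (port 2:id 0 , Parallel SCSI, 36.4 GB, Failed)"
    echo "physicaldrive 2:1 (port 2:id 1 , Parallel SCSI, 36.4 GB, OK)"
}

# Run the command on its own so that $? is the command's own status,
# then filter the captured text in a second step.
run_and_filter() {
    raw=$(fake_tool "$1" 2>&1)
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "RAID UNKNOWN - command did not execute properly : $raw"
        return 3    # 3 is the Nagios UNKNOWN state
    fi
    # Keep only the drive identifier of failed or rebuilding drives.
    echo "$raw" | grep -E 'Failed|Rebuilding' | awk '{print $1, $2}'
}

run_and_filter ok
```

In the real plugin, the `fake_tool` call would be replaced by the actual hpacucli invocation and `return 3` by `exit $STATE_UNKNOWN`.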

by Peter, April 17, 2012
0 of 1 people found this review helpful
On failure this returns "exit $STATE_CRITICAL"

But $STATE_CRITICAL is not defined, so the return status is always good. Only the Status Information text changes.
Owner's reply

Peter, the "$STATE_*" variables are defined in the Nagios "utils.sh" (see the include at line 134 of check_cciss-1.10).
The script has worked correctly since 2005 (v.1.0) with the same states!
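
For readers with the same question: the standard Nagios plugin return codes are 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN), and the plugin gets the `$STATE_*` names from `utils.sh`. If such a variable were ever empty, `exit $STATE_CRITICAL` would expand to a bare `exit`, which returns the status of the last command run. A minimal sketch of a defensive fallback, assuming `utils.sh` might not have been sourced, could look like this. The `state_code` helper is hypothetical and not part of check_cciss.

```shell
#!/bin/sh
# Fallback values matching the standard Nagios plugin exit codes.
# The ":=" expansion assigns only when the variable is unset or empty,
# so values already provided by utils.sh are left untouched.
: "${STATE_OK:=0}"
: "${STATE_WARNING:=1}"
: "${STATE_CRITICAL:=2}"
: "${STATE_UNKNOWN:=3}"

# Hypothetical helper (not part of check_cciss): map a state name to its code.
state_code() {
    case "$1" in
        OK)       echo "$STATE_OK" ;;
        WARNING)  echo "$STATE_WARNING" ;;
        CRITICAL) echo "$STATE_CRITICAL" ;;
        *)        echo "$STATE_UNKNOWN" ;;
    esac
}

echo "CRITICAL maps to exit code $(state_code CRITICAL)"
```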

The -p switch doesn't seem to work. How can I fix this? I'd like to get the status of the physical disks in the output. I've added -p to my command, but I still get the same result as with -v alone, without the status of the physical disks.

./check_cciss-1.11 -v
RAID CRITICAL - HP Smart Array Failed: Smart Array P400 in Slot 1 Controller Status: OK Cache Status: Temporarily Disabled Battery/Capacitor Status: Failed (Replace Batteries/Capacitors)

./check_cciss-1.11 -v -p
RAID CRITICAL - HP Smart Array Failed: Smart Array P400 in Slot 1 Controller Status: OK Cache Status: Temporarily Disabled Battery/Capacitor Status: Failed (Replace Batteries/Capacitors)
We have a few DL580 G5s with the RAID controller in slot 11, and this check command doesn't work there: the grep statement doesn't look for more than one character.

[nrpe@CCNETENGDB2] [/usr/lib64/nagios/plugins] > ./check_cciss
RAID UNKNOWN - /usr/sbin/hpacucli did not execute properly : Error: The controller identified by "slot=1" was not detected.

On line 256, I added a + to the pattern, changing "Slot \w" to "Slot \w+". After adding this, the slot is properly identified and everything works fine!

# Get "Slot" & exclude slot needed
if [ "$EXCLUDE_SLOT" = "1" ]; then
    slots=`echo ${check} | egrep -o "Slot \w+" | awk '{print $NF}' | grep -v "$excludeslot"`
else
    slots=`echo ${check} | egrep -o "Slot \w+" | awk '{print $NF}'`
fi

--Joe
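
The effect of that one-character change can be seen on a sample line. This is a small sketch, assuming GNU grep (for `-o` and `\w`); the `extract_slots` helper and the "Slot 11" input line are made up for illustration, modelled on the plugin output shown on this page.

```shell
#!/bin/sh
# Hypothetical helper: extract slot identifiers from a status line
# using the regular expression given as the first argument.
extract_slots() {
    pattern=$1
    line=$2
    echo "$line" | grep -E -o "$pattern" | awk '{print $NF}'
}

# Illustrative input only, not captured from a real controller.
sample="Smart Array P400 in Slot 11"

old=$(extract_slots 'Slot \w' "$sample")    # exactly one word character
new=$(extract_slots 'Slot \w+' "$sample")   # one or more word characters

echo "old pattern: $old"
echo "new pattern: $new"
```

With the old pattern the slot is truncated to 1, which matches the "slot=1 was not detected" error above; the new pattern yields 11.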
Great check. Works like a champ out of the box. I patched it to auto-detect the hpsa driver. We have a mix of cciss and hpsa.

--- check_cciss-1.11 2013-03-27 12:13:13.732582522 -0700
+++ check_cciss 2013-03-27 11:42:54.888555702 -0700
@@ -209,7 +209,7 @@
done

# Use HPSA driver (Hewlett Packard Smart Array)
-if [ "$HPSA" = "1" ]; then
+if [ "$HPSA" = "1" -o -d /sys/bus/pci/drivers/hpsa ]; then
COMPAQPROC="/proc/scsi/scsi"
fi
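
The same auto-detection idea can be written as a small standalone function. This is only a sketch: `detect_smart_array_driver` and its optional path-prefix argument are hypothetical (the prefix exists so the logic can be tried against a scratch directory), while the paths are the ones named in the patch above and in the plugin's debug output.

```shell
#!/bin/sh
# Hypothetical helper: report which Smart Array driver appears to be present.
# $1 is an optional prefix prepended to every path (empty on a real system).
detect_smart_array_driver() {
    root=${1:-}
    if [ -d "$root/sys/bus/pci/drivers/hpsa" ]; then
        echo "hpsa"
    elif ls "$root"/proc/driver/cciss/cciss* >/dev/null 2>&1; then
        echo "cciss"
    elif ls "$root"/proc/driver/cpqarray/ida* >/dev/null 2>&1; then
        echo "cpqarray"
    else
        echo "none"
    fi
}

driver=$(detect_smart_array_driver)
echo "Detected driver: $driver"
```

Checking for hpsa first mirrors the patch, where the presence of the hpsa driver directory takes the same branch as the -s option.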
Please change the following lines if your controller is in a two-digit slot:

if [ "$EXCLUDE_SLOT" = "1" ]; then
    # slots=`echo ${check} | egrep -o "Slot \w" | awk '{print $NF}' | grep -v "$excludeslot"`
    slots=`echo ${check} | egrep -o "Slot \w*" | awk '{print $NF}' | grep -v "$excludeslot"`
else
    # slots=`echo ${check} | egrep -o "Slot \w" | awk '{print $NF}'`
    slots=`echo ${check} | egrep -o "Slot \w*" | awk '{print $NF}'`
fi
by GldRush98, July 6, 2012
Worked "out of the box" on CentOS 5.8 with a Smart Array P400. Thanks a lot!
I can't get this to work on any of my P410i controllers. The output is the following:

username@Server:~$ /usr/lib/nagios/plugins/check_cciss -d
### Check if "HP Smart Array" (/proc/driver/cciss/cciss) is present >>>
cat: /proc/driver/cciss/cciss*: No such file or directory

### Check if "HP Smart Array" (/proc/driver/cpqarray/ida) is present >>>
cat: /proc/driver/cpqarray/ida*: No such file or directory

RAID UNKNOWN - HP Smart Array not found
Owner's reply

Hi! Try -s (detect the controller with the HPSA driver, Hewlett Packard Smart Array).

by jgh2008, June 11, 2012
Why doesn't it work on my HP DL385 G7 with CentOS 6.2? When I run the check_cciss command, it tells me "RAID UNKNOWN - HP Smart Array not found". Who can help me?
PS: the RAID hardware is a P410i.
Owner's reply

Hi! Try -s (detect the controller with the HPSA driver, Hewlett Packard Smart Array).

I'm sorry, I'm new to Linux and Nagios, so this is probably a dumb question. Can someone point me in the right direction on how to use this plugin to check a server that is being monitored by my Nagios server?
Owner's reply

Hi Andrew, the script is simple to set up. You can ask your question on the Nagios Forum http://support.nagios.com/forum/ or a similar forum/mailing list. Regards

Updated to check_cciss 1.9 (see www.monitoringexchange.org if not present here)

- Increased debug verbosity
- Added arguments to detect controller with HPSA driver (Hewlett Packard Smart Array) (-s)
- Recognize required firmware upgrades
- Don't confuse messages about a new firmware with a chassis-error
- Check physical drives for predicted failures
- Added arguments to show detail for physical drives (-p)
- Check the state of the cache (a dead battery will turn the cache off)

Happy Nagios ;-)
Owner's reply

Updated to 1.10

Happy Nagios! :-)

by jbroome, December 6, 2011
If you change the grep that jisse44 suggests from "Failed" to "Fail", you'll pick up a drive status of "Predictive Failure" as well.
Owner's reply

Fixed! Thanks
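
The difference between the two patterns is easy to check on sample text. In this sketch the three drive lines are made up, following the physicaldrive line format shown in the examples on this page, and `count_matches` is a hypothetical helper.

```shell
#!/bin/sh
# Illustrative input only: not captured from a real controller.
sample='physicaldrive 2:0 (port 2:id 0 , Parallel SCSI, 36.4 GB, Failed)
physicaldrive 2:1 (port 2:id 1 , Parallel SCSI, 36.4 GB, Predictive Failure)
physicaldrive 1:5 (port 1:id 5 , Parallel SCSI, 72.8 GB, OK, spare)'

# Hypothetical helper: count the sample lines matching a pattern.
count_matches() {
    printf '%s\n' "$sample" | grep -c -E "$1"
}

strict=$(count_matches 'Failed|Rebuilding')   # misses "Predictive Failure"
loose=$(count_matches 'Fail|Rebuilding')      # matches both failure states

echo "Failed|Rebuilding matches $strict line(s)"
echo "Fail|Rebuilding matches $loose line(s)"
```

The shorter pattern matches both "Failed" and "Failure", which is why it also catches drives flagged with a predicted failure.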

by jdecello, March 20, 2010
This one works out of the box. The check_hparray that is just like this one does not work with nagios3. The check_hparray.pl errors out on arrays of slot=0 (all of mine) and didn't give verbose output.

This one I just did a -v in nrpe.cfg, simple and detailed.... though it doesn't handle UNKNOWN quite right... was still green rather than other. not a big deal....
Owner's reply

Fixed! Thanks