check_cciss - HP and Compaq Smart Array Hardware status
Current Version
1.11
Last Release Date
2012-07-16
Compatible With
- Nagios 1.x
- Nagios 2.x
- Nagios 3.x
Owner
License
GPL
Hits
71954
Files:
| File | Description |
|---|---|
| check_cciss-1.8 | check_cciss 2008/10/06 (v.1.8) |
| check_cciss-1.9 | check_cciss 2012/03/06 (v.1.9) |
| check_cciss-1.10 | check_cciss 2012/04/04 (v.1.10) |
| check_cciss-1.11 | check_cciss 2012/07/16 (v.1.11) |
This plugin checks the hardware status of Smart Array controllers (array, controller, cache, disk, battery, etc.) using the HP Array Configuration Utility CLI.
Examples:
./check_cciss -v
RAID OK: Smart Array 6i in Slot 0 array A logicaldrive 1 (67.8 GB, RAID 1+0, OK) (Controller Status: OK Cache Status: OK Battery Status: OK)
./check_cciss -v -p
RAID OK: Smart Array 6i in Slot 0 (Embedded) array A logicaldrive 1 (33.9 GB, RAID 1, OK)
physicaldrive 2:0 (port 2:id 0 , Parallel SCSI, 36.4 GB, OK)
physicaldrive 2:1 (port 2:id 1 , Parallel SCSI, 36.4 GB, OK)
physicaldrive 1:5 (port 1:id 5 , Parallel SCSI, 72.8 GB, OK, spare)
[Controller Status: OK Cache Status: OK Battery/Capacitor Status: OK]
./check_cciss
RAID OK
More examples:
RAID CRITICAL - HP Smart Array Failed: Smart Array 6i in Slot 0 array A (failed) logicaldrive 1 (67.8 GB, 1+0, Interim Recovery Mode)
RAID WARNING - HP Smart Array Rebuilding: Smart Array 6i in Slot 0 array A logicaldrive 1 (67.8 GB, 1+0, Rebuilding)
Reviews (14)
Has worked well for my purposes. However, if there's a firmware upgrade, the check fails with "RAID UNKNOWN - /usr/sbin/hpacucli did not execute properly : Error: The controller identified by "chassisname=a" was not detected."
The firmware update text falsely matches the egrep regex. I made the following change to lines 215 and 217:
original
... | egrep -v "Slot" | ...
modified
... | egrep -v -e "Slot" -e "scenario" | ...
Owner's reply
Fixed! Thanks
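The filter change from the review above can be tried on its own. This is only a minimal sketch: the function names and the three input lines are made up for illustration and are not real hpacucli output, and `grep -E` stands in for `egrep`, which newer GNU grep releases flag as obsolescent.

```shell
#!/bin/sh
# Original filter: drops only the lines that mention "Slot".
drop_slot_lines() {
    grep -E -v "Slot"
}

# Modified filter from the review: also drops lines that mention "scenario".
drop_noise_lines() {
    grep -E -v -e "Slot" -e "scenario"
}

# Made-up input lines, for illustration only.
sample_lines() {
    printf '%s\n' \
        'line with Slot 0' \
        'notice about an upgrade scenario' \
        'other line'
}

echo "original filter keeps:"
sample_lines | drop_slot_lines
echo "modified filter keeps:"
sample_lines | drop_noise_lines
```

With several `-e` patterns, `grep -v` drops a line as soon as any one of the patterns matches it.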
Hi, very good plugin.
I just added lines to report which physical drive is down or rebuilding, after line 210 of v1.8:
check2c=`sudo -u root $hpacucli controller slot=$slot physicaldrive all show 2>&1 | grep '\(Failed\|Rebuilding\)' | awk '{print $1, $2}'`
status=$?
if test ${status} -ne 0; then
echo "RAID UNKNOWN - $hpacucli did not execute properly : "${check2c}
exit $STATE_UNKNOWN
fi
check2="$check2$check2b -> /!\ $check2c"
Owner's reply
Fixed! Thanks
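One detail of the snippet in the review above: after a pipeline, `$?` holds the exit status of the last command in the pipeline (here awk), not of hpacucli, unless the shell runs with a pipefail option. The sketch below illustrates this with `fake_hpacucli`, a hypothetical stand-in that prints an error and fails. It is not the plugin's actual code.

```shell
#!/bin/sh
# Hypothetical stand-in for the hpacucli call: prints an error and fails.
fake_hpacucli() {
    echo 'Error: The controller identified by "slot=1" was not detected.'
    return 1
}

# Variant 1: status taken after the pipeline. This is awk's status.
pipeline_status=0
out=$(fake_hpacucli 2>&1 | awk '{print $1, $2}') || pipeline_status=$?

# Variant 2: capture the raw output first, so the status is the tool's own,
# and filter it in a second step.
direct_status=0
raw=$(fake_hpacucli 2>&1) || direct_status=$?
filtered=$(printf '%s\n' "$raw" | awk '{print $1, $2}')

echo "pipeline status: $pipeline_status"
echo "direct status:   $direct_status"
echo "filtered output: $filtered"
```

Capturing the output before filtering keeps the failure visible, at the cost of one extra variable.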
On failure this returns "exit $STATE_CRITICAL"
But $STATE_CRITICAL is not defined, so the return status is always good. Only the Status Information text changes.
Owner's reply
Peter, the "$STATE_*" variables are defined in the Nagios "utils.sh" (see the include at line 134 of check_cciss-1.10).
The script has worked correctly since 2005 (v.1.0) with the same states!
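Both points in this exchange can be checked in a few lines. The sketch below assumes that utils.sh lives in /usr/lib64/nagios/plugins (the path is an assumption; adjust it for your install) and falls back to the standard Nagios plugin exit codes when the file is missing. The `UTILS_SH` variable and the two `*_status` names are hypothetical.

```shell
#!/bin/sh
# Assumed location of the Nagios utils.sh; override UTILS_SH if it differs.
UTILS_SH="${UTILS_SH:-/usr/lib64/nagios/plugins/utils.sh}"
if [ -r "$UTILS_SH" ]; then
    . "$UTILS_SH"
fi

# Fallbacks: the standard Nagios plugin exit codes.
: "${STATE_OK:=0}" "${STATE_WARNING:=1}" "${STATE_CRITICAL:=2}" "${STATE_UNKNOWN:=3}"

# With the variable unset, "exit $STATE_CRITICAL" expands to a bare "exit",
# which returns the status of the previous command (here "true", so 0).
unset_status=0
( unset STATE_CRITICAL; true; exit $STATE_CRITICAL ) || unset_status=$?

# With the variable defined, the subshell exits with the CRITICAL code.
defined_status=0
( exit "$STATE_CRITICAL" ) || defined_status=$?

echo "undefined variable: exit status $unset_status"
echo "defined variable:   exit status $defined_status"
```

This is why a missing or unreadable utils.sh would produce the symptom described in the review, while a working include gives the correct states.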
by Patrick, May 15, 2013
The -p switch does not seem to work. How can I fix this? I'd like to get the status of the physical disks in the output. I've added -p to my command, but I still get the same result as with -v alone, without the status of the physical disks.
./check_cciss-1.11 -v
RAID CRITICAL - HP Smart Array Failed: Smart Array P400 in Slot 1 Controller Status: OK Cache Status: Temporarily Disabled Battery/Capacitor Status: Failed (Replace Batteries/Capacitors)
./check_cciss-1.11 -v -p
RAID CRITICAL - HP Smart Array Failed: Smart Array P400 in Slot 1 Controller Status: OK Cache Status: Temporarily Disabled Battery/Capacitor Status: Failed (Replace Batteries/Capacitors)
by godish, April 17, 2013
We have a few DL580 G5s with the RAID controller in slot 11, and this check command doesn't work there: the grep statement matches only a single character of the slot number.
[nrpe@CCNETENGDB2] [/usr/lib64/nagios/plugins] > ./check_cciss
RAID UNKNOWN - /usr/sbin/hpacucli did not execute properly : Error: The controller identified by "slot=1" was not detected.
On line 256, I changed "Slot \w" to "Slot \w+". After this change, the slot is properly identified and everything works fine!
# Get "Slot" & exclude slot needed
if [ "$EXCLUDE_SLOT" = "1" ]; then
slots=`echo ${check} | egrep -o "Slot \w+" | awk '{print $NF}' | grep -v "$excludeslot"`
else
slots=`echo ${check} | egrep -o "Slot \w+" | awk '{print $NF}'`
fi
--Joe
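The effect of the one-character change can be seen on a sample line. This is a small sketch: the two helper names are hypothetical, the input lines are modeled on the controller names quoted on this page, and `\w` is a GNU grep extension, so this assumes GNU grep.

```shell
#!/bin/sh
# Original pattern: "\w" matches exactly one word character after "Slot ".
slot_single() {
    echo "$1" | grep -E -o 'Slot \w' | awk '{print $NF}'
}

# Patched pattern: "\w+" matches the whole slot number.
slot_multi() {
    echo "$1" | grep -E -o 'Slot \w+' | awk '{print $NF}'
}

check='Smart Array P400 in Slot 11'
echo "single-character pattern: $(slot_single "$check")"
echo "one-or-more pattern:      $(slot_multi "$check")"
```

For single-digit slots such as "Slot 0" both patterns return the same value, which is why the problem only shows up on controllers in slot 10 or higher.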
by j.mccanta@f5.com, March 27, 2013
Great check. Works like a champ out of the box. I patched it to auto-detect the hpsa driver. We have a mix of cciss and hpsa.
--- check_cciss-1.11 2013-03-27 12:13:13.732582522 -0700
+++ check_cciss 2013-03-27 11:42:54.888555702 -0700
@@ -209,7 +209,7 @@
done
# Use HPSA driver (Hewlett Packard Smart Array)
-if [ "$HPSA" = "1" ]; then
+if [ "$HPSA" = "1" -o -d /sys/bus/pci/drivers/hpsa ]; then
COMPAQPROC="/proc/scsi/scsi"
fi
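The detection idea in this patch can be sketched as a small function. The function name and the `sysfs_root` parameter are hypothetical and exist only so the check can be exercised against a scratch directory. The sketch uses `||` between two tests instead of `-o`, which POSIX marks as obsolescent; the behavior is intended to be the same.

```shell
#!/bin/sh
# Returns success when the HPSA flag is set or the hpsa driver directory
# exists under the given sysfs root (default: /sys).
uses_hpsa() {
    hpsa_flag=$1
    sysfs_root=${2:-/sys}
    [ "$hpsa_flag" = "1" ] || [ -d "$sysfs_root/bus/pci/drivers/hpsa" ]
}

# Same decision as in the patched script, written with the helper.
if uses_hpsa "${HPSA:-0}"; then
    COMPAQPROC="/proc/scsi/scsi"
    echo "hpsa driver assumed, reading $COMPAQPROC"
else
    echo "hpsa driver not detected"
fi
```

On a host with a mix of cciss and hpsa controllers, this keeps the explicit switch working while making it optional where the driver directory is present.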
by nityanaths, November 28, 2012
Please change following lines if your controller has slots in 2 digits:
if [ "$EXCLUDE_SLOT" = "1" ]; then
# slots=`echo ${check} | egrep -o "Slot \w" | awk '{print $NF}' | grep -v "$excludeslot"`
slots=`echo ${check} | egrep -o "Slot \w*" | awk '{print $NF}' | grep -v "$excludeslot"`
else
# slots=`echo ${check} | egrep -o "Slot \w" | awk '{print $NF}'`
slots=`echo ${check} | egrep -o "Slot \w*" | awk '{print $NF}'`
fi
by GldRush98, July 6, 2012
Worked "out of the box" on CentOS 5.8 with a Smart Array P400. Thanks a lot!
by sparkey, June 16, 2012
I can't get this to work on any of my P410i controllers. The output is the following:
username@Server:~$ /usr/lib/nagios/plugins/check_cciss -d
### Check if "HP Smart Array" (/proc/driver/cciss/cciss) is present >>>
cat: /proc/driver/cciss/cciss*: No such file or directory
### Check if "HP Smart Array" (/proc/driver/cpqarray/ida) is present >>>
cat: /proc/driver/cpqarray/ida*: No such file or directory
RAID UNKNOWN - HP Smart Array not found
Owner's reply
Hi! Try -s, which detects controllers that use the HPSA (Hewlett Packard Smart Array) driver.
by jgh2008, June 11, 2012
Why doesn't it work on my HP DL385 G7 with CentOS 6.2? When I run the check_cciss command, it tells me "RAID UNKNOWN - HP Smart Array not found". Can anyone help me?
PS: the RAID hardware is a P410i.
Owner's reply
Hi! Try -s, which detects controllers that use the HPSA (Hewlett Packard Smart Array) driver.
by Andrew, April 25, 2012
I'm sorry, I'm new to Linux and Nagios, so this is probably a dumb question. Can someone point me in the right direction on how to use this plugin to check a server that is monitored by my Nagios server?
Owner's reply
Hi Andrew, the script is simple to set up. You can ask your question on the Nagios Forum (http://support.nagios.com/forum/) or a similar forum or mailing list. Regards
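As a rough starting point for the question above, one common setup is to run the plugin on the monitored server through NRPE. The sketch below only prints the nrpe.cfg line that would register the plugin. The install path and the `nrpe_command_line` helper are assumptions for illustration, and the check_nrpe call appears only as a comment.

```shell
#!/bin/sh
# Assumed install path of the plugin on the monitored server.
PLUGIN="/usr/lib64/nagios/plugins/check_cciss"

# Builds an nrpe.cfg entry of the form: command[<name>]=<path> <arguments>
nrpe_command_line() {
    printf 'command[%s]=%s %s\n' "$1" "$2" "$3"
}

# Line to add to nrpe.cfg on the monitored server (restart NRPE afterwards).
nrpe_command_line check_cciss "$PLUGIN" "-v"

# From the Nagios server the check can then be called through check_nrpe:
#   check_nrpe -H <monitored-host> -c check_cciss
```

The command name in square brackets is what the Nagios server passes to check_nrpe with -c, so the two sides have to agree on it.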
by simonerosa, March 5, 2012
Updated to check_cciss 1.9 (see www.monitoringexchange.org if not present here)
- Increased debug verbosity
- Added arguments to detect controller with HPSA driver (Hewlett Packard Smart Array) (-s)
- Recognize required firmware upgrades
- Don't confuse messages about new firmware with a chassis error
- Check physical drives for predicted failures
- Added arguments to show detail for physical drives (-p)
- Check the state of the cache (a dead battery will turn the cache off)
Happy Nagios ;-)
Owner's reply
Updated to 1.10
Happy Nagios! :-)
by jbroome, December 6, 2011
If you change the grep pattern that jisse44 suggests from "Failed" to "Fail", you'll also pick up a drive status of "Predictive Failure".
Owner's reply
Fixed! Thanks
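The difference between the two patterns is easy to see on sample data. In this sketch the drive lines follow the layout of the -p output shown on this page, but the "Failed" and "Predictive Failure" statuses are made up for the illustration.

```shell
#!/bin/sh
# Sample drive lines; the statuses on lines 2 and 3 are made up.
drives='physicaldrive 2:0 (port 2:id 0 , Parallel SCSI, 36.4 GB, OK)
physicaldrive 2:1 (port 2:id 1 , Parallel SCSI, 36.4 GB, Failed)
physicaldrive 1:5 (port 1:id 5 , Parallel SCSI, 72.8 GB, Predictive Failure)'

# "Failed" matches only the hard failure.
count_failed=$(printf '%s\n' "$drives" | grep -c 'Failed')

# "Fail" matches both "Failed" and "Predictive Failure".
count_fail=$(printf '%s\n' "$drives" | grep -c 'Fail')

echo "lines matching Failed: $count_failed"
echo "lines matching Fail:   $count_fail"
```

The shorter pattern works because "Fail" is a common prefix of both status strings.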
by jdecello, March 20, 2010
This one works out of the box. The similar check_hparray does not work with Nagios 3: check_hparray.pl errors out on arrays with slot=0 (all of mine) and didn't give verbose output.
For this one I just added -v in nrpe.cfg: simple and detailed. It doesn't handle UNKNOWN quite right, though: the service was still green rather than in another state. Not a big deal.
Owner's reply
Fixed! Thanks




