check_hparray
A Nagios plugin written in Bourne shell (sh) that checks HP ProLiant hardware RAID via the HPACUCLI tool. HPACUCLI is part of the HP Support Package and is available from http://www.hp.com.
Tested on Red Hat Enterprise Linux 4; it should work on most *nix-based operating systems.
In your nrpe.cfg add, for example:
command[check_hpraid_slot1]=/opt/nrpe/libexec/check_hparray -s 1
command[check_hpraid_slot2]=/opt/nrpe/libexec/check_hparray -s 2
The result will look something like this in the Nagios status information:
---snipp---
RAID OK - (Smart Array P400 in Slot 1 array A logicaldrive 1 (68.3 GB, RAID 1+0, OK) array B logicaldrive 2 (68.3 GB, RAID 1+0, OK))
---snipp---
NOTE!
HPACUCLI needs administrator rights.
Add this line to /etc/sudoers:
nagios ALL=NOPASSWD: /usr/sbin/hpacucli
Reviews (5)
by dbentley, October 28, 2014
False positives. I had a failed drive, and this script said all was dandy. I found out about the bad drive by seeing the yellow indicator while out in the datacenter. When I double-checked, the script still reported that all was good, while manually running the hpacucli command showed the drive as FAILED.
Use check_cciss instead. It is much more reliable, it even detects failed batteries, and there is no need to set up separate checks for multiple slots.
by lars@kantega.no, February 8, 2012
I've made some changes to make this plugin work on CentOS 5 and CentOS 6. I'm checking DL380 G4, DL360 G5, and DL360 G7 servers.
What I've done:
* Installed ProLiant Support Pack (hpacucli)
* Disabled SELinux
* Changed "recovery" to "recover" in the script, to properly match the rebuilding status
* Added to /etc/sudoers: nagiosuser ALL=(ALL) NOPASSWD:/usr/sbin/hpacucli
Use check_cciss.
check_hparray WILL provide false positives!!
Note to the author - thank you, the tool seems to be the basis for other tools. Someone had to make the first tool. You check for OK before you check for failed :-(
Here we have a server with a failed battery and a failed array, yet check_hparray reports to Nagios that all is well.
[nagios@alpha plugins]$ ./check_hparray -s 1
RAID OK - (Smart Array P800 in Slot 1 array A logicaldrive 1 (68.3 GB, RAID 1, OK) array B (Failed) logicaldrive 2 (1.2 TB, RAID 1+0, Interim Recovery Mode))
[nagios@alpha plugins]$ echo $?
0
Here is another server with a failed array that reports all is well:
-bash-3.2$ ./check_hparray -s 1
RAID OK - (Smart Array P800 in Slot 1 array A (Failed) logicaldrive 1 (68.3 GB, RAID 1, Interim Recovery Mode) array B logicaldrive 2 (1.2 TB, RAID 1+0, OK))
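The ordering problem described above (matching OK before matching a failure) can be illustrated with a minimal sketch. This is not the plugin's actual code: `classify_raid` is a hypothetical helper, and treating "Recover"/"Rebuild" states as WARNING rather than CRITICAL is an assumption. The exit codes follow the usual Nagios convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/sh
# Hypothetical helper (not taken from check_hparray): classify one
# summary line. Failure patterns are tested BEFORE "OK", so a healthy
# logical drive on the same line cannot mask a failed array.
classify_raid() {
    case "$1" in
        *Failed*)            echo "CRITICAL"; return 2 ;;
        # Assumption: recovery/rebuild states are reported as WARNING.
        *Recover*|*Rebuild*) echo "WARNING";  return 1 ;;
        *OK*)                echo "OK";       return 0 ;;
        *)                   echo "UNKNOWN";  return 3 ;;
    esac
}

# Summary line copied from the review above.
line='Smart Array P800 in Slot 1 array A logicaldrive 1 (68.3 GB, RAID 1, OK) array B (Failed) logicaldrive 2 (1.2 TB, RAID 1+0, Interim Recovery Mode)'

rc=0
status=$(classify_raid "$line") || rc=$?
echo "RAID $status (exit code $rc)"
```

With the line from the review, this sketch prints `RAID CRITICAL (exit code 2)` instead of reporting OK.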
hpacucli shows a "Failed" disk, but the plugin reports "RAID OK" and exits with 0.
----- hpacucli output -----
hpacucli controller slot=0 show config
Smart Array P400 in Slot 0 (Embedded) (sn: PAFGL0T9SXU2RB)
array A (SATA, Unused Space: 0 MB)
logicaldrive 1 (232.9 GB, RAID 1, OK)
physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SATA, 250 GB, OK)
physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SATA, 250 GB, OK)
array B (SATA, Unused Space: 0 MB)
logicaldrive 2 (1.8 TB, RAID 1+0, Interim Recovery Mode)
physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SATA, 1 TB, OK)
physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SATA, 1 TB, OK)
physicaldrive 2I:1:5 (port 2I:box 1:bay 5, SATA, 1 TB, OK)
physicaldrive 2I:1:6 (port 2I:box 1:bay 6, SATA, 0 MB, Failed)
------ check_hparray output ------
RAID OK - (Smart Array P400 in Slot 0 (Embedded) array A logicaldrive 1 (232.9 GB, RAID 1, OK) array B logicaldrive 2 (1.8 TB, RAID 1+0, Interim Recovery Mode))
ZsZs
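The output above suggests the plugin summarizes only the array and logicaldrive lines, so the failed physicaldrive never reaches its status text. A rough sketch of a stricter check, assuming the `show config` line format shown in this review, is to flag every logicaldrive or physicaldrive line whose last field is not OK. `list_unhealthy` and `sample_config` are hypothetical names, and the sample text is copied from the review rather than read from a live controller.

```shell
#!/bin/sh
# Hypothetical helper: read "show config" text on stdin and print each
# logicaldrive/physicaldrive line whose final status field is not "OK".
# Leading whitespace is allowed, in case the real output is indented.
list_unhealthy() {
    grep -E '^[[:space:]]*(logicaldrive|physicaldrive) ' |
        grep -v ', OK)[[:space:]]*$'
}

# Sample text copied from the review above (no hpacucli call is made).
sample_config() {
    cat <<'EOF'
Smart Array P400 in Slot 0 (Embedded) (sn: PAFGL0T9SXU2RB)
array A (SATA, Unused Space: 0 MB)
logicaldrive 1 (232.9 GB, RAID 1, OK)
physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SATA, 250 GB, OK)
physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SATA, 250 GB, OK)
array B (SATA, Unused Space: 0 MB)
logicaldrive 2 (1.8 TB, RAID 1+0, Interim Recovery Mode)
physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SATA, 1 TB, OK)
physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SATA, 1 TB, OK)
physicaldrive 2I:1:5 (port 2I:box 1:bay 5, SATA, 1 TB, OK)
physicaldrive 2I:1:6 (port 2I:box 1:bay 6, SATA, 0 MB, Failed)
EOF
}

# grep exits non-zero when nothing is unhealthy, hence "|| true".
bad=$(sample_config | list_unhealthy) || true
if [ -n "$bad" ]; then
    echo "RAID CRITICAL - unhealthy drives:"
    echo "$bad"
else
    echo "RAID OK"
fi
```

On the sample, this flags both logicaldrive 2 (Interim Recovery Mode) and physicaldrive 2I:1:6 (Failed).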
It works, sort of. It worked fine on the command line, even as user nagios (so yes, I did set up /etc/sudoers correctly). However, it didn't work from nagios3 NRPE.
Also, having to hard-code slots isn't very convenient. There is a better check.
I tried another one like this, in Perl, but that one was broken too: it didn't support slot=0.
The best one was check_cciss, written by Simone Rosa. It gave verbose output with -v and scanned all arrays. It is very similar to this one, yet it worked out of the box with nagios3.
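One way to avoid hard-coding slots, sketched here under assumptions, is to extract the slot numbers from a controller listing and loop over them. `list_slots` is a hypothetical helper. It assumes each controller is printed on a line of the form "Smart Array ... in Slot N ...", as in the outputs quoted in these reviews. The sample text is illustrative, and no hpacucli call is made.

```shell
#!/bin/sh
# Hypothetical helper: read controller lines on stdin and print the
# slot number found after " in Slot " on each matching line.
list_slots() {
    sed -n 's/^.* in Slot \([0-9][0-9]*\).*$/\1/p'
}

# Illustrative input built from controller lines quoted in the reviews.
# Assumption: on a real host this text would come from something like
#   sudo /usr/sbin/hpacucli controller all show
sample='Smart Array P400 in Slot 0 (Embedded) (sn: PAFGL0T9SXU2RB)
Smart Array P800 in Slot 1'

for slot in $(printf '%s\n' "$sample" | list_slots); do
    echo "would run: check_hparray -s $slot"
done
```

Note that slot 0 is handled like any other slot, since the number is taken directly from the text.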