check_emc_clariion.pl maintained by BuddhaBob74

Rating: 25 votes
Favoured: 6
Current Version: 2014-05-06
Last Release Date: 2014-05-06
Compatible With:
  • Nagios 3.x
  • Nagios 4.x
  • Nagios XI
License: GPL
Hits: 103989
Files:
check_emc_clariion.zip - check_emc_clariion.pl maintained by Box293
This plugin allows you to monitor an EMC CLARiiON SAN.

You can monitor the following components of the SAN:
* Storage Processors = Status of each SP
* Storage Processors Information = Gets information on the SP (SP ID, Agent Revision, FLARE Revision, PROM Revision, Model/Type, Memory and Serial Number)
* Storage Processors Busy Percentage = SP Busy % with performance data for graphing purposes
* Disks = Status of the Physical Disks attached in all the Disk Array Enclosures
* Cache = Status of the Read and Write Cache
* Faults = Report any Faults on the SAN
* Percentage Dirty Pages in Cache = % Dirty Pages in Cache with Performance Data for graphing purposes
* Port State = Status of the Ports on an SP
* HBA State = Status of a client's host bus adapter connection
* LUNs = Checks the status of a specific LUN and reports State, ID, Name, Size, Free Space, RAID Group Type and Percentage Rebuilt
* RAID Group = Checks the status of a specific RAID Group and reports the State, ID, RAID Group Type, Logical Size, Free Space, Percentage Defragmentation Complete, Percentage Expansion Complete
* Storage Pool = Returns capacity usage information of the Storage Pool and reports the State, ID, RAID Type, Available Capacity, Consumed Capacity, Subscribed Capacity, Percentage Used and Percentage Free
* Temperature = Gets the inlet air temperature and returns Performance Data for graphing purposes

The monitoring of the EMC CLARiiON is performed by a MODIFIED version of the check_emc_clariion.pl script written by Michael Streb @ NETWAYS GmbH.
I have made many changes to the script since then, so I decided it was best to release it as an alternate version. The version notes below highlight the changes I have implemented.

Since Jan 5 2014, this script has been maintained by Bob Goodfriend, employed at IPSoft in Chicago.
Requirements:
There are a couple of components used that make all of this work.
EMC Linux Navisphere Server Software
* This is the software that communicates with the SAN and is what the plugin uses
Enable SNMP on the SAN
* Navisphere uses this to talk to the SAN
Create a Monitoring Account on the SAN
* For security reasons it's best to create a read only account for Navisphere to use

Download Linux Navisphere Server Software.
* Go here http://powerlink.emc.com
* Access to the Powerlink website requires you to have an account with EMC
* Usually this is provided as part of your support contract with EMC
* Once logged in navigate your way to:
* Support - Software Downloads and Licensing - Downloads J-O - Navisphere Server Software
* Find the section Linux Navisphere Server Software
* You need to download Navisphere Host Agent/CLI (Linux)
* For example: NaviCLI-Linux-64-x86-en_US-7.31.33.0.41-1.x86_64.rpm
* NOTE: There is a 32-bit and 64-bit version, the example above is the 64-bit version.
* If you can't find this download or section then you need to contact EMC as your account will only have access to downloads that your account is registered for.

Installation of Linux Navisphere Server Software.
* Save NaviCLI-Linux-64-x86-en_US-7.31.33.0.41-1.x86_64.rpm to /tmp on your Nagios host.
* Then run:
* yum install NaviCLI-Linux-64-x86-en_US-7.31.33.0.41-1.x86_64.rpm
* This will install the software and any required dependencies.

Enable SNMP on the EMC SAN
* This is done using the Navisphere web console
* Open a web browser to the IP address of one of the SAN's SPs
* Login as an administrator
* Expand the tree of your SAN and select one of the SPs
* Right click the SP and select Properties
* Click the Network tab
* Tick the box Enable/Disable processing of SNMP MIB read requests
* Click Apply and then OK
* Repeat this step for each SP in your SAN
* You can leave Navisphere open as we will use it with the next step

Create a Monitoring Account
* Continuing with your Navisphere web console session
* Click the pull down menu Tools and select Security - User Management...
* Click the Add button
* Username: readonly
* Role: monitor
* Global/Local: global
* Password: type a strong password
* Click OK
* Click Yes to add the new user
* Click OK and then OK again

This completes all the steps required for the plugin to work.

Alternatively, you can use the --secfilepath option, which points the plugin at a directory holding the security credentials in encrypted files. Create the directory first, then run a command to create the security files in it.

* The following example will create the directory /usr/local/nagios/libexec/check_emc_clariion_security_files for storing the security files.
* You will need to change the username and password to match the credentials you use to connect to the EMC storage processor.
* Run these commands to create the security files:
* mkdir /usr/local/nagios/libexec/check_emc_clariion_security_files
* /opt/Navisphere/bin/naviseccli -secfilepath /usr/local/nagios/libexec/check_emc_clariion_security_files -User readonly -Password AStrongPassword -Scope 0 -AddUserSecurity
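A minimal sketch of preparing that directory, assuming a POSIX shell. The function name prepare_secdir and the demo path are made up for illustration, and the 700 mode is a suggestion rather than something the plugin is known to require. One reviewer further down reports that the security files should be created as the account that runs Nagios, so you would typically run this, and the naviseccli command above, as that account.

```shell
# Sketch only: create a private directory for the naviseccli security files.
# prepare_secdir is a hypothetical helper, not part of the plugin.
prepare_secdir() {
    dir=$1
    mkdir -p "$dir" || return 1
    # Owner-only access, since the files hold credentials for the SAN.
    chmod 700 "$dir" || return 1
    printf '%s\n' "$dir"
}

# Example run against a throwaway directory (not the real libexec path).
demo_dir="${TMPDIR:-/tmp}/check_emc_secfiles_demo.$$"
prepare_secdir "$demo_dir"
ls -ld "$demo_dir"
rmdir "$demo_dir"
```

For real use, pass the same path you give to --secfilepath instead of the demo directory.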


Command Line Examples:
Status of All Disks
check_emc_clariion.pl -H 192.168.5.1 -u readonly -p AStrongPassword -t disk
check_emc_clariion.pl -H 192.168.5.1 --secfilepath /usr/local/nagios/libexec/check_emc_clariion_security_files -t disk

Status of Any Faults
check_emc_clariion.pl -H 192.168.5.1 -u readonly -p AStrongPassword -t faults
check_emc_clariion.pl -H 192.168.5.1 --secfilepath /usr/local/nagios/libexec/check_emc_clariion_security_files -t faults

SPA Percentage Busy
check_emc_clariion.pl -H 192.168.5.1 -u readonly -p AStrongPassword -t sp_cbt_busy --sp A --warn 50 --crit 70
check_emc_clariion.pl -H 192.168.5.1 --secfilepath /usr/local/nagios/libexec/check_emc_clariion_security_files -t sp_cbt_busy --sp A --warn 50 --crit 70
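To illustrate what --warn and --crit mean for a check like the one above, here is a rough sketch of how a percentage could be turned into a Nagios state, exit code and performance data string. It follows the usual Nagios plugin conventions (exit 0 = OK, 1 = WARNING, 2 = CRITICAL; text after the pipe is performance data in the form label=value;warn;crit;min;max). The function name, the label sp_busy_demo, and the "at or above the threshold" rule are assumptions for this example; the plugin's own --help describes the rule it actually applies.

```shell
# Sketch: map a percentage to a Nagios state using --warn/--crit style thresholds.
# Assumption: a value at or above a threshold triggers it.
nagios_state() {
    value=$1
    warn=$2
    crit=$3
    if [ "$value" -ge "$crit" ]; then
        state=CRITICAL
        code=2
    elif [ "$value" -ge "$warn" ]; then
        state=WARNING
        code=1
    else
        state=OK
        code=0
    fi
    # Status text, then a pipe, then performance data for graphing.
    # The label "sp_busy_demo" is made up for this example.
    printf '%s - SP busy %s%% | sp_busy_demo=%s%%;%s;%s;0;100\n' \
        "$state" "$value" "$value" "$warn" "$crit"
    return "$code"
}

nagios_state 42 50 70
echo "exit code: $?"
```

Nagios reads the exit code to decide the service state, so the printed word and the return value have to agree.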


Setup Examples:

define command {
command_name check_emc_clariion
command_line $USER1$/check_emc_clariion.pl -H $ARG1$ $ARG2$ $ARG3$ $ARG4$ $ARG5$ $ARG6$ $ARG7$ $ARG8$
}


SPA Percentage Busy
define service {
use generic-service
host_name EMC_SAN
service_description SPA Percentage Busy
check_command check_emc_clariion!$HOSTADDRESS$!-u readonly!-p AStrongPassword!-t sp_cbt_busy!--sp A!--warn 50!--crit 70
max_check_attempts 3
check_interval 3
retry_interval 3
register 1
}


define service {
use generic-service
host_name EMC_SAN
service_description SPA Percentage Busy
check_command check_emc_clariion!$HOSTADDRESS$!--secfilepath /usr/local/nagios/libexec/check_emc_clariion_security_files!-t sp_cbt_busy!--sp A!--warn 50!--crit 70
max_check_attempts 3
check_interval 3
retry_interval 3
register 1
}
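The service definitions above pass their options to the command definition through "!"-separated arguments, which Nagios exposes as $ARG1$, $ARG2$ and so on. The following sketch only mimics that splitting so you can see which option lands in which macro; it is not Nagios code, and the function name show_args is made up.

```shell
# Sketch: show how a check_command string is split into $ARGn$ values.
show_args() {
    old_ifs=$IFS
    set -f                 # no filename globbing while splitting
    IFS='!'
    set -- $1              # split the string on "!"
    IFS=$old_ifs
    set +f
    printf 'command_name = %s\n' "$1"
    shift                  # what remains maps to $ARG1$, $ARG2$, ...
    n=1
    for arg in "$@"; do
        printf '$ARG%s$ = %s\n' "$n" "$arg"
        n=$((n + 1))
    done
}

show_args 'check_emc_clariion!$HOSTADDRESS$!-u readonly!-p AStrongPassword!-t sp_cbt_busy!--sp A!--warn 50!--crit 70'
```

With the first service definition this yields seven arguments, so $ARG8$ in the command definition simply expands to nothing.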


License:
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.

Help:
To see the help type:
check_emc_clariion.pl --help | more

License:
To see the license type:
check_emc_clariion.pl --license | more


Version Notes:
2011-03-24
* Modified plugin to perform a Percentage Of Dirty Pages check that returns performance data (cache_pdp).

2011-03-28
* Modified plugin to perform SP Busy and SP Idle checks that returns performance data (sp_busy and sp_idle).

2011-05-08
* Modified plugin to perform an SP Busy check [that uses the controller busy and idle ticks] and returns performance data (sp_cbt_busy). This is more accurate than the (sp_busy and sp_idle) method.

2011-06-29
* Modified sp_cbt_busy to check for negative numbers in the data obtained from the SAN.

2011-07-04
* Modified sp_cbt_busy to ensure calculated value does not exceed 100%.

2012-03-06
* Modified check_disk to look for Removed drives, this was missing. Also removed a double || symbol in the same section.
* Modified check_portstate to include code supplied from Federspiel Till. Problem occurred when all ports were checked, the error_count was being incorrectly determined.

2012-10-23
* Corrected POD formatting to fix POD ERRORS.
* Added error checking to ensure we are getting expected results from the Navisphere CLI app.

2012-12-05
* Updated plugin to check for navicli or naviseccli, it will use naviseccli if present. Also makes sure that the username and password arguments have been provided. This fixes a problem with newer releases of Navisphere that only come with naviseccli (reported by Charles Breite).

2012-12-11
* Fixed bug in portstate check, it now performs a case-insensitive regex match (reported by Charles Breite).
* Added a check to detect if the user did not provide any options, if not it will display the help.

2012-12-19
* Updated code to prevent errors if required arguments are missing, if so it will display the help.
* Updated the help to include information about Secure vs Non-Secure and also provided several examples.

2013-01-25
* Added Storage Processors Information check
* Added LUNs check
* Added RAID Groups check
* Added functionality that will pause for 7 seconds if an error occurs before showing the help text, this gives you time to read the error message
* Added error checking for "Could not connect to the specified host"
* Updated SP check to account for enclosures which return certain parts (Fans etc) with a status of N/A
* Fixed bug in the disk check that was counting Empty disk slots as disks
* Fixed bug "Illegal division by zero" error when running the sp_cbt_busy check
* Added information to the Help about what states will be returned for each check
* If warn or crit values are incorrect or not present when the arguments are, only an error is displayed, the help is not displayed
* Added full GNU license

2013-01-28
* Fixed RAID Group type being identified as hot_spare instead of 'Hot Spare'

2013-01-30
* Removed space from performance data string for LUN and RAID Group checks

2013-02-09
* Fixed bug in LUN and RAID Group checks, they were not triggering correctly on the warning and critical thresholds
* Updated the help to explain how the warning and critical thresholds are triggered as the existing help was not very clear

2013-03-09
* Plugin updated to incorporate new functionality of using a credentials file instead of supplying a username and password. This code was supplied by Uwe Kirbach
* Fixed a bug that was caused by older versions of perl and the use of switch statements. Changed these switch statements to if elsif statements to allow plugin to run on older versions of perl

2013-12-19
* Added an option to set min spare disk expected (--minspare ) [updated plugin supplied by Yannig Perre]

2014-05-06
* Added a check to get the inlet air temperature as a nagios perf metric (Contributed by Max Vernimmen from www.comparegroup.eu)
* Fixed duplicate port bug when checking just one port. Can now check several specific ports at one time like --port 1,3. (Port fixes contributed by Stanislav German-Evtushenko)
* Added a check for reporting on Storage Pools (requested and tested by Vitaly Burshteyn, tested by Stanislav German-Evtushenko)
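The sp_cbt_busy entries in the notes above (busy and idle ticks, the negative-number check, the 100% cap and the division-by-zero fix) can be sketched roughly as follows. This is an illustration of the arithmetic only: how the plugin samples and stores the tick counters is not shown in this document, and the function name and the choice to return 3 (UNKNOWN) on bad input are assumptions.

```shell
# Sketch: busy percentage from two samples of controller busy/idle tick counters.
# Arguments: previous busy, previous idle, current busy, current idle.
cbt_busy_percent() {
    d_busy=$(( $3 - $1 ))
    d_idle=$(( $4 - $2 ))
    # A negative delta suggests a counter reset or wrap between samples.
    if [ "$d_busy" -lt 0 ] || [ "$d_idle" -lt 0 ]; then
        echo "UNKNOWN: negative tick delta" >&2
        return 3
    fi
    total=$(( d_busy + d_idle ))
    # No ticks elapsed between samples: avoid dividing by zero.
    if [ "$total" -eq 0 ]; then
        echo "UNKNOWN: no ticks elapsed" >&2
        return 3
    fi
    pct=$(( d_busy * 100 / total ))
    # Defensive cap, mirroring the 2011-07-04 note.
    if [ "$pct" -gt 100 ]; then
        pct=100
    fi
    echo "$pct"
}

cbt_busy_percent 1000 9000 1600 9400   # prints 60
```

Working from deltas between two samples, rather than a single reading, is what makes a tick-based figure reflect recent load instead of load since the SP booted.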
Reviews (19)
The plugin command "cache" doesn't seem to work (any more?) with VNX 5200: "Error: getcache command failed"

As a quick hack, I added an option
'T=s' => \$opt_vnx_type,
in the global section and a variable/condition in the sub check_cache:
> my $naviclicmd = "getcache";
> if ($opt_vnx_type =~ m/VNX5200/i) { $naviclicmd = "cache -sp -info"; }
> open (NAVICLIOUT ,"$NAVICLI_CMD -h $opt_host $naviclicmd |");

Now we can use it with both VNX types 5200 and 5300.
by arigaud, October 8, 2015
1 of 1 people found this review helpful
Here is a new version.

Notes: Added a function to check the exit value of commands (check_for_errors was useless).
# Added debug option to display the navicli return.
# Added output option (nagios states, one line stdout for faults only)
# Added Timeout option, default is 10 sec.
I hope you get this message.

The reason you're getting different output in the browser than on the CLI is that you could be using different user accounts for each. In the browser, my best guess is that the plugin is being executed as the 'nagios' user, while you're using your own user account on the CLI.

If you try running the script, from the cli, as the service account that is running Nagios I bet you'll get the same output that your browser is displaying.

It could have something to do with 'secfile'. Create/setup one as the service account used by Nagios and you should be set.

####
Nice script btw, this helps a lot in monitoring our storage devices.
by pc-dok, September 30, 2014
Hi All

I have added a check to this script: I added the option "lun -list -state" as check option -t lun_info! If Box293 is interested, I can upload it!

Regards
Franco
This is a great script and works well apart from the -t sp_info. From the command line I get:

./check_emc_clariion.pl -H ***.***.***.*** --secfilepath /usr/local/nagios/libexec/auth/emc_auth_files -t sp_info
{SP ID:A} {Agent Revision:6.29.5 (0.55)} {FLARE Revision:04.29.000.5.006} {PROM Revision:4.80.00} {Model:CX4-120, Rackmount} {Memory:2.97GB} {Serial Number:****************}

When I look in Nagios I get this

Status Information: {SP ID:} {Agent Revision:} {FLARE Revision:} {PROM Revision:} {Model:, } {Memory:} {Serial Number:}

The returned items are being excluded in the output being parsed to Nagios.

Any thoughts? I am using the latest version of the script.

Thanks and keep up the great work

Chris
by lkieffer, March 6, 2014
1 of 1 people found this review helpful
Hello,

I have added the following code to process the hot spare policy of VNX2:

# add for VNX2
my $policy_line = 0;
open (NAVICLIOUT ,"$NAVICLI_CMD -h $opt_host hotsparepolicy -list |");

while (<NAVICLIOUT>) {
    # First lets check for errors before proceeding
    check_for_errors($_);

    # check for policy lines
    if( $_ =~ m/^Policy ID:/) {
        $policy_line=1;
    }
    if ($policy_line == 1) {
        # check for hot spare lines
        if( $_ =~ m/^Unused disks for hot spares:\s+(.*)$/) {
            $hotspare_count=$hotspare_count+$1;
        }
    }
    # end of section
    if ( $_ =~ m/^\s*$/) {
        $policy_line=0;
    }
}
close (NAVICLIOUT);
From the command line it works perfectly. From the Nagios web interface I get the error "Service check did not exit properly". Any workaround/suggestion?
Hi there
I was trying to set up your script on our Nagios installation. I followed your instructions but was wondering about the version of NaviCLI you mention. The newest version I can find is "NaviCLI-Linux-64-x86-en_US-2.30.15.44-1.x86_64.rpm". You are speaking of a 2.31 version.
I debugged a few of your checks and found the following:
The CLI output of cache_pdp returns no % of dirty pages, only MBs. Your regex looks for %.
Do you have any idea?

Chris
Hi,

I had a problem with checking the last HBA (last in the naviseccli output) of our VNX 5300, because the plugin continued into the section "Information about each SPPORT" and inserted a lot of nonsense information into its output.
I solved it by adding

if ($_ =~ m/Information\sabout\seach\sSPPORT/) {
    $hba_section = 0;
    $hba_port_section = 0;
    $hba_node_line = 0;
}

before

}
close (NAVICLIOUT);

Regards, Bohumil
Owner's reply

This has been included in version 2014-05-06. Cheers

by Koodbook, May 7, 2013
Hi!
This plugin will be very useful for my monitoring platform.

Could someone help me? I have done everything as described, and when I execute:

./check_emc_clariion.pl -H xxx.xxx.xxx.xxx -u USER -p PASSWORD -t disk

Error returned from the Management Server on xxx.xxx.xxx.xxx
Very nice plugin, but we regularly get critical errors because the service check timed out, while we are 100 % sure there are no issues.
Can anyone guide me in a direction how to change the timeout?
Should I change the $clitimeout in the .pl file?
# timeout in seconds for calls to navi(sec)cli
my $clitimeout;
$clitimeout=18;
by Shodan, April 2, 2013
Thank you!

I just had to modify a couple of lines:

I added
# nagios: -epn
just after the interpreter declaration so Nagios won't try to use its embedded perl interpreter (which exits with some compile errors)

I also had to move "Unbound" from line 895 to line 891.
For some reason my AX4-5 reports the Hot Spare disk as Unbound making the script report 0 HS available and triggering a warning.
by giner, March 18, 2013
1 of 1 people found this review helpful
Great plugin!
I have fixed a couple of bugs.

1. No more duplicates when checking one port (with a --port option) and not all of them.
2. It is now able to check several ports at one time. Syntax example: portstate --SP A --port 1,3.


--- check_emc_clariion.pl.orig 2013-03-09 14:54:48.000000000 +0100
+++ check_emc_clariion.pl 2013-03-19 08:25:43.000000000 +0100
@@ -1103,7 +1103,7 @@
$sp_line = 1;
}
# check for requested port id
- if ($opt_port >=0) {
+ if ($opt_port =~ /^\d+$/) {
if( $_ =~ m/^SP\sPort\sID:\s+($opt_port)$/) {
$port_id = $1;
$portstate_line = 1;
@@ -1165,14 +1165,11 @@
$output .= "Connection Type: $type. ";
}
# end of section
- if ( $_ =~ m/^\s*$/) {
- $portstate_line = 0;
- ### $sp_section = 0;
- if ($opt_port >=0 ) {
- $sp_section = 0;
- }
- $sp_line = 0;
- }
+ }
+ if ( $_ =~ m/^\s*$/) {
+ $portstate_line = 0;
+ $sp_section = 0;
+ $sp_line = 0;
}
}
close (NAVICLIOUT);
Owner's reply

This has been included in version 2014-05-06. Cheers

by rafaeljesus, March 15, 2013
I would like to thank the maintainer for the support!
I'm using the latest release and it's working fine! More features like read/write cache, frontend and backend usage would be great, but the CLARiiON AX4 is a low-end storage system and I don't think it's possible.
I was also receiving the following error:

./check_emc_clariion.pl -H XXXXXXX -u XXXXXXX -p XXXXXX -t disk
syntax error at ./check_emc_clariion.pl line 1316, near "){"
syntax error at ./check_emc_clariion.pl line 1336, near "}"
Execution of ./check_emc_clariion.pl aborted due to compilation errors.

Resolution:
Check what version of Perl you're running (terminal command: perl -v). I was on 5.8.8 and was receiving this error. The error seems to come from the Switch.pm module, and I believe it may have been fixed in Perl 5.10.x. However, the Switch module was removed from the core distribution in Perl 5.14. So to get this plugin functioning you can either update Perl to a version just prior to 5.14, or update to the newest version of Perl and install the legacy Switch module separately.

Other than this issue, this plugin is working great. Thanks for all the efforts in maintaining it.

Hope this helps!
by vinz, February 26, 2013
2 of 2 people found this review helpful
Hello, I'm encountering this error when running this plugin on my Nagios:

./check_emc_clariion.pl -H XXXXXXX -u XXXXXXX -p XXXXXX -t disk
syntax error at ./check_emc_clariion.pl line 1315, near "){"
syntax error at ./check_emc_clariion.pl line 1335, near "}"
Execution of ./check_emc_clariion.pl aborted due to compilation errors.

I hope someone can help me. I'm not good at Perl scripting. Thanks
Owner's reply

Hi vinz,
I have corrected the problem you described above in release 2013-03-09. This was caused by older versions of perl and the use of switch statements. I changed these switch statements to if elsif statements to allow plugin to run on older versions of perl. Please email me if you are still having problems getting this to work.

by cccbbb, February 5, 2013
I have been struggling for a while to get decent monitoring of our EMC SAN. This plugin is GREAT!
I now monitor all disks, dirty cache pages, percentage busy, any faults and the SPs... been waiting for this one! A must-have plugin, and if you're using Nagios XI the wizard is great!
by moatib, February 4, 2013
I am using it and it is working fine.
It needs more features (like disk I/O performance, etc.) and then it will be the perfect tool to monitor your EMC CLARiiON.
by davesykes, January 11, 2013
1 of 1 people found this review helpful
I've got this working with an AX4.
The only issues were with the check_sp method: the checks for enclosures need to ignore some parts (Fans and LCC) which return N/A as status. I've added |N on the end of the regex on line 323, and it is working OK now.

Thank you

Dave
Owner's reply

Hi Dave, thanks for reporting. I've fixed this in the new 2013-01-25 version.