check-multipath.pl
File | Description |
---|---|
check-multipath.pl | Version 0.4.12 (reports messages from the multipath call, e.g. "'multipathd' service is not running", as WARNINGs) |
TEST_check-multipath.pl | Checks output against expected results, version 0.4.12 |
The plugin offers many options; you might not need most of them.
The default values aim at analysing a typical standard configuration with 4 paths per LUN (2 SCSI-hosts, 2 SCSI-ids).
The optional parameters --extraconfig and --group were introduced for more complex environments, where single LUNs or groups of LUNs need different check parameters than the rest.
A configuration for a specific LUN via --extraconfig has highest priority and overrides group and global config.
If no --extraconfig entry matches the LUN attributes and a regex defined in --group matches the LUN line, the values of that group are used. The group regexes are checked from left to right and the first match wins. Any part of the LUN line can be used for matching; see the "multipath -l" output and the description below.
Otherwise the global defaults (see other parameters) are used.
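The precedence described above can be sketched as follows. This is a minimal Python illustration, not the plugin's Perl code: the function names `resolve_thresholds` and `path_state`, the data layout, and matching the --extraconfig selector only against the generic LUN name are assumptions made for the example.

```python
import re

def resolve_thresholds(lun_name, lun_line, extraconfig, groups, global_cfg):
    """Return the (low, high) path thresholds that apply to one LUN.

    extraconfig: dict mapping a LUN selector to (low, high)  (hypothetical layout)
    groups:      list of (regex, (low, high)) in command-line order
    global_cfg:  (low, high) as given by --min-paths / --ok-paths
    """
    # 1. A per-LUN entry has the highest priority.
    if lun_name in extraconfig:
        return extraconfig[lun_name]
    # 2. Otherwise the first group regex that matches the LUN line wins.
    for pattern, thresholds in groups:
        if re.search(pattern, lun_line):
            return thresholds
    # 3. Fall back to the global defaults.
    return global_cfg


def path_state(paths, low, high):
    """Fewer paths than low is CRITICAL, fewer than high is WARNING."""
    if paths < low:
        return "CRITICAL"
    if paths < high:
        return "WARNING"
    return "OK"


# Example values taken from the option descriptions on this page.
extra = {"iscsi_lun_01": (2, 2)}
groups = [("IBM,ServeRAID", (1, 1)), ("HAL,ChpRAID", (1, 2))]
low, high = resolve_thresholds("mpatha", "mpatha (xxxx) dm-2 IBM,ServeRAID",
                               extra, groups, (2, 4))
print(path_state(1, low, high))
```

With the default thresholds (2 and 4), a LUN with a single path would be CRITICAL; in the example the group entry lowers both marks to 1, so one path is sufficient.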
The plugin performs an extra check for a running multipathd process. Additional messages from the multipath call should be reported as WARNINGs.
The first version dates back to 2011. Please note that by now I can offer only limited support.
- Hinnerk Rümenapf
NOTE:
(1) Internal drives SHOULD NOT be handled by the multipath driver. They SHOULD be excluded in the multipath configuration. See multipath documentation for details.
(2) Although additional messages from the multipath call should be reported as WARNINGs, the plugin's error message "line not recognised" can still be caused by errors reported by the multipath command (e.g. errors in the multipath configuration). In this case please call "multipath -l" directly on the command line and check the error message (the output line in question is also included in the plugin's error message). If a direct call of "multipath -l" does NOT give an error message, it is most likely a parse error in the plugin; please drop me a line if this occurs.
Options:
-m, --min-paths Low mark, fewer paths per LUN are CRITICAL [2]
-o, --ok-paths High mark, fewer paths per LUN raise WARNING [4]
-n, --no-multipath Exitcode for no LUNs, no multipath driver and multipathd not running [warning]
-M, --mdskip Skip extra check for a running multipathd process (check uses '--no-multipath' returncode)
-a, --addchecks define low/high marks for additional checks
number of policies per LUN "p,LOW,HIGH", DEFAULT: p,0,0
number of scsi-hosts per LUN "sh,LOW,HIGH", DEFAULT: sh,0,0
number of scsi-ids per LUN "si,LOW,HIGH", DEFAULT: si,0,0
e.g. 'p,1,2,sh,1,2'; 'si,1,2,p,1,2,sh,2,4'; 'p,1,2'
See documentation of multipath output. If the HIGH value is 0, the check is skipped.
A typical standard-configuration uses 2 scsi-hosts and 2 scsi-ids, resulting in four paths
representing all possible combinations: h0-i0, h0-i1, h1-i0, h1-i1.
--scsi-all Count all scsi-hosts and scsi-ids, even from paths that report an error state.
-L, --ll use multipath -ll instead of multipath -l
Can give more detailed information, helps to detect failed paths with older versions of multipath tools (RHEL 5, ...)
-r, --reload force devmap reload if status is WARNING or CRITICAL
(multipath -r)
Can help to pick up LUNs coming back to life.
-g, --group Specify regular expression (Perl notation) to identify groups of LUNs with other default-thresholds.
Overrides global config for LUNs with LUN lines that match a group regular expression.
In most cases a simple String should be sufficient. NOTE: special regular expression characters must be escaped! Whitespace is significant.
"LUN_LINE_REGEX,LOW,HIGH[@#,ADDCHECKS]:" for each group with deviant thresholds (see also explanation of --addchecks)
e.g. "IBM,ServeRAID,1,1:HAL,ChpRAID,1,2:" or "IBM,ServeRAID,1,1@#,p,1,2:HAL,ChpRAID,1,2@#,sh,1,2,si,1,2:"
Use command "multipath -l" to see the LUN lines and to identify groups.
-e, --extraconfig Specify different low/high thresholds for LUNs.
Overrides group and global config for the specified LUNs.
optional: specify return code if no data for LUN selector was found
(ok, warning, critical), default is warning
the return code MAY be followed by definitions of additional checks, see explanation of --addchecks above
"LUN-selector,LOW,HIGH[,RETURNCODE[,ADDCHECKS]]:" for each LUN with deviant thresholds
e.g. "iscsi_lun_01,2,2:dummyLun,1,1,ok:paranoid_lun,8,16,critical:"
"oddLun,3,5:"
"paranoidOddLun,5,11,critical,p,3,5,sh,5,9,si,3,7:"
"default,2,4,warning:DonalLunny,6,8,warning,sh,1,4,si,1,4:"
LUN-selector is by default checked against the "generic Name", as used in older plugin versions.
You can specify a prefix to select a LUN attribute as identifier.
Not all attributes may be available, depending on the specific multipath configuration.
Use command "multipath -l" to see the complete LUN lines.
"G!" generic name, as used in older versions. Exists always. Content depends on the specific configuration. DEFAULT
"W!" WWID as reported by the multipath command
"D!" dm Identifier (dm-3 or similar)
"N!" user-friendly name
e.g. 'W!36000d774000045f655ea91cb4ea41d6f,4,8,critical:DonalLunny,6,8:D!dm-3,1,2,warning,sh,1,2,si,1,2:'
NOTE: enclose parameter value in SINGLE-quotes for this notation!
-p, --print List to determine which attribute of the LUN should be printed as identifier in the output
The letters in the list are checked from left to right, the first corresponding attribute that exists is printed.
The letter G is always appended to the list.
Available are:
G: generic name, as used in older versions. Exists always. Content depends on the specific configuration.
W: WWID as reported by the multipath command
D: dm Identifier (dm-3 or similar)
N: user-friendly name
e.g. "DN": print dm-identifier (if present), else user-friendly name (if present), else generic name (as G is always appended to the list)
"WDN": print WWID (if present), else print dm-identifier (if present), else user-friendly name (if present), else generic name (as G is always appended to the list)
-l, --linebreak define end-of-line string:
REG regular UNIX-Newline
HTML br-Tag (HTML-Linebreak)
-other- use specified string as linebreak symbol, e.g. ', ' (all in one line, comma separated)
-s, --state Prefix alerts with alert state
-S, --short-state Prefix alerts with alert state abbreviated
-h, --help Display this help text
-V, --version Display version info
-v, --verbose
-d, --di Run testcase instead of real check [0]
-t, --test Do not display testcase input, just result
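To make the --extraconfig notation above more concrete, here is a minimal Python sketch that splits such a value into its parts. It is an illustration only, not the plugin's parser: the function names, the returned dictionary layout and the error handling are assumptions, and it assumes that selectors contain neither ',' nor ':'.

```python
def parse_addchecks(fields):
    """Turn ['sh', '1', '2', 'si', '1', '2'] into {'sh': (1, 2), 'si': (1, 2)}."""
    if len(fields) % 3 != 0:
        raise ValueError("additional checks must be given as NAME,LOW,HIGH triples")
    checks = {}
    for i in range(0, len(fields), 3):
        name = fields[i]
        if name not in ("p", "sh", "si"):
            raise ValueError(f"unknown additional check: {name!r}")
        checks[name] = (int(fields[i + 1]), int(fields[i + 2]))
    return checks


def parse_extraconfig(value):
    """Parse 'SELECTOR,LOW,HIGH[,RETURNCODE[,ADDCHECKS]]:' entries."""
    entries = []
    for item in value.split(":"):
        if not item:
            continue  # the trailing ':' produces an empty item
        fields = item.split(",")
        if len(fields) < 3:
            raise ValueError(f"incomplete entry: {item!r}")
        selector, attribute = fields[0], "G"  # generic name is the default
        # Optional prefix 'G!', 'W!', 'D!' or 'N!' selects the LUN attribute.
        if len(selector) > 2 and selector[1] == "!" and selector[0] in "GWDN":
            attribute, selector = selector[0], selector[2:]
        returncode = fields[3] if len(fields) > 3 else "warning"
        if returncode not in ("ok", "warning", "critical"):
            raise ValueError(f"invalid return code: {returncode!r}")
        entries.append({
            "attribute": attribute,
            "selector": selector,
            "low": int(fields[1]),
            "high": int(fields[2]),
            "returncode": returncode,
            "addchecks": parse_addchecks(fields[4:]),
        })
    return entries


for entry in parse_extraconfig("D!dm-3,1,2,warning,sh,1,2,si,1,2:DonalLunny,6,8:"):
    print(entry)
```

The same triple notation is used by --addchecks, so `parse_addchecks` could be reused for a value such as 'p,1,2,sh,1,2' after splitting it at the commas.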
The 'sudo' configuration must allow the Nagios user to call the command 'multipath -l' (and/or 'multipath -ll' if you use the --ll option, and also 'multipath -r' if you intend to use the --reload option) *without* a password.
Several testcases are included in the script. They can be called by using the parameter --di with numerical values other than zero.
The additional script 'TEST_check-multipath.pl' can be used for regression tests if you feel like editing the plugin.
This plugin was written by Hinnerk Rümenapf and is based on work by:
- Trond H. Amundsen [t.h.amundsen@usit.uio.no]
- Gunther Schlegel [schlegel@riege.com]
- Matija Nalis [mnalis+debian@carnet.hr]
Thanks to Bernd Zeimetz, Sven Anders, Kai Groshert, Ernest Beinrohr, Sébastien Maury, Benjamin von Mossner, Michal Svamberg, Andreas Steinel, Severin Launiau, Christian Zettel, Jeffrey Honig and Ben Evans for testing and contributions and to Matthew Castanien, Dmitry Sakoun, Robert Towster, Tom Schier, Jim Clark, Ivan Zikyamov, Philip Morales and Ricardo Guijt for their comments.
SOFTWARE IS PROVIDED AS-IS, WITHOUT ANY WARRANTY
#multipath -ll
lun-name (xxxx) dm-6 NETAPP ,LUN C-Mode
size=400G features='3 pg_init_retries 50 retain_attached_hw_handler' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:1:0 sdb 8:16 active ready running
| |- 1:0:2:0 sdd 8:48 active ready running
| |- 2:0:2:0 sde 8:64 active ready running
| `- 2:0:4:0 sdg 8:96 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:5:0 sdf 8:80 active ready running
  |- 1:0:7:0 sdh 8:112 active ready running
  |- 2:0:1:0 sdc 8:32 active ready running
  `- 2:0:5:0 sdi 8:128 active ready running
Apr 08 09:21:04 | multipath device maps are present, but 'multipathd' service is not running
Apr 08 09:21:04 | IO failover/failback will not work without 'multipathd' service running
Up to version 0.4.11 the plugin could not handle these messages from the multipath call. The issue should be fixed from version 0.4.12 onwards; I am currently (2022-04-12) testing a release candidate.
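As an illustration of how output like the above can be separated into LUN data and additional messages, here is a minimal Python sketch. It is not the plugin's parser. The regular expressions and the function name `summarize` are assumptions, the first and third fields of the H:C:T:L address are taken as scsi-host and scsi-id, and path error states are ignored for brevity.

```python
import re

# Path line, e.g. "| |- 1:0:1:0 sdb 8:16 active ready running"
PATH_RE = re.compile(r"\b(\d+):(\d+):(\d+):(\d+)\s+\S+\s+\d+:\d+")
# Timestamped message line, e.g. "Apr 08 09:21:04 | some text"
MESSAGE_RE = re.compile(r"^[A-Z][a-z]{2} \d{1,2} \d{2}:\d{2}:\d{2} \| (.*)$")


def summarize(output):
    """Count paths, policies, scsi-hosts and scsi-ids; collect message lines."""
    paths, policies = 0, 0
    hosts, ids, messages = set(), set(), []
    for line in output.splitlines():
        message = MESSAGE_RE.match(line)
        if message:
            # Candidates for WARNINGs instead of "line not recognised" errors.
            messages.append(message.group(1))
            continue
        if "policy=" in line:
            policies += 1
            continue
        path = PATH_RE.search(line)
        if path:
            paths += 1
            hosts.add(path.group(1))  # H of H:C:T:L
            ids.add(path.group(3))    # T of H:C:T:L
    return {
        "paths": paths,
        "policies": policies,
        "scsi_hosts": sorted(hosts),
        "scsi_ids": sorted(ids),
        "messages": messages,
    }


SAMPLE = """\
lun-name (xxxx) dm-6 NETAPP ,LUN C-Mode
size=400G features='3 pg_init_retries 50 retain_attached_hw_handler' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:1:0 sdb 8:16 active ready running
| `- 2:0:4:0 sdg 8:96 active ready running
Apr 08 09:21:04 | IO failover/failback will not work without 'multipathd' service running
"""
print(summarize(SAMPLE))
```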
device {
    vendor "AIX"
    product "VDASD"
    path_grouping_policy "multibus"
    path_checker "directio"
    features "0"
    hardware_handler "0"
    prio "const"
    failback "immediate"
    rr_weight "uniform"
    no_path_retry 60
}
With the device section above in multipath.conf, your script fails at:
dbssapbwh:/etc # /usr/local/nagios/libexec/check-multipath.pl -m 1 -o 2 -n critical -L -s
ERROR: Line 1 not recognised. Expected path info, new LUN or nested policy:
'Jun 20 13:50:10 | multipath.conf +568, invalid keyword: device' |Host: dbssapbwh|
Without that section it works.
The error is NOT caused by the plugin. The error is in your file /etc/multipath.conf and it is reported by the multipath command. Try calling multipath -l directly, you should see the error message 'multipath.conf +568, invalid keyword: device' with the configuration listed above.
1. The output redirection adds "" extra characters to the end of line. See below example.
# ./check-multipath.pl -s
CRITICAL: LUN mpathb: less than 2 paths (1/4)!
CRITICAL: LUN mpatha: less than 2 paths (1/4)!
# ./check-multipath.pl -s > /tmp/1
# cat /tmp/1
CRITICAL: LUN mpathb: less than 2 paths (1/4)!CRITICAL: LUN mpatha: less than 2 paths (1/4)!root@seieadb93#
#
===========================
2. Can the internal drives (Non SAN) like HP Proliant servers internal SMART array based RAID drives. The multipath -ll command for such internal drive is shown below
# ./check-multipath.pl -s
CRITICAL: LUN mpathb: less than 2 paths (1/4)!
#
# multipath -ll mpathb
mpathb (3600508b1001c3718d6b6ba8fec4cf14b) dm-2 HP,LOGICAL VOLUME
size=137G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 0:0:0:1 sdb 8:16 active ready running
1) I'm not sure about this. The ! character is part of the message. The line-end character (newline by default) can be changed by using the parameter -l or --linebreak.
2) Internal drives SHOULD NOT be handled by the multipath driver. Internal drives SHOULD be excluded in the multipath configuration. See multipath documentation.
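Regarding the line-end handling mentioned in 1), the effect of the --linebreak values can be sketched in Python as follows. This only illustrates the documented behaviour: the function name `join_messages` is hypothetical, and the exact HTML tag the plugin emits is an assumption.

```python
def join_messages(messages, linebreak="REG"):
    """Join alert messages using the separator selected by --linebreak."""
    if linebreak == "REG":
        separator = "\n"      # regular UNIX newline
    elif linebreak == "HTML":
        separator = "<br/>"   # assumption: the plugin's exact tag may differ
    else:
        separator = linebreak  # any other value is used literally, e.g. ', '
    return separator.join(messages)


alerts = [
    "CRITICAL: LUN mpathb: less than 2 paths (1/4)!",
    "CRITICAL: LUN mpatha: less than 2 paths (1/4)!",
]
print(join_messages(alerts, ", "))
```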