Home Directory Plugins Hardware Storage Systems RAID Controllers RAID - Hardware [LSI/3ware/Areca/Adaptec] / Software arrays and ZFS monitoring

Search Exchange

Search All Sites

Nagios Live Webinars

Let our experts show you how Nagios can help your organization.

Contact Us

Phone: 1-888-NAGIOS-1
Email: sales@nagios.com

Login

Remember Me

Directory Tree

RAID - Hardware [LSI/3ware/Areca/Adaptec] / Software arrays and ZFS monitoring

Rating
2 votes
Favoured:
0
Current Version
0.3
Last Release Date
2019-10-15
Compatible With
  • Nagios 3.x
  • Nagios 4.x
  • Nagios XI
Owner
License
GPL
Hits
4542
Network Monitoring Software - Download Nagios XI
Log Management Software - Nagios Log Server - Download
Netflow Analysis Software - Nagios Network Analyzer - Download
I decided to post my own storage monitoring plugin as I found no good alternative, hope you find it useful!

This plugin will monitor health & missing components of:
- Hardware RAID (supported RAID controllers: LSI, 3ware, Areca, Adaptec):
- Controller itself, virtual drives, drive enclosures, physical drives in the RAID setup via S.M.A.R.T. reallocated sectors (0x05) value
- Software RAID via mdadm tool (arrays and drives)
- ZFS pools
- It will also monitor all the drives (via smartctl) detected in non-RAID / non-ZFS (or mixed) enviroments.
For this plugin to work you need to download RAID controller utilities and place them in /opt directory.

You can download the tools from official website or from my github: https://github.com/realasmo/bash/tree/master/nagios-plugins/check-storage/hwraid_utils

- - - - - - - - - - - - - - - - - - - - -

Example output:

Sofware RAID only (drive /dev/sdb in array md0 reports 716 realloc sectors):

[root@strdev3 ~]# ./check-storage.sh
[STORAGE][SWR]::Array:md2:Health: OK (state/failed_dev/removed_dev: clean/0/0):Array:md1:Health: OK (state/failed_dev/removed_dev: clean/0/0):Array:md0:Health: OK (state/failed_dev/removed_dev: clean/0/0):[STORAGE]drv:/dev/sda:Health: OK (realloc: 0):drv:/dev/sdb:Health: CRITICAL (realloc: 716):drv:/dev/sdc:Health: OK (realloc: 1):drv:/dev/sdd:Health: OK (realloc: 0):

Sofware RAID only (array md127 reports 4 removed drives):

[root@strdev5 ~]# ./check-storage.sh
[STORAGE][SWR]::Array:md127:Health: CRITICAL (state/failed_dev/removed_dev: active/0/4):[STORAGE]drv:/dev/sda:Health: OK (realloc: 0):drv:/dev/sdb:Health: OK (realloc: 0):

Hardware RAID (drive p0 reports ECC-ERROR):

[root@strdev2 ~]# ./check-storage.sh
[STORAGE][3Ware]::CTL: c0: Health: OK (NotOpt:0)::Unit: u0: Health: OK (Status: VERIFYING, type/size: RAID-10/1862.62GB)::Drive: p0: Health: CRITICAL (Status/ReallocSect: ECC-ERROR/0, VPort/Size/Type: p0/931.51GBGB/SATA)::Drive: p1: Health: OK (Status/ReallocSect: OK/0, VPort/Size/Type: p1/931.51GBGB/SATA)::Drive: p2: Health: OK (Status/ReallocSect: OK/0, VPort/Size/Type: p2/931.51GBGB/SATA)::Drive: p3: Health: OK (Status/ReallocSect: OK/0, VPort/Size/Type: p3/931.51GBGB/SATA)::[STORAGE]drv:0:Health: OK (realloc: 0):drv:1:Health: OK (realloc: 0):drv:2:Health: OK (realloc: 0):drv:3:Health: OK (realloc: 0):

Checks on unsupported RAID controller will be limited to drives & software arrays if present:

[root@strdev1 ~]# ./check-storage.sh
[HWR]:Found unsupported RAID card :: [STORAGE][SWR]::Array:md124:Health: OK (state/failed_dev/removed_dev: active/0/0):Array:md125:Health: CRITICAL (state/failed_dev/removed_dev: active/0/1):Array:md126:Health: CRITICAL (state/failed_dev/removed_dev: clean/0/1):Array:md127:Health: CRITICAL (state/failed_dev/removed_dev: clean/0/1):[STORAGE]drv:/dev/sdb:Health: OK (realloc: 0):drv:/dev/sdc:Health: OK (realloc: 0):drv:/dev/sdd:Health: OK (realloc: 0):drv:/dev/sde:Health: OK (realloc: 0):drv:/dev/sdf:Health: OK (realloc: 0):drv:/dev/sdg:Health: OK (realloc: 0):drv:/dev/sdh:Health: OK (realloc: 0):drv:/dev/sdi:Health: OK (realloc: 0):drv:/dev/sdj:Health: OK (realloc: 0):drv:/dev/sdk:Health: OK (realloc: 0):drv:/dev/sdl:Health: OK (realloc: 6):drv:/dev/sdm:Health: OK (realloc: 6):drv:/dev/sdn:Health: OK (realloc: 16):drv:/dev/sdo:Health: OK (realloc: 0):drv:/dev/sdp:Health: OK (realloc: 0):

Checks on ZFS pool:

[root@strdev4 ~]# ./check-storage.sh
[ZFS]::Health: CRITICAL (name/size/health: pool1/2.72T/DEGRADED):[STORAGE]drv:/dev/sda:Health: OK (realloc: S_NOATTR):drv:/dev/sdb:Health: OK (realloc: 0):drv:/dev/sdc:Health: OK (realloc: 0):drv:/dev/sdd:Health: OK (realloc: S_NOATTR):drv:/dev/sde:Health: OK (realloc: 0):drv:/dev/sdf:Health: OK (realloc: 0):drv:/dev/sdg:Health: OK (realloc: 0):
Reviews (1)
byWebpods, July 30, 2020
1 of 1 people found this review helpful
This plug-in provides so much useful information and to top it off, it works on main different RAID cards.

The output it provides helps me determine when it's time to replace a failing hard drive or when my software raid needs attention.

Highly recommended plug-in! Give this developer some feedback. Very well coded! Looking forward to more plugins from this developer.