check_iostat - I/O statistics

Rating: 19 votes
Favoured: 1
Hits: 285823
Files: check_iostat (check_iostat v0.0.2)
This plugin shows the I/O usage of the specified disk, using the external iostat program. It prints three statistics: transactions per second (tps), kilobytes per second read from the disk (KB_read/s), and kilobytes per second written to the disk (KB_written/s).
This simple plugin uses iostat to obtain its metrics, parses the output, and uses bc to compare the results with the specified WARNING and CRITICAL levels (since the shell can't compare floating point numbers).
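
To illustrate why bc is involved, here is a minimal sketch of a floating point comparison in bash. The helper name float_ge is made up for this example and does not appear in the plugin, and the awk fallback is an assumption of the sketch for hosts without bc.

```shell
#!/bin/bash
# Illustrative helper (hypothetical name, not part of the plugin).
# Returns success (0) when $1 >= $2, treating both as decimal numbers.
float_ge() {
    if command -v bc >/dev/null 2>&1; then
        # bc prints 1 for a true relational expression and 0 for a false one.
        [ "$(echo "$1 >= $2" | bc)" = "1" ]
    else
        # Fallback (an assumption of this sketch): let awk do the comparison.
        awk -v a="$1" -v b="$2" 'BEGIN { exit !(a >= b) }'
    fi
}

# [ 12.5 -ge 12.4 ] would fail with "integer expression expected",
# so the comparison is routed through the helper instead.
if float_ge 12.5 12.4; then
    echo "WARNING"
else
    echo "OK"
fi
```

The plugin itself inlines the bc call in each test rather than wrapping it in a function; the wrapper is used here only to keep the sketch short.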

Feedback and suggestions are appreciated =)
Reviews (15)
I've added the warning/critical thresholds to the performance data.

#!/bin/bash
#----------check_iostat.sh-----------
#
# Version 0.0.2 - Jan/2009
# Changes: added device verification
#
# by Thiago Varela - thiago@iplenix.com
#
# Version 0.0.3 - Dec/2011
# Changes:
# - changed values from bytes to mbytes
# - fixed bug to get traffic data without comma but point
# - current values are displayed now, not average values (first run of iostat)
#
# by Philipp Niedziela - pn@pn-it.com
#
# Version 0.0.4 - April/2014
# Changes:
# - Allow Empty warn/crit levels
# - Can check I/O, WAIT Time, or Queue
#
# by Warren Turner
#
# Version 0.0.5 - Jun/2014
# Changes:
# - removed -y flag from call since iostat doesn't know about it any more (June 2014)
# - only needed executions of iostat are done now (save cpu time whenever you can)
# - fixed the obvious problems of missing input values (probably because of the now unimplemented "-y") with -x values
# - made performance data optional (I like to have choice in the matter)
#
# by Frederic Krueger / fkrueger-dev-checkiostat@holics.at
#
# Version 0.0.6 - Jul/2014
# Changes:
# - Cleaned up argument checking, removed excess iostat calls, streamlined if statements and renamed variables to fit current use
# - Fixed all inputs to match current iostat output (Ubuntu 12.04)
# - Changed to take last ten seconds as default (more useful for nagios usage). Will go to "since last reboot" (previous behaviour) on -g flag.
# - added extra comments/whitespace etc to aid readability
#
# by Ben Field / ben.field@concreteplatform.com
#
# Version 0.0.7 - Sep/2014
# Changes:
# - Fixed performance data for Wait check
#
# by Christian Westergard / christian.westergard@gmail.com
#
# Version 0.0.8 - Jan/2019
# Changes:
# - Added Warn/Crit thresholds to performance output
#
# by Danny van Zunderd / danny_vz@live.nl

iostat=`which iostat 2>/dev/null`
bc=`which bc 2>/dev/null`

function help {
echo -e "
Usage:

-d =
--Device to be checked. Example: \"-d sda\"

Run only one of i, q, W:

-i = IO Check Mode
--Checks Total Transfers/sec, Read IO/Sec, Write IO/Sec, Bytes Read/Sec, Bytes Written/Sec
--warning/critical = Total Transfers/sec,Read IO/Sec,Write IO/Sec,Bytes Read/Sec,Bytes Written/Sec

-q = Queue Mode
--Checks Disk Queue Lengths
--warning/critical = Average size of requests, Queue length of requests

-W = Wait Time Mode
--Check the time for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
--warning/critical = Avg I/O Wait Time (ms), Avg Read Wait Time (ms), Avg Write Wait Time (ms), Avg Service Wait Time (ms), Avg CPU Utilization

-w,-c = pass warning and critical levels respectively. These are not required, but without them, all queries will return as OK.

-p = Provide performance data for later graphing

-g = Since last reboot for system (more for debugging than nagios use!)

-h = This help
"
exit -1
}

# Ensuring we have the needed tools:
if [ ! -x "$iostat" ] || [ ! -x "$bc" ]; then
echo -e "ERROR: You must have iostat and bc installed in order to run this plugin\n\tuse: apt-get install sysstat bc\n"
# 3 is the UNKNOWN state in the Nagios plugin return-code convention
exit 3
fi

io=0
queue=0
waittime=0
printperfdata=0
STATE="OK"
samples=2
status=0

MSG=""
PERFDATA=""

#------------Argument Set-------------

while getopts "d:w:c:ipqWhg" OPT; do
case $OPT in
"d") disk=$OPTARG;;
"w") warning=$OPTARG;;
"c") critical=$OPTARG;;
"i") io=1;;
"p") printperfdata=1;;
"q") queue=1;;
"W") waittime=1;;
"g") samples=1;;
"h") echo "help:" && help;;
\?) echo "Invalid option: -$OPTARG" >&2
exit -1
;;
esac
done

# Autofill if parameters are empty
if [ -z "$disk" ]
then disk=sda
fi

#Checks that only one query type is run
[[ `expr $io+$queue+$waittime` -ne "1" ]] && \
echo "ERROR: select one and only one run mode" && help

#set warning and critical to an insane value if empty, else set the individual values
if [ -z "$warning" ]
then
warning=99999
else
#TPS with IO, Request size with queue
warn_1=`echo $warning | cut -d, -f1`
#Read/s with IO,Queue Length with queue
warn_2=`echo $warning | cut -d, -f2`
#Write/s with IO
warn_3=`echo $warning | cut -d, -f3`
#KB/s read with IO
warn_4=`echo $warning | cut -d, -f4`
#KB/s written with IO
warn_5=`echo $warning | cut -d, -f5`
#Crude hack due to integer expression later in the script
warning=1
fi

if [ -z "$critical" ]
then
critical=99999
else
#TPS with IO, Request size with queue
crit_1=`echo $critical | cut -d, -f1`
#Read/s with IO,Queue Length with queue
crit_2=`echo $critical | cut -d, -f2`
#Write/s with IO
crit_3=`echo $critical | cut -d, -f3`
#KB/s read with IO
crit_4=`echo $critical | cut -d, -f4`
#KB/s written with IO
crit_5=`echo $critical | cut -d, -f5`
#Crude hack due to integer expression later in the script
critical=1
fi

#------------Argument Set End-------------

#------------Parameter Check-------------

#Checks for sane Disk name:
[ ! -b "/dev/$disk" ] && echo "ERROR: Device incorrectly specified" && help

#Checks for sane warning/critical levels
if ( [[ $warning -ne "99999" ]] || [[ $critical -ne "99999" ]] ); then
if ( [[ "$warn_1" -gt "$crit_1" ]] || [[ "$warn_2" -gt "$crit_2" ]] ); then
echo "ERROR: critical levels must be higher than warning levels" && help
elif ( [[ $io -eq "1" ]] || [[ $waittime -eq "1" ]] ); then
if ( [[ "$warn_3" -gt "$crit_3" ]] || [[ "$warn_4" -gt "$crit_4" ]] || [[ "$warn_5" -gt "$crit_5" ]] ); then
echo "ERROR: critical levels must be higher than warning levels" && help
fi
fi
fi

#------------Parameter Check End-------------

# iostat parameters:
# -m: megabytes
# -k: kilobytes
# first run of iostat shows statistics since last reboot, second one shows current values of hdd
# -d selects the device report, -x adds extended statistics; 10 is the interval in seconds

TMPX=`$iostat $disk -x -k -d 10 $samples | grep $disk | tail -1`

#------------IO Test-------------

if [ "$io" == "1" ]; then

TMPD=`$iostat $disk -k -d 10 $samples | grep $disk | tail -1`
#Requests per second:
tps=`echo "$TMPD" | awk '{print $2}'`
read_sec=`echo "$TMPX" | awk '{print $4}'`
written_sec=`echo "$TMPX" | awk '{print $5}'`

#Kb per second:
kbytes_read_sec=`echo "$TMPX" | awk '{print $6}'`
kbytes_written_sec=`echo "$TMPX" | awk '{print $7}'`

# "Converting" values to float (string replace , with .)
tps=${tps/,/.}
read_sec=${read_sec/,/.}
written_sec=${written_sec/,/.}
kbytes_read_sec=${kbytes_read_sec/,/.}
kbytes_written_sec=${kbytes_written_sec/,/.}

# Comparing the result and setting the correct level:
if [ "$warning" -ne "99999" ]; then
if ( [ "`echo "$tps >= $warn_1" | bc`" == "1" ] || [ "`echo "$read_sec >= $warn_2" | bc`" == "1" ] || \
[ "`echo "$written_sec >= $warn_3" | bc`" == "1" ] || [ "`echo "$kbytes_read_sec >= $warn_4" | bc -q`" == "1" ] ||
[ "`echo "$kbytes_written_sec >= $warn_5" | bc`" == "1" ] ); then
STATE="WARNING"
status=1
fi
fi
if [ "$critical" -ne "99999" ]; then
if ( [ "`echo "$tps >= $crit_1" | bc`" == "1" ] || [ "`echo "$read_sec >= $crit_2" | bc -q`" == "1" ] || \
[ "`echo "$written_sec >= $crit_3" | bc`" == "1" ] || [ "`echo "$kbytes_read_sec >= $crit_4" | bc -q`" == "1" ] || \
[ "`echo "$kbytes_written_sec >= $crit_5" | bc`" == "1" ] ); then
STATE="CRITICAL"
status=2
fi
fi
# Printing the results:
MSG="$STATE - I/O stats: Transfers/Sec=$tps Read Requests/Sec=$read_sec Write Requests/Sec=$written_sec KBytes Read/Sec=$kbytes_read_sec KBytes_Written/Sec=$kbytes_written_sec"
PERFDATA=" | total_io_sec=$tps;$warn_1;$crit_1; read_io_sec=$read_sec;$warn_2;$crit_2; write_io_sec=$written_sec;$warn_3;$crit_3; kbytes_read_sec=$kbytes_read_sec;$warn_4;$crit_4; kbytes_written_sec=$kbytes_written_sec;$warn_5;$crit_5;"
fi

#------------IO Test End-------------

#------------Queue Test-------------
if [ "$queue" == "1" ]; then
qsize=`echo "$TMPX" | awk '{print $8}'`
qlength=`echo "$TMPX" | awk '{print $9}'`

# "Converting" values to float (string replace , with .)
qsize=${qsize/,/.}
qlength=${qlength/,/.}

# Comparing the result and setting the correct level:
if [ "$warning" -ne "99999" ]; then
if ( [ "`echo "$qsize >= $warn_1" | bc`" == "1" ] || [ "`echo "$qlength >= $warn_2" | bc`" == "1" ] ); then
STATE="WARNING"
status=1
fi
fi
if [ "$critical" -ne "99999" ]; then
if ( [ "`echo "$qsize >= $crit_1" | bc`" == "1" ] || [ "`echo "$qlength >= $crit_2" | bc`" == "1" ] ); then
STATE="CRITICAL"
status=2
fi
fi

# Printing the results:
MSG="$STATE - Disk Queue Stats: Average Request Size=$qsize Average Queue Length=$qlength"
PERFDATA=" | qsize=$qsize;$warn_1;$crit_1; queue_length=$qlength;$warn_2;$crit_2;"
fi

#------------Queue Test End-------------

#------------Wait Time Test-------------

#Parse values. Warning - svc time will soon be deprecated and these will need to be changed. Future parser could look at first line (labels) to suggest correct column to return
if [ "$waittime" == "1" ]; then
avgwait=`echo "$TMPX" | awk '{print $10}'`
avgrwait=`echo "$TMPX" | awk '{print $11}'`
avgwwait=`echo "$TMPX" | awk '{print $12}'`
avgsvctime=`echo "$TMPX" | awk '{print $13}'`
avgcpuutil=`echo "$TMPX" | awk '{print $14}'`

# "Converting" values to float (string replace , with .)
avgwait=${avgwait/,/.}
avgrwait=${avgrwait/,/.}
avgwwait=${avgwwait/,/.}
avgsvctime=${avgsvctime/,/.}
avgcpuutil=${avgcpuutil/,/.}

# Comparing the result and setting the correct level:
if [ "$warning" -ne "99999" ]; then
if ( [ "`echo "$avgwait >= $warn_1" | bc`" == "1" ] || [ "`echo "$avgrwait >= $warn_2" | bc -q`" == "1" ] || \
[ "`echo "$avgwwait >= $warn_3" | bc`" == "1" ] || [ "`echo "$avgsvctime >= $warn_4" | bc -q`" == "1" ] || \
[ "`echo "$avgcpuutil >= $warn_5" | bc`" == "1" ] ); then
STATE="WARNING"
status=1
fi
fi
if [ "$critical" -ne "99999" ]; then
if ( [ "`echo "$avgwait >= $crit_1" | bc`" == "1" ] || [ "`echo "$avgrwait >= $crit_2" | bc -q`" == "1" ] || \
[ "`echo "$avgwwait >= $crit_3" | bc`" == "1" ] || [ "`echo "$avgsvctime >= $crit_4" | bc -q`" == "1" ] || \
[ "`echo "$avgcpuutil >= $crit_5" | bc`" == "1" ] ); then
STATE="CRITICAL"
status=2
fi
fi

# Printing the results:
MSG="$STATE - Wait Time Stats: Avg I/O Wait Time (ms)=$avgwait Avg Read Wait Time (ms)=$avgrwait Avg Write Wait Time (ms)=$avgwwait Avg Service Wait Time (ms)=$avgsvctime Avg CPU Utilization=$avgcpuutil"
PERFDATA=" | avg_io_waittime_ms=$avgwait;$warn_1;$crit_1; avg_r_waittime_ms=$avgrwait;$warn_2;$crit_2; avg_w_waittime_ms=$avgwwait;$warn_3;$crit_3; avg_service_waittime_ms=$avgsvctime;$warn_4;$crit_4; avg_cpu_utilization=$avgcpuutil;$warn_5;$crit_5;"
fi

#------------Wait Time End-------------

# now output the official result
echo -n "$MSG"
if [ "x$printperfdata" == "x1" ]; then echo -n "$PERFDATA"; fi
echo ""
exit $status
#----------/check_iostat.sh-----------
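
For readers who have not seen thresholds in performance data before, the idea can be sketched as follows. Each item takes the form label=value;warn;crit, and a threshold that was not supplied is left empty. The helper name perf_item and the sample numbers are invented for this illustration and are not part of the script above.

```shell
#!/bin/bash
# Hypothetical helper: emit one perfdata item as label=value;warn;crit
# Empty warn/crit arguments leave the corresponding field blank.
perf_item() {
    local label=$1 value=$2 warn=$3 crit=$4
    printf '%s=%s;%s;%s' "$label" "$value" "$warn" "$crit"
}

# Sample values, made up for illustration only.
tps=12.40
warn_1=100
crit_1=200

PERFDATA=" | $(perf_item total_io_sec "$tps" "$warn_1" "$crit_1")"
PERFDATA="$PERFDATA $(perf_item read_io_sec 3.10 "" "")"
echo "OK - I/O stats: Transfers/Sec=$tps$PERFDATA"
```

Graphing tools that read the perfdata can then draw the warning and critical levels alongside the measured value.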
by savv3, September 24, 2014
Fixed performance data for Wait check. Wasn't displaying any data.

#!/bin/bash
#----------check_iostat.sh-----------
#
# Version 0.0.2 - Jan/2009
# Changes: added device verification
#
# by Thiago Varela - thiago@iplenix.com
#
# Version 0.0.3 - Dec/2011
# Changes:
# - changed values from bytes to mbytes
# - fixed bug to get traffic data without comma but point
# - current values are displayed now, not average values (first run of iostat)
#
# by Philipp Niedziela - pn@pn-it.com
#
# Version 0.0.4 - April/2014
# Changes:
# - Allow Empty warn/crit levels
# - Can check I/O, WAIT Time, or Queue
#
# by Warren Turner
#
# Version 0.0.5 - Jun/2014
# Changes:
# - removed -y flag from call since iostat doesn't know about it any more (June 2014)
# - only needed executions of iostat are done now (save cpu time whenever you can)
# - fixed the obvious problems of missing input values (probably because of the now unimplemented "-y") with -x values
# - made performance data optional (I like to have choice in the matter)
#
# by Frederic Krueger / fkrueger-dev-checkiostat@holics.at
#
# Version 0.0.6 - Jul/2014
# Changes:
# - Cleaned up argument checking, removed excess iostat calls, streamlined if statements and renamed variables to fit current use
# - Fixed all inputs to match current iostat output (Ubuntu 12.04)
# - Changed to take last ten seconds as default (more useful for nagios usage). Will go to "since last reboot" (previous behaviour) on -g flag.
# - added extra comments/whitespace etc to aid readability
#
# by Ben Field / ben.field@concreteplatform.com
#
# Version 0.0.7 - Sep/2014
# Changes:
# - Fixed performance data for Wait check
#
# by Christian Westergard / christian.westergard@gmail.com
#


iostat=`which iostat 2>/dev/null`
bc=`which bc 2>/dev/null`

function help {
echo -e "
Usage:

-d =
--Device to be checked. Example: \"-d sda\"

Run only one of i, q, W:

-i = IO Check Mode
--Checks Total Transfers/sec, Read IO/Sec, Write IO/Sec, Bytes Read/Sec, Bytes Written/Sec
--warning/critical = Total Transfers/sec,Read IO/Sec,Write IO/Sec,Bytes Read/Sec,Bytes Written/Sec

-q = Queue Mode
--Checks Disk Queue Lengths
--warning/critical = Average size of requests, Queue length of requests

-W = Wait Time Mode
--Check the time for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
--warning/critical = Avg I/O Wait Time (ms), Avg Read Wait Time (ms), Avg Write Wait Time (ms), Avg Service Wait Time (ms), Avg CPU Utilization

-w,-c = pass warning and critical levels respectively. These are not required, but without them, all queries will return as OK.

-p = Provide performance data for later graphing

-g = Since last reboot for system (more for debugging than nagios use!)

-h = This help
"
exit -1
}

# Ensuring we have the needed tools:
if [ ! -x "$iostat" ] || [ ! -x "$bc" ]; then
echo -e "ERROR: You must have iostat and bc installed in order to run this plugin\n\tuse: apt-get install sysstat bc\n"
# 3 is the UNKNOWN state in the Nagios plugin return-code convention
exit 3
fi

io=0
queue=0
waittime=0
printperfdata=0
STATE="OK"
samples=2
status=0

MSG=""
PERFDATA=""

#------------Argument Set-------------

while getopts "d:w:c:ipqWhg" OPT; do
case $OPT in
"d") disk=$OPTARG;;
"w") warning=$OPTARG;;
"c") critical=$OPTARG;;
"i") io=1;;
"p") printperfdata=1;;
"q") queue=1;;
"W") waittime=1;;
"g") samples=1;;
"h") echo "help:" && help;;
\?) echo "Invalid option: -$OPTARG" >&2
exit -1
;;
esac
done

# Autofill if parameters are empty
if [ -z "$disk" ]
then disk=sda
fi

#Checks that only one query type is run
[[ `expr $io+$queue+$waittime` -ne "1" ]] && \
echo "ERROR: select one and only one run mode" && help

#set warning and critical to an insane value if empty, else set the individual values
if [ -z "$warning" ]
then warning=99999
else
#TPS with IO, Request size with queue
warn_1=`echo $warning | cut -d, -f1`
#Read/s with IO,Queue Length with queue
warn_2=`echo $warning | cut -d, -f2`
#Write/s with IO
warn_3=`echo $warning | cut -d, -f3`
#KB/s read with IO
warn_4=`echo $warning | cut -d, -f4`
#KB/s written with IO
warn_5=`echo $warning | cut -d, -f5`
#Crude hack due to integer expression later in the script
warning=1
fi

if [ -z "$critical" ]
then critical=99999
else
#TPS with IO, Request size with queue
crit_1=`echo $critical | cut -d, -f1`
#Read/s with IO,Queue Length with queue
crit_2=`echo $critical | cut -d, -f2`
#Write/s with IO
crit_3=`echo $critical | cut -d, -f3`
#KB/s read with IO
crit_4=`echo $critical | cut -d, -f4`
#KB/s written with IO
crit_5=`echo $critical | cut -d, -f5`
#Crude hack due to integer expression later in the script
critical=1
fi

#------------Argument Set End-------------

#------------Parameter Check-------------

#Checks for sane Disk name:
[ ! -b "/dev/$disk" ] && echo "ERROR: Device incorrectly specified" && help

#Checks for sane warning/critical levels
if ( [[ $warning -ne "99999" ]] || [[ $critical -ne "99999" ]] ); then
if ( [[ "$warn_1" -gt "$crit_1" ]] || [[ "$warn_2" -gt "$crit_2" ]] ); then
echo "ERROR: critical levels must be higher than warning levels" && help
elif ( [[ $io -eq "1" ]] || [[ $waittime -eq "1" ]] ); then
if ( [[ "$warn_3" -gt "$crit_3" ]] || [[ "$warn_4" -gt "$crit_4" ]] || [[ "$warn_5" -gt "$crit_5" ]] ); then
echo "ERROR: critical levels must be higher than warning levels" && help
fi
fi
fi

#------------Parameter Check End-------------

# iostat parameters:
# -m: megabytes
# -k: kilobytes
# first run of iostat shows statistics since last reboot, second one shows current values of hdd
# -d selects the device report, -x adds extended statistics; 10 is the interval in seconds

TMPX=`$iostat $disk -x -k -d 10 $samples | grep $disk | tail -1`

#------------IO Test-------------

if [ "$io" == "1" ]; then

TMPD=`$iostat $disk -k -d 10 $samples | grep $disk | tail -1`
#Requests per second:
tps=`echo "$TMPD" | awk '{print $2}'`
read_sec=`echo "$TMPX" | awk '{print $4}'`
written_sec=`echo "$TMPX" | awk '{print $5}'`

#Kb per second:
kbytes_read_sec=`echo "$TMPX" | awk '{print $6}'`
kbytes_written_sec=`echo "$TMPX" | awk '{print $7}'`

# "Converting" values to float (string replace , with .)
tps=${tps/,/.}
read_sec=${read_sec/,/.}
written_sec=${written_sec/,/.}
kbytes_read_sec=${kbytes_read_sec/,/.}
kbytes_written_sec=${kbytes_written_sec/,/.}

# Comparing the result and setting the correct level:
if [ "$warning" -ne "99999" ]; then
if ( [ "`echo "$tps >= $warn_1" | bc`" == "1" ] || [ "`echo "$read_sec >= $warn_2" | bc`" == "1" ] || \
[ "`echo "$written_sec >= $warn_3" | bc`" == "1" ] || [ "`echo "$kbytes_read_sec >= $warn_4" | bc -q`" == "1" ] ||
[ "`echo "$kbytes_written_sec >= $warn_5" | bc`" == "1" ] ); then
STATE="WARNING"
status=1
fi
fi
if [ "$critical" -ne "99999" ]; then
if ( [ "`echo "$tps >= $crit_1" | bc`" == "1" ] || [ "`echo "$read_sec >= $crit_2" | bc -q`" == "1" ] || \
[ "`echo "$written_sec >= $crit_3" | bc`" == "1" ] || [ "`echo "$kbytes_read_sec >= $crit_4" | bc -q`" == "1" ] || \
[ "`echo "$kbytes_written_sec >= $crit_5" | bc`" == "1" ] ); then
STATE="CRITICAL"
status=2
fi
fi
# Printing the results:
MSG="$STATE - I/O stats: Transfers/Sec=$tps Read Requests/Sec=$read_sec Write Requests/Sec=$written_sec KBytes Read/Sec=$kbytes_read_sec KBytes_Written/Sec=$kbytes_written_sec"
PERFDATA=" | total_io_sec=$tps; read_io_sec=$read_sec; write_io_sec=$written_sec; kbytes_read_sec=$kbytes_read_sec; kbytes_written_sec=$kbytes_written_sec;"
fi

#------------IO Test End-------------

#------------Queue Test-------------
if [ "$queue" == "1" ]; then
qsize=`echo "$TMPX" | awk '{print $8}'`
qlength=`echo "$TMPX" | awk '{print $9}'`

# "Converting" values to float (string replace , with .)
qsize=${qsize/,/.}
qlength=${qlength/,/.}

# Comparing the result and setting the correct level:
if [ "$warning" -ne "99999" ]; then
if ( [ "`echo "$qsize >= $warn_1" | bc`" == "1" ] || [ "`echo "$qlength >= $warn_2" | bc`" == "1" ] ); then
STATE="WARNING"
status=1
fi
fi
if [ "$critical" -ne "99999" ]; then
if ( [ "`echo "$qsize >= $crit_1" | bc`" == "1" ] || [ "`echo "$qlength >= $crit_2" | bc`" == "1" ] ); then
STATE="CRITICAL"
status=2
fi
fi


# Printing the results:
MSG="$STATE - Disk Queue Stats: Average Request Size=$qsize Average Queue Length=$qlength"
PERFDATA=" | qsize=$qsize; queue_length=$qlength;"
fi

#------------Queue Test End-------------

#------------Wait Time Test-------------

#Parse values. Warning - svc time will soon be deprecated and these will need to be changed. Future parser could look at first line (labels) to suggest correct column to return
if [ "$waittime" == "1" ]; then
avgwait=`echo "$TMPX" | awk '{print $10}'`
avgrwait=`echo "$TMPX" | awk '{print $11}'`
avgwwait=`echo "$TMPX" | awk '{print $12}'`
avgsvctime=`echo "$TMPX" | awk '{print $13}'`
avgcpuutil=`echo "$TMPX" | awk '{print $14}'`

# "Converting" values to float (string replace , with .)
avgwait=${avgwait/,/.}
avgrwait=${avgrwait/,/.}
avgwwait=${avgwwait/,/.}
avgsvctime=${avgsvctime/,/.}
avgcpuutil=${avgcpuutil/,/.}

# Comparing the result and setting the correct level:
if [ "$warning" -ne "99999" ]; then
if ( [ "`echo "$avgwait >= $warn_1" | bc`" == "1" ] || [ "`echo "$avgrwait >= $warn_2" | bc -q`" == "1" ] || \
[ "`echo "$avgwwait >= $warn_3" | bc`" == "1" ] || [ "`echo "$avgsvctime >= $warn_4" | bc -q`" == "1" ] || \
[ "`echo "$avgcpuutil >= $warn_5" | bc`" == "1" ] ); then
STATE="WARNING"
status=1
fi
fi
if [ "$critical" -ne "99999" ]; then
if ( [ "`echo "$avgwait >= $crit_1" | bc`" == "1" ] || [ "`echo "$avgrwait >= $crit_2" | bc -q`" == "1" ] || \
[ "`echo "$avgwwait >= $crit_3" | bc`" == "1" ] || [ "`echo "$avgsvctime >= $crit_4" | bc -q`" == "1" ] || \
[ "`echo "$avgcpuutil >= $crit_5" | bc`" == "1" ] ); then
STATE="CRITICAL"
status=2
fi
fi

# Printing the results:
MSG="$STATE - Wait Time Stats: Avg I/O Wait Time (ms)=$avgwait Avg Read Wait Time (ms)=$avgrwait Avg Write Wait Time (ms)=$avgwwait Avg Service Wait Time (ms)=$avgsvctime Avg CPU Utilization=$avgcpuutil"
PERFDATA=" | avg_io_waittime_ms=$avgwait; avg_r_waittime_ms=$avgrwait; avg_w_waittime_ms=$avgwwait; avg_service_waittime_ms=$avgsvctime; avg_cpu_utilization=$avgcpuutil;"
fi

#------------Wait Time End-------------

# now output the official result
echo -n "$MSG"
if [ "x$printperfdata" == "x1" ]; then echo -n "$PERFDATA"; fi
echo ""
exit $status
#----------/check_iostat.sh-----------
I have changed the script to work with the above system and cleaned it up a fair amount. Someone might want to have a look at parsing the inputs using the column names rather than column numbers in the future:

#!/bin/bash
#----------check_iostat.sh-----------
#
# Version 0.0.2 - Jan/2009
# Changes: added device verification
#
# by Thiago Varela - thiago@iplenix.com
#
# Version 0.0.3 - Dec/2011
# Changes:
# - changed values from bytes to mbytes
# - fixed bug to get traffic data without comma but point
# - current values are displayed now, not average values (first run of iostat)
#
# by Philipp Niedziela - pn@pn-it.com
#
# Version 0.0.4 - April/2014
# Changes:
# - Allow Empty warn/crit levels
# - Can check I/O, WAIT Time, or Queue
#
# by Warren Turner
#
# Version 0.0.5 - Jun/2014
# Changes:
# - removed -y flag from call since iostat doesn't know about it any more (June 2014)
# - only needed executions of iostat are done now (save cpu time whenever you can)
# - fixed the obvious problems of missing input values (probably because of the now unimplemented "-y") with -x values
# - made performance data optional (I like to have choice in the matter)
#
# by Frederic Krueger / fkrueger-dev-checkiostat@holics.at
#
# Version 0.0.6 - Jul/2014
# Changes:
# - Cleaned up argument checking, removed excess iostat calls, streamlined if statements and renamed variables to fit current use
# - Fixed all inputs to match current iostat output (Ubuntu 12.04)
# - Changed to take last ten seconds as default (more useful for nagios usage). Will go to "since last reboot" (previous behaviour) on -g flag.
# - added extra comments/whitespace etc to aid readability
#
# by Ben Field / ben.field@concreteplatform.com

iostat=`which iostat 2>/dev/null`
bc=`which bc 2>/dev/null`

function help {
echo -e "
Usage:

-d =
--Device to be checked. Example: \"-d sda\"

Run only one of i, q, W:

-i = IO Check Mode
--Checks Total Transfers/sec, Read IO/Sec, Write IO/Sec, Bytes Read/Sec, Bytes Written/Sec
--warning/critical = Total Transfers/sec,Read IO/Sec,Write IO/Sec,Bytes Read/Sec,Bytes Written/Sec

-q = Queue Mode
--Checks Disk Queue Lengths
--warning/critical = Average size of requests, Queue length of requests

-W = Wait Time Mode
--Check the time for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
--warning/critical = Avg I/O Wait Time (ms), Avg Read Wait Time (ms), Avg Write Wait Time (ms), Avg Service Wait Time (ms), Avg CPU Utilization

-w,-c = pass warning and critical levels respectively. These are not required, but without them, all queries will return as OK.

-p = Provide performance data for later graphing

-g = Since last reboot for system (more for debugging than nagios use!)

-h = This help
"
exit -1
}

# Ensuring we have the needed tools:
if [ ! -x "$iostat" ] || [ ! -x "$bc" ]; then
echo -e "ERROR: You must have iostat and bc installed in order to run this plugin\n\tuse: apt-get install sysstat bc\n"
# 3 is the UNKNOWN state in the Nagios plugin return-code convention
exit 3
fi

io=0
queue=0
waittime=0
printperfdata=0
STATE="OK"
samples=2
status=0

MSG=""
PERFDATA=""

#------------Argument Set-------------

while getopts "d:w:c:ipqWhg" OPT; do
case $OPT in
"d") disk=$OPTARG;;
"w") warning=$OPTARG;;
"c") critical=$OPTARG;;
"i") io=1;;
"p") printperfdata=1;;
"q") queue=1;;
"W") waittime=1;;
"g") samples=1;;
"h") echo "help:" && help;;
\?) echo "Invalid option: -$OPTARG" >&2
exit -1
;;
esac
done

# Autofill if parameters are empty
if [ -z "$disk" ]
then disk=sda
fi

#Checks that only one query type is run
[[ `expr $io+$queue+$waittime` -ne "1" ]] && \
echo "ERROR: select one and only one run mode" && help

#set warning and critical to an insane value if empty, else set the individual values
if [ -z "$warning" ]
then warning=99999
else
#TPS with IO, Request size with queue
warn_1=`echo $warning | cut -d, -f1`
#Read/s with IO,Queue Length with queue
warn_2=`echo $warning | cut -d, -f2`
#Write/s with IO
warn_3=`echo $warning | cut -d, -f3`
#KB/s read with IO
warn_4=`echo $warning | cut -d, -f4`
#KB/s written with IO
warn_5=`echo $warning | cut -d, -f5`
#Crude hack due to integer expression later in the script
warning=1
fi

if [ -z "$critical" ]
then critical=99999
else
#TPS with IO, Request size with queue
crit_1=`echo $critical | cut -d, -f1`
#Read/s with IO,Queue Length with queue
crit_2=`echo $critical | cut -d, -f2`
#Write/s with IO
crit_3=`echo $critical | cut -d, -f3`
#KB/s read with IO
crit_4=`echo $critical | cut -d, -f4`
#KB/s written with IO
crit_5=`echo $critical | cut -d, -f5`
#Crude hack due to integer expression later in the script
critical=1
fi

#------------Argument Set End-------------

#------------Parameter Check-------------

#Checks for sane Disk name:
[ ! -b "/dev/$disk" ] && echo "ERROR: Device incorrectly specified" && help

#Checks for sane warning/critical levels
if ( [[ $warning -ne "99999" ]] || [[ $critical -ne "99999" ]] ); then
if ( [[ "$warn_1" -gt "$crit_1" ]] || [[ "$warn_2" -gt "$crit_2" ]] ); then
echo "ERROR: critical levels must be higher than warning levels" && help
elif ( [[ $io -eq "1" ]] || [[ $waittime -eq "1" ]] ); then
if ( [[ "$warn_3" -gt "$crit_3" ]] || [[ "$warn_4" -gt "$crit_4" ]] || [[ "$warn_5" -gt "$crit_5" ]] ); then
echo "ERROR: critical levels must be higher than warning levels" && help
fi
fi
fi

#------------Parameter Check End-------------

# iostat parameters:
# -m: megabytes
# -k: kilobytes
# first run of iostat shows statistics since last reboot, second one shows current values of hdd
# -d selects the device report, -x adds extended statistics; 10 is the interval in seconds

TMPX=`$iostat $disk -x -k -d 10 $samples | grep $disk | tail -1`

#------------IO Test-------------

if [ "$io" == "1" ]; then

TMPD=`$iostat $disk -k -d 10 $samples | grep $disk | tail -1`
#Requests per second:
tps=`echo "$TMPD" | awk '{print $2}'`
read_sec=`echo "$TMPX" | awk '{print $4}'`
written_sec=`echo "$TMPX" | awk '{print $5}'`

#Kb per second:
kbytes_read_sec=`echo "$TMPX" | awk '{print $6}'`
kbytes_written_sec=`echo "$TMPX" | awk '{print $7}'`

# "Converting" values to float (string replace , with .)
tps=${tps/,/.}
read_sec=${read_sec/,/.}
written_sec=${written_sec/,/.}
kbytes_read_sec=${kbytes_read_sec/,/.}
kbytes_written_sec=${kbytes_written_sec/,/.}

# Comparing the result and setting the correct level:
if [ "$warning" -ne "99999" ]; then
if ( [ "`echo "$tps >= $warn_1" | bc`" == "1" ] || [ "`echo "$read_sec >= $warn_2" | bc`" == "1" ] || \
[ "`echo "$written_sec >= $warn_3" | bc`" == "1" ] || [ "`echo "$kbytes_read_sec >= $warn_4" | bc -q`" == "1" ] ||
[ "`echo "$kbytes_written_sec >= $warn_5" | bc`" == "1" ] ); then
STATE="WARNING"
status=1
fi
fi
if [ "$critical" -ne "99999" ]; then
if ( [ "`echo "$tps >= $crit_1" | bc`" == "1" ] || [ "`echo "$read_sec >= $crit_2" | bc -q`" == "1" ] || \
[ "`echo "$written_sec >= $crit_3" | bc`" == "1" ] || [ "`echo "$kbytes_read_sec >= $crit_4" | bc -q`" == "1" ] || \
[ "`echo "$kbytes_written_sec >= $crit_5" | bc`" == "1" ] ); then
STATE="CRITICAL"
status=2
fi
fi
# Printing the results:
MSG="$STATE - I/O stats: Transfers/Sec=$tps Read Requests/Sec=$read_sec Write Requests/Sec=$written_sec KBytes Read/Sec=$kbytes_read_sec KBytes_Written/Sec=$kbytes_written_sec"
PERFDATA=" | total_io_sec=$tps; read_io_sec=$read_sec; write_io_sec=$written_sec; kbytes_read_sec=$kbytes_read_sec; kbytes_written_sec=$kbytes_written_sec;"
fi

#------------IO Test End-------------

#------------Queue Test-------------
if [ "$queue" == "1" ]; then
qsize=`echo "$TMPX" | awk '{print $8}'`
qlength=`echo "$TMPX" | awk '{print $9}'`

# "Converting" values to float (string replace , with .)
qsize=${qsize/,/.}
qlength=${qlength/,/.}

# Comparing the result and setting the correct level:
if [ "$warning" -ne "99999" ]; then
if ( [ "`echo "$qsize >= $warn_1" | bc`" == "1" ] || [ "`echo "$qlength >= $warn_2" | bc`" == "1" ] ); then
STATE="WARNING"
status=1
fi
fi
if [ "$critical" -ne "99999" ]; then
if ( [ "`echo "$qsize >= $crit_1" | bc`" == "1" ] || [ "`echo "$qlength >= $crit_2" | bc`" == "1" ] ); then
STATE="CRITICAL"
status=2
fi
fi


# Printing the results:
MSG="$STATE - Disk Queue Stats: Average Request Size=$qsize Average Queue Length=$qlength"
PERFDATA=" | qsize=$qsize; queue_length=$qlength;"
fi

#------------Queue Test End-------------

#------------Wait Time Test-------------

#Parse values. Warning - svc time will soon be deprecated and these will need to be changed. Future parser could look at first line (labels) to suggest correct column to return
if [ "$waittime" == "1" ]; then
avgwait=`echo "$TMPX" | awk '{print $10}'`
avgrwait=`echo "$TMPX" | awk '{print $11}'`
avgwwait=`echo "$TMPX" | awk '{print $12}'`
avgsvctime=`echo "$TMPX" | awk '{print $13}'`
avgcpuutil=`echo "$TMPX" | awk '{print $14}'`

# "Converting" values to float (string replace , with .)
avgwait=${avgwait/,/.}
avgrwait=${avgrwait/,/.}
avgwwait=${avgwwait/,/.}
avgsvctime=${avgsvctime/,/.}
avgcpuutil=${avgcpuutil/,/.}

# Comparing the result and setting the correct level:
if [ "$warning" -ne "99999" ]; then
if ( [ "`echo "$avgwait >= $warn_1" | bc`" == "1" ] || [ "`echo "$avgrwait >= $warn_2" | bc -q`" == "1" ] || \
[ "`echo "$avgwwait >= $warn_3" | bc`" == "1" ] || [ "`echo "$avgsvctime >= $warn_4" | bc -q`" == "1" ] || \
[ "`echo "$avgcpuutil >= $warn_5" | bc`" == "1" ] ); then
STATE="WARNING"
status=1
fi
fi
if [ "$critical" -ne "99999" ]; then
if ( [ "`echo "$avgwait >= $crit_1" | bc`" == "1" ] || [ "`echo "$avgrwait >= $crit_2" | bc -q`" == "1" ] || \
[ "`echo "$avgwwait >= $crit_3" | bc`" == "1" ] || [ "`echo "$avgsvctime >= $crit_4" | bc -q`" == "1" ] || \
[ "`echo "$avgcpuutil >= $crit_5" | bc`" == "1" ] ); then
STATE="CRITICAL"
status=2
fi
fi

# Printing the results:
MSG="$STATE - Wait Time Stats: Avg I/O Wait Time (ms)=$avgwait Avg Read Wait Time (ms)=$avgrwait Avg Write Wait Time (ms)=$avgwwait Avg Service Wait Time (ms)=$avgsvctime Avg CPU Utilization=$avgcpuutil"
PERFDATA=" | avg_io_waittime_ms=$avgwait; avg_r_waittime_ms=$avgrwait; avg_w_waittime_ms=$avgwwait; avg_service_waittime_ms=$avgsvctime; avg_cpu_utilization=$avgcpuutil;"
fi

#------------Wait Time End-------------

# now output the official result
echo -n "$MSG"
if [ "x$printperfdata" == "x1" ]; then echo -n "$PERFDATA"; fi
echo ""
exit $status
#----------/check_iostat.sh-----------
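The comment in the wait time test mentions that a future parser could read the label row to pick the right column. Here is a minimal sketch of that idea. It assumes the label row starts with "Device" (or "Device:") and that the labels are called await and %util, which depends on the sysstat version. The helper name iostat_field is made up, and the sample text is invented rather than real iostat output.

```shell
# Hypothetical helper: read "iostat -x" style text on stdin and print the
# value of the column whose header label is $2, for the device named $1.
# Prints nothing if the device or the label is not found.
iostat_field() {
    local device="$1" label="$2"
    awk -v dev="$device" -v want="$label" '
        # Header row: remember which position carries the wanted label.
        $1 ~ /^Device:?$/ {
            col = 0
            for (i = 1; i <= NF; i++) if ($i == want) col = i
            next
        }
        # Data row: keep the most recent match, i.e. the last report.
        $1 == dev && col { value = $col }
        END { if (value != "") print value }
    '
}

# Invented sample shaped like extended iostat output.
sample='Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 1.50 2.00 3.00 64.00 128.00 76.80 0.02 4.10 3.00 4.83 1.20 0.60'

await=$(printf '%s\n' "$sample" | iostat_field sda await)
util=$(printf '%s\n' "$sample" | iostat_field sda %util)
echo "await=$await util=$util"   # prints: await=4.10 util=0.60
```

Matching the device with an exact comparison also avoids the problem that grep sda would match sda1.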
Hi,

I had to make a few fixes and do some (minor) cleanup compared to the 0.0.4 version posted here.

The plugin works again now. As for SELinux, I will find out once I have created an RPM for our environment and done a test rollout :-)

Regards,
Frederic


----------check_iostat.sh-----------
#!/bin/bash
#
# Version 0.0.2 - Jan/2009
# Changes: added device verification
#
# by Thiago Varela - thiago@iplenix.com
#
# --------------------------------------
#
# Version 0.0.3 - Dec/2011
# Changes:
# - changed values from bytes to mbytes
# - fixed bug to get traffic data without comma but point
# - current values are displayed now, not average values (first run of iostat)
#
# by Philipp Niedziela - pn@pn-it.com
#
# Version 0.0.4 - April/2014
# Changes:
# - Allow Empty warn/crit levels
# - Can check I/O, WAIT Time, or Queue
#
# by Warren Turner
#
# Version 0.0.5 - Jun/2014
# Changes:
# - removed -y flag from call since iostat doesn't know about it any more (June 2014)
# - only needed executions of iostat are done now (save cpu time whenever you can)
# - fixed the obvious problems of missing input values (probably because of the now unimplemented "-y") with -x values
# - made performance data optional (I like to have choice in the matter)
#
# by Frederic Krueger / fkrueger-dev-checkiostat@holics.at
#

iostat=`which iostat 2>/dev/null`
bc=`which bc 2>/dev/null`

function help {
echo -e "
Usage:

-d =
--Device to be checked. Example: \"-d sda\"

-i = IO Check Mode
--Checks Total Disk IO, Read IO/Sec, Write IO/Sec, Bytes Read/Sec, Bytes Written/Sec
--warning/critical = Total IO,Read IO/Sec,Write IO/Sec,Bytes Read/Sec,Bytes Written/Sec

-q = Queue Mode
--Checks Disk Queue Lengths
--warning/critical = Total Queue Length,Read Queue Length,Write Queue Length

-W = Wait Time Mode
--Check the time for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
--warning/critical = Avg I/O Wait Time/ms,Read Wait Time/ms,Write Wait Time/ms

-p = Provide performance data for later graphing

-h = This help
"
exit -1
}

# Ensuring we have the needed tools:
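if [ -z "$iostat" ] || [ -z "$bc" ] || [ ! -x "$iostat" ] || [ ! -x "$bc" ]; then
echo -e "ERROR: You must have iostat and bc installed in order to run this plugin\n\tuse: apt-get install sysstat bc\n"
# exit from the main shell (not a subshell); 3 = UNKNOWN for Nagios
exit 3
fi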

io=0
queue=0
waittime=0
printperfdata=0
STATE="OK"
status=0

MSG=""
PERFDATA=""


# Getting parameters:
while getopts "d:w:c:io:pqu:Wt:h" OPT; do
case $OPT in
"d") disk=$OPTARG;;
"w") warning=$OPTARG;;
"c") critical=$OPTARG;;
"i") io=1;;
"p") printperfdata=1;;
"q") queue=1;;
"W") waittime=1;;
"h") help;;
esac
done

# Autofill if parameters are empty
if [ -z "$disk" ]
then disk=sda
fi

if [ -z "$warning" ]
then warning=99999
fi

if [ -z "$critical" ]
then critical=99999
fi


# Adjusting the warn and crit levels:
crit_total=`echo $critical | cut -d, -f1`
crit_read=`echo $critical | cut -d, -f2`
crit_written=`echo $critical | cut -d, -f3`
crit_kbytes_read=`echo $critical | cut -d, -f4`
crit_kbytes_written=`echo $critical | cut -d, -f5`

warn_total=`echo $warning | cut -d, -f1`
warn_read=`echo $warning | cut -d, -f2`
warn_written=`echo $warning | cut -d, -f3`
warn_kbytes_read=`echo $warning | cut -d, -f4`
warn_kbytes_written=`echo $warning | cut -d, -f5`


## # Checking parameters:
# [ ! -b "/dev/$disk" ] && echo "ERROR: Device incorrectly specified" && help

# ( [ "$warn_total" == "" ] || [ "$warn_read" == "" ] || [ "$warn_written" == "" ] || \
# [ "$crit_total" == "" ] || [ "$crit_read" == "" ] || [ "$crit_written" == "" ] ) &&
# echo "ERROR: You must specify all warning and critical levels" && help

# ( [[ "$warn_total" -ge "$crit_total" ]] || \
# [[ "$warn_read" -ge "$crit_read" ]] || \
# [[ "$warn_written" -ge "$crit_written" ]] ) && \
# echo "ERROR: critical levels must be higher than warning levels" && help


# iostat parameters:
# -m: megabytes
# -k: kilobytes
# first run of iostat shows statistics since last reboot, second one shows current values of hdd

# Doing the actual checks:


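# -d has the total per second, -x the rest.
# Two reports are requested and only the last line is kept: without -y the
# first report holds the averages since boot, the second the current values.
TMPD=`$iostat $disk -k -d 2 2 | grep $disk | tail -n 1`
TMPX=`$iostat $disk -x -k -d 2 2 | grep $disk | tail -n 1`

## IO Check ##
if [ "$io" == "1" ]
then
total=`echo "$TMPD" | awk '{print $2}'`
read_sec=`echo "$TMPX" | awk '{print $4}'`
written_sec=`echo "$TMPX" | awk '{print $5}'`
# the kB/s read and write rates are columns 6 and 7 of the extended (-x -k) report
kbytes_read_sec=`echo "$TMPX" | awk '{print $6}'`
kbytes_written_sec=`echo "$TMPX" | awk '{print $7}'`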

# IO # "Converting" values to float (string replace , with .)
total=${total/,/.}
read_sec=${read_sec/,/.}
written_sec=${written_sec/,/.}
kbytes_read_sec=${kbytes_read_sec/,/.}
kbytes_written_sec=${kbytes_written_sec/,/.}

# IO # Comparing the result and setting the correct level:

if [ "$warn_total" -ne "99999" ]
then
if ( [ "`echo "$total >= $warn_total" | bc`" == "1" ] || [ "`echo "$read_sec >= $warn_read" | bc`" == "1" ] || \
[ "`echo "$written_sec >= $warn_written" | bc`" == "1" ] || [ "`echo "$kbytes_read_sec >= $warn_kbytes_read" | bc -q`" == "1" ] || \
[ "`echo "$kbytes_written_sec >= $warn_kbytes_written" | bc`" == "1" ] )
then
STATE="WARNING"
status=1
fi
fi

if [ "$crit_total" -ne "99999" ]
then
if ( [ "`echo "$total >= $crit_total" | bc`" == "1" ] || [ "`echo "$read_sec >= $crit_read" | bc -q`" == "1" ] || \
[ "`echo "$written_sec >= $crit_written" | bc`" == "1" ] || [ "`echo "$kbytes_read_sec >= $crit_kbytes_read" | bc -q`" == "1" ] || \
[ "`echo "$kbytes_written_sec >= $crit_kbytes_written" | bc`" == "1" ] )
then
STATE="CRITICAL"
status=2
fi
fi

if [ "$crit_total" == "99999" ] && [ "$warn_total" == "99999" ]
then
STATE="OK"
status=0

fi

# IO # Printing the results:
MSG="$STATE - I/O stats: Total IO/Sec=$total Read IO/Sec=$read_sec Write IO/Sec=$written_sec KBytes Read/Sec=$kbytes_read_sec KBytes_Written/Sec=$kbytes_written_sec"
PERFDATA=" | total_io_sec=$total; read_io_sec=$read_sec; write_io_sec=$written_sec; kbytes_read_sec=$kbytes_read_sec; kbytes_written_sec=$kbytes_written_sec;"

fi


## QUEUE Check ##
if [ "$queue" == "1" ]
then
total=`echo "$TMPX" | awk '{print $8}'`
readq_sec=`echo "$TMPX" | awk '{print $6}'`
writtenq_sec=`echo "$TMPX" | awk '{print $7}'`

# QUEUE # "Converting" values to float (string replace , with .)
total=${total/,/.}
readq_sec=${readq_sec/,/.}
writtenq_sec=${writtenq_sec/,/.}


# QUEUE # Comparing the result and setting the correct level:

if [ "$warn_total" -ne "99999" ]
then
if ( [ "`echo "$total >= $warn_total" | bc`" == "1" ] || [ "`echo "$readq_sec >= $warn_read" | bc`" == "1" ] || \
[ "`echo "$writtenq_sec >= $warn_written" | bc`" == "1" ] )
then
STATE="WARNING"
status=1
fi
fi

if [ "$crit_total" -ne "99999" ]
then
if ( [ "`echo "$total >= $crit_total" | bc`" == "1" ] || [ "`echo "$readq_sec >= $crit_read" | bc -q`" == "1" ] || \
[ "`echo "$writtenq_sec >= $crit_written" | bc`" == "1" ] )
then
STATE="CRITICAL"
status=2
fi
fi

if [ "$crit_total" == "99999" ] && [ "$warn_total" == "99999" ]
then
STATE="OK"
status=0

fi

# QUEUE # Printing the results:
MSG="$STATE - Disk Queue Stats: Average Queue Length=$total Read Queue/Sec=$readq_sec Write Queue/Sec=$writtenq_sec"
PERFDATA=" | total=$total; read_queue_sec=$readq_sec; write_queue_sec=$writtenq_sec;"
fi





## WAIT TIME Check ##
if [ "$waittime" == "1" ]
then
# reuse the extended report collected above instead of running iostat again
TMP="$TMPX"
avgiotime=`echo "$TMP" | awk '{print $10}'`
avgsvctime=`echo "$TMP" | awk '{print $11}'`
avgcpuutil=`echo "$TMP" | awk '{print $12}'`

# WAIT TIME # "Converting" values to float (string replace , with .)
avgiotime=${avgiotime/,/.}
avgsvctime=${avgsvctime/,/.}
avgcpuutil=${avgcpuutil/,/.}

# WAIT TIME # Comparing the result and setting the correct level:

if [ "$warn_total" -ne "99999" ]
then
if ( [ "`echo "$avgiotime >= $warn_total" | bc`" == "1" ] || [ "`echo "$avgsvctime >= $warn_read" | bc`" == "1" ] || \
[ "`echo "$avgcpuutil >= $warn_written" | bc`" == "1" ] )
then
STATE="WARNING"
status=1
fi
fi

if [ "$crit_total" -ne "99999" ]
then
if ( [ "`echo "$avgiotime >= $crit_total" | bc`" == "1" ] || [ "`echo "$avgsvctime >= $crit_read" | bc -q`" == "1" ] || \
[ "`echo "$avgcpuutil >= $crit_written" | bc`" == "1" ] )
then
STATE="CRITICAL"
status=2
fi
fi

if [ "$crit_total" == "99999" ] && [ "$warn_total" == "99999" ]
then
STATE="OK"
status=0

fi

# WAIT TIME # Printing the results:
MSG="$STATE - Wait Time Stats: Avg I/O Wait Time/ms=$avgiotime Avg Service Wait Time/ms=$avgsvctime Avg CPU Utilization=$avgcpuutil"
PERFDATA=" | avg_io_waittime_ms=$avgiotime; avg_service_waittime_ms=$avgsvctime; avg_cpu_utilization=$avgcpuutil;"
fi

# now output the official result
echo -n "$MSG"
if [ "x$printperfdata" == "x1" ]; then echo -n "$PERFDATA"; fi
echo ""
exit $status
----------/check_iostat.sh-----------
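If you also want the thresholds in the performance data, the Nagios plugin guidelines describe each item as label=value;warn;crit, with items separated by spaces. A rough sketch follows. perf_item is a hypothetical helper and not part of the script above, the numbers are made up, and 99999 is treated as "no threshold" only because that is the placeholder this script uses.

```shell
# Hypothetical helper: print one perfdata item as label=value;warn;crit.
# A threshold of 99999 (the script's "not set" placeholder) becomes an
# empty field, which the perfdata format allows.
perf_item() {
    local label="$1" value="$2" warn="$3" crit="$4"
    if [ "$warn" = "99999" ]; then warn=""; fi
    if [ "$crit" = "99999" ]; then crit=""; fi
    printf '%s=%s;%s;%s' "$label" "$value" "$warn" "$crit"
}

# Made-up values, only to show the resulting line.
PERFDATA=" | $(perf_item total_io_sec 12.50 100 200) $(perf_item read_io_sec 3.00 99999 99999)"
echo "OK - I/O stats: Total IO/Sec=12.50 Read IO/Sec=3.00$PERFDATA"
# prints: OK - I/O stats: Total IO/Sec=12.50 Read IO/Sec=3.00 | total_io_sec=12.50;100;200 read_io_sec=3.00;;
```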
by EndlessTundra, April 25, 2014
Hey everyone, this script is very nice, but it had a few irritating quirks, so I reworked it and added:

- Allow empty Warning/Critical values
- Added Modes so that you can check Disk IOs, Disk Queue, or Disk Wait Times

- To see the usage information use check_diskio.sh -h
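The empty-threshold handling can be sketched like this: each level is one field of a comma-separated -w or -c argument, and a field that is missing or empty falls back to 99999, the value the script treats as "not set". level_field is a hypothetical helper name and the argument value is invented. The script itself uses cut for the splitting, so this is an illustration and not the code below.

```shell
# Hypothetical helper: print field number $2 of the comma-separated list
# in $1, or 99999 when that field is missing or empty.
level_field() {
    printf '%s\n' "$1" | awk -F, -v n="$2" '
        { v = $n }
        END { print (v == "" ? "99999" : v) }
    '
}

warning="100,,50"    # invented example: second level left empty
warn_total=$(level_field "$warning" 1)
warn_read=$(level_field "$warning" 2)
warn_written=$(level_field "$warning" 3)
warn_kbytes_read=$(level_field "$warning" 4)
echo "$warn_total $warn_read $warn_written $warn_kbytes_read"
# prints: 100 99999 50 99999
```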

Sorry, I don't have this hosted anywhere on the web, so I'm just going to paste it here:



#!/bin/bash
#
# Version 0.0.2 - Jan/2009
# Changes: added device verification
#
# by Thiago Varela - thiago@iplenix.com
#
# --------------------------------------
#
# Version 0.0.3 - Dec/2011
# Changes:
# - changed values from bytes to mbytes
# - fixed bug to get traffic data without comma but point
# - current values are displayed now, not average values (first run of iostat)
#
# by Philipp Niedziela - pn@pn-it.com
#
# Version 0.0.4 - April/2014
# Changes:
# - Allow Empty warn/crit levels
# - Can check I/O, WAIT Time, or Queue
#
# by Warren Turner

iostat=`which iostat 2>/dev/null`
bc=`which bc 2>/dev/null`

function help {
echo -e "
Usage:

-d =
--Device to be checked. Example: \"-d sda\"

-i = IO Check Mode
--Checks Total Disk IO, Read IO/Sec, Write IO/Sec, Bytes Read/Sec, Bytes Written/Sec
--warning/critical = Total IO,Read IO/Sec,Write IO/Sec,Bytes Read/Sec,Bytes Written/Sec

-q = Queue Mode
--Checks Disk Queue Lengths
--warning/critical = Total Queue Length,Read Queue Length,Write Queue Length

-W = Wait Time Mode
--Check the time for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
--warning/critical = Avg I/O Wait Time/ms,Read Wait Time/ms,Write Wait Time/ms

"
exit -1
}

# Ensuring we have the needed tools:
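if [ -z "$iostat" ] || [ -z "$bc" ] || [ ! -x "$iostat" ] || [ ! -x "$bc" ]; then
echo -e "ERROR: You must have iostat and bc installed in order to run this plugin\n\tuse: apt-get install sysstat bc\n"
# exit from the main shell (not a subshell); 3 = UNKNOWN for Nagios
exit 3
fi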

io=0
queue=0
waittime=0
msg="OK"
status=0

# Getting parameters:
while getopts "d:w:c:io:qu:Wt:h" OPT; do
case $OPT in
"d") disk=$OPTARG;;
"w") warning=$OPTARG;;
"c") critical=$OPTARG;;
"i") io=1;;
"q") queue=1;;
"W") waittime=1;;
"h") help;;
esac
done

# Autofill if parameters are empty
if [ -z "$disk" ]
then disk=sda
fi

if [ -z "$warning" ]
then warning=99999
fi

if [ -z "$critical" ]
then critical=99999
fi


# Adjusting the warn and crit levels:
crit_total=`echo $critical | cut -d, -f1`
crit_read=`echo $critical | cut -d, -f2`
crit_written=`echo $critical | cut -d, -f3`
crit_kbytes_read=`echo $critical | cut -d, -f4`
crit_kbytes_written=`echo $critical | cut -d, -f5`

warn_total=`echo $warning | cut -d, -f1`
warn_read=`echo $warning | cut -d, -f2`
warn_written=`echo $warning | cut -d, -f3`
warn_kbytes_read=`echo $warning | cut -d, -f4`
warn_kbytes_written=`echo $warning | cut -d, -f5`


# # Checking parameters:
# [ ! -b "/dev/$disk" ] && echo "ERROR: Device incorrectly specified" && help

# ( [ "$warn_total" == "" ] || [ "$warn_read" == "" ] || [ "$warn_written" == "" ] || \
# [ "$crit_total" == "" ] || [ "$crit_read" == "" ] || [ "$crit_written" == "" ] ) &&
# echo "ERROR: You must specify all warning and critical levels" && help

# ( [[ "$warn_total" -ge "$crit_total" ]] || \
# [[ "$warn_read" -ge "$crit_read" ]] || \
# [[ "$warn_written" -ge "$crit_written" ]] ) && \
# echo "ERROR: critical levels must be higher than warning levels" && help


# iostat parameters:
# -m: megabytes
# -k: kilobytes
# first run of iostat shows statistics since last reboot, second one shows current values of hdd

# Doing the actual checks:


## IO Check ##
if [ "$io" == "1" ]
then
total=`$iostat $disk -y -k -d 2 1 | grep $disk | awk '{print $2}'`
read_sec=`$iostat $disk -x -y -k -d 2 1 | grep $disk | awk '{print $4}'`
written_sec=`$iostat $disk -x -y -k -d 2 1 | grep $disk | awk '{print $5}'`
kbytes_read_sec=`$iostat $disk -x -y -k -d 2 1 | grep $disk | awk '{print $6}'`
kbytes_written_sec=`$iostat $disk -x -y -k -d 2 1 | grep $disk | awk '{print $7}'`


# IO # "Converting" values to float (string replace , with .)
total=${total/,/.}
read_sec=${read_sec/,/.}
written_sec=${written_sec/,/.}
kbytes_read_sec=${kbytes_read_sec/,/.}
kbytes_written_sec=${kbytes_written_sec/,/.}


# IO # Comparing the result and setting the correct level:

if [ "$warn_total" -ne "99999" ]
then
if ( [ "`echo "$total >= $warn_total" | bc`" == "1" ] || [ "`echo "$read_sec >= $warn_read" | bc`" == "1" ] || \
[ "`echo "$written_sec >= $warn_written" | bc`" == "1" ] || [ "`echo "$kbytes_read_sec >= $warn_kbytes_read" | bc -q`" == "1" ] || \
[ "`echo "$kbytes_written_sec >= $warn_kbytes_written" | bc`" == "1" ] )
then
msg="WARNING"
status=1
fi
fi

if [ "$crit_total" -ne "99999" ]
then
if ( [ "`echo "$total >= $crit_total" | bc`" == "1" ] || [ "`echo "$read_sec >= $crit_read" | bc -q`" == "1" ] || \
[ "`echo "$written_sec >= $crit_written" | bc`" == "1" ] || [ "`echo "$kbytes_read_sec >= $crit_kbytes_read" | bc -q`" == "1" ] || \
[ "`echo "$kbytes_written_sec >= $crit_kbytes_written" | bc`" == "1" ] )
then
msg="CRITICAL"
status=2
fi
fi

if [ "$crit_total" == "99999" ] && [ "$warn_total" == "99999" ]
then
msg="OK"
status=0

fi

# IO # Printing the results:
echo "$msg - I/O stats: Total IO/Sec=$total Read IO/Sec=$read_sec Write IO/Sec=$written_sec KBytes Read/Sec=$kbytes_read_sec KBytes_Written/Sec=$kbytes_written_sec | 'Total IO/Sec'=$total; 'Read IO/Sec'=$read_sec; 'Write IO/Sec'=$written_sec; 'KBytes Read/Sec'=$kbytes_read_sec; 'KBytes_Written/Sec'=$kbytes_written_sec;"

fi


## QUEUE Check ##
if [ "$queue" == "1" ]
then
total=`$iostat $disk -x -y -k -d 2 1 | grep $disk | awk '{print $8}'`
read_sec=`$iostat $disk -x -y -k -d 2 1 | grep $disk | awk '{print $2}'`
written_sec=`$iostat $disk -x -y -k -d 2 1 | grep $disk | awk '{print $3}'`


# QUEUE # "Converting" values to float (string replace , with .)
total=${total/,/.}
read_sec=${read_sec/,/.}
written_sec=${written_sec/,/.}



# QUEUE # Comparing the result and setting the correct level:

if [ "$warn_total" -ne "99999" ]
then
if ( [ "`echo "$total >= $warn_total" | bc`" == "1" ] || [ "`echo "$read_sec >= $warn_read" | bc`" == "1" ] || \
[ "`echo "$written_sec >= $warn_written" | bc`" == "1" ] )
then
msg="WARNING"
status=1
fi
fi

if [ "$crit_total" -ne "99999" ]
then
if ( [ "`echo "$total >= $crit_total" | bc`" == "1" ] || [ "`echo "$read_sec >= $crit_read" | bc -q`" == "1" ] || \
[ "`echo "$written_sec >= $crit_written" | bc`" == "1" ] )
then
msg="CRITICAL"
status=2
fi
fi

if [ "$crit_total" == "99999" ] && [ "$warn_total" == "99999" ]
then
msg="OK"
status=0

fi

# QUEUE # Printing the results:
echo "$msg - Disk Queue Stats: Average Queue Length=$total Read Queue/Sec=$read_sec Write Queue/Sec=$written_sec | 'total'=$total; 'Read Queue/Sec'=$read_sec; 'Write Queue/Sec'=$written_sec;"

fi



## WAIT TIME Check ##
if [ "$waittime" == "1" ]
then
total=`$iostat $disk -x -y -k -d 2 1 | grep $disk | awk '{print $10}'`
read_sec=`$iostat $disk -x -y -k -d 2 1 | grep $disk | awk '{print $11}'`
written_sec=`$iostat $disk -x -y -k -d 2 1 | grep $disk | awk '{print $12}'`


# WAIT TIME # "Converting" values to float (string replace , with .)
total=${total/,/.}
read_sec=${read_sec/,/.}
written_sec=${written_sec/,/.}


# WAIT TIME # Comparing the result and setting the correct level:

if [ "$warn_total" -ne "99999" ]
then
if ( [ "`echo "$total >= $warn_total" | bc`" == "1" ] || [ "`echo "$read_sec >= $warn_read" | bc`" == "1" ] || \
[ "`echo "$written_sec >= $warn_written" | bc`" == "1" ] )
then
msg="WARNING"
status=1
fi
fi

if [ "$crit_total" -ne "99999" ]
then
if ( [ "`echo "$total >= $crit_total" | bc`" == "1" ] || [ "`echo "$read_sec >= $crit_read" | bc -q`" == "1" ] || \
[ "`echo "$written_sec >= $crit_written" | bc`" == "1" ] )
then
msg="CRITICAL"
status=2
fi
fi

if [ "$crit_total" == "99999" ] && [ "$warn_total" == "99999" ]
then
msg="OK"
status=0

fi

# WAIT TIME # Printing the results:
echo "$msg - Wait Time Stats: Avg I/O Wait Time/ms=$total Avg Read Wait Time/ms=$read_sec Avg Write Wait Time/ms=$written_sec | 'Avg I/O Wait Time/ms'=$total; 'Avg Read Wait Time/ms'=$read_sec; 'Avg Write Wait Time/ms'=$written_sec;"

fi

exit $status
Hi,

I have posted an updated version of this script here:
http://exchange.nagios.org/directory/Plugins/Operating-Systems/Linux/check_iostat--2D-I-2FO-statistics--2D-updated-2014/details

The script fixes the bugs mentioned in other posts, adds await (how long the system spends waiting to write to disk) to the output, and adds a pnp4nagios graphing template.
by amateo, June 12, 2013
I have created a patch that sits between the original version and philippn's. This patch:

* Runs iostat just once.
* Avoids the conversion between '.' and ',' by running iostat with LANG=C.
* Gets the current values, not the averages since the last reboot.
* Runs under bash.

This is the patch:

Index: check_iostat
===================================================================
--- check_iostat (revision 11002)
+++ check_iostat (working copy)
@@ -1,9 +1,20 @@
-#!/bin/sh
+#!/bin/bash
#
# Version 0.0.2 - Jan/2009
# Changes: added device verification
+#
+# by Thiago Varela - thiago@iplenix.com
#
-# by Thiago Varela - thiago@iplenix.com
+# --------------------------------------
+#
+# Version 0.0.3 - Dec/2011
+# Changes:
+# - changed values from bytes to mbytes
+# - fixed bug to get traffic data without comma but point
+# - current values are displayed now, not average values (first run of iostat)
+#
+# by Philipp Niedziela - pn@pn-it.com
+#

iostat=`which iostat 2>/dev/null`
bc=`which bc 2>/dev/null`
@@ -50,14 +61,19 @@
echo "ERROR: critical levels must be highter than warning levels" && help


+# iostat parameters:
+# -m: megabytes
+# -k: kilobytes
+# first run of iostat shows statistics since last reboot, second one shows current vaules of hdd
# Doing the actual check:
-tps=`$iostat $disk | grep $disk | awk '{print $2}'`
-kbread=`$iostat $disk | grep $disk | awk '{print $3}'`
-kbwritten=`$iostat $disk | grep $disk | awk '{print $4}'`
+# We get just 2nd line, which is the actual value
+output=$(LANG=C $iostat $disk -d 1 2 | grep $disk | sed -n '2p')
+tps=$(echo "$output" | awk '{print $2}')
+kbread=$(echo "$output" | awk '{print $3}')
+kbwritten=$(echo "$output" | awk '{print $4}')

-
# Comparing the result and setting the correct level:
-if ( [ "`echo "$tps >= $crit_tps" | bc`" == "1" ] || [ "`echo "$kbread >= $crit_read" | bc`" == "1" ] || \
+if ( [ "`echo "$tps >= $crit_tps" | bc`" == "1" ] || [ "`echo "$kbread >= $crit_read" | bc -q`" == "1" ] || \
[ "`echo "$kbwritten >= $crit_written" | bc`" == "1" ] ); then
msg="CRITICAL"
status=2
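The idea behind the patch (one iostat run, keep only the second report) can be sketched as a single awk pass. This assumes the device report has tps, kB_read/s and kB_wrtn/s in columns 2 to 4, as in the patch. last_report is a made-up helper name, and the sample text is invented, not real output.

```shell
# Hypothetical helper: read "iostat -d" text on stdin and print
# "tps kB_read/s kB_wrtn/s" from the LAST report for device $1.
last_report() {
    awk -v dev="$1" '
        $1 == dev { tps = $2; rd = $3; wr = $4; seen = 1 }
        END { if (seen) print tps, rd, wr }
    '
}

# Invented sample shaped like two consecutive device reports.
sample='Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               5.20        40.10        12.30     900000     300000

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda              17.00       128.00         0.00        128          0'

# In the plugin the input would come from: LANG=C $iostat $disk -d 1 2
set -- $(printf '%s\n' "$sample" | last_report sda)
tps=$1 kbread=$2 kbwritten=$3
echo "tps=$tps kbread=$kbread kbwritten=$kbwritten"
# prints: tps=17.00 kbread=128.00 kbwritten=0.00
```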
by krzych, February 19, 2013
Has anybody used this together with NRPE and SELinux? What type of context should it have?
philippn's changes made this script useful. Without those changes, the averages this check provides by default are fairly worthless.
by konstantin, May 14, 2012
Hi,

I want to add two hints. First, replacing the comma with a point is not needed. Just export LANG=C in the script, and the output of iostat will use decimal points.

Second, I would suggest using #!/bin/bash as the interpreter, because /bin/sh is linked to /bin/dash in newer distributions, and this script will not work under dash.
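A small sketch of the first hint, with two swaps stated plainly: it sets LC_ALL instead of LANG (LC_ALL takes precedence over both LANG and LC_NUMERIC), and it compares with awk instead of the plugin's bc call. float_ge is a hypothetical helper name and the numbers are made up.

```shell
# Force the C locale so child processes such as iostat print "4.10"
# rather than "4,10"; no comma-to-point replacement is needed afterwards.
export LC_ALL=C

# Hypothetical helper: succeed (exit status 0) when $1 >= $2 as floats.
# awk stands in here for the bc comparison used by the plugin.
float_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN { exit !((a + 0) >= (b + 0)) }'
}

if float_ge 4.10 4.1; then echo "at or above threshold"; fi
if ! float_ge 0.60 5; then echo "below threshold"; fi
```

The helper itself avoids bash-only syntax, so it would also run under dash, although the plugin as posted still needs bash.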
On Ubuntu this has to be run as a bash script. You also need 'bc' installed on the system.
by mguthrie, February 2, 2012
Gave me exactly what I needed, thanks!
by philippn, December 2, 2011
2 of 2 people found this review helpful
I've changed it a bit to get it working on my server (performance data in MB; showing the current read/write values, not the average values since the last restart)

http://www.pn-it.com/wp-content/uploads/2011/12/check_iostat / http://www.pn-it.com/linux-ubuntu/nagios-festplatten-mit-check_iostat-uberwachen/
by kforbus, June 9, 2010
4 of 4 people found this review helpful
Very nice plugin. Only change I made was adding "-k" to the lines:
kbread=`$iostat $disk -k | grep $disk | awk '{print $3}'`
kbwritten=`$iostat $disk -k | grep $disk | awk '{print $4}'`

This is because the plugin appears to return blocks read and written per second instead of kilobytes read and written per second. The "-k" option for iostat fixes this.
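If you are stuck with block counts, the conversion is simple arithmetic. This sketch assumes 512-byte blocks, which is how I remember the iostat man page describing them for 2.4 and later kernels, so check your sysstat version before relying on it. blocks_to_kb is a made-up helper name.

```shell
# Hypothetical helper: convert blocks/s to kB/s, ASSUMING 512-byte blocks.
blocks_to_kb() {
    LC_ALL=C awk -v b="$1" 'BEGIN { printf "%.2f\n", b * 512 / 1024 }'
}

blocks_to_kb 256.00   # prints: 128.00
```

Passing -k to iostat, as described above, is still the simpler fix.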
by apapillon, April 29, 2010
2 of 2 people found this review helpful
Change these lines:
# Doing the actual check:
tps=`$iostat $disk | grep $disk | awk '{print $2}' | sed -e 's/,/./g'`
kbread=`$iostat $disk | grep $disk | awk '{print $3}' | sed -e 's/,/./g'`
kbwritten=`$iostat $disk | grep $disk | awk '{print $4}' | sed -e 's/,/./g'`