check_snmp_cisco_wlc is a Nagios plugin to monitor the status of Cisco Wireless LAN Controller (formerly Airespace) access points.

Author:
=======

Martin Fuerstenau, Oce Printing Systems GmbH, martin.fuerstenau_at_oce.com

History and Changes:
====================

- 10 Aug 2012 Version 1
    - First released version.

- 04 Dec 2013 Version 1.1
    - Added SNMPv3 support. Thanks to Mihail Karageorgiev.
    - Fixed last 3 bytes AP.

- 26 Jun 2014 Version 1.2
    - Fixed some small issues in help and usage.
    - Added blacklist support (-B|--blacklist) for AP names. The blacklist is a
      case-sensitive, comma-separated list. If used with --isregexp, every item
      of the list is interpreted as a regular expression.

- 21 Aug 2014 Version 1.3
    - Bugfix for blacklisted items. A blacklisted AP was still written to the
      AP list because it was still in the hash storing all elements. Now it is
      deleted from the hash instead of only being skipped.

- 18 Apr 2016 Version 1.4
    - Added --showerror_only. This will only show WLCs causing trouble.

- 13 Dec 2017 Version 1.5
    - Added the number of APs to the output.

Syntax:
=======

./check_snmp_cisco_wlc -H -C

General:
========

Cisco Wireless LAN Controller (WLC) is in some parts a little tricky to monitor.
At present this plugin focuses on the availability of the access points (APs).
The plugin tests the status of each AP. If an AP is downloading, it is not
available, and the plugin gives a warning alert. If it is disassociated, the
plugin gives a critical alert.

If a new AP joins the WLC, it is automatically added with a default name
(ap_name.MAC-address). The plugin detects this and gives a warning. This warning
disappears once the AP is configured and has a "real" name.

The main problem in monitoring APs is to get an alert when an AP breaks down or
is powered off. Normally this is not a monitorable event, because the AP simply
disappears from the WLC and is back after power-on. There is no "offline" status
to monitor.
One method to solve this is to hand over the number of APs. The other, more
flexible method is to compare the current data with historical data. For this
the plugin keeps a file that caches the old results (variable $plugin_cache
around line 88). In my case the cache directory is a tmpfs, to speed up access
to the cached results.

The plugin compares the old data with the current data:

- If there is no old data (first check), the current data is stored and used as
  old data on the next run.
- If the old data is a subset of the current data, the old data is overwritten
  with the current data.
- If there are APs in the old data but not in the current data, a critical
  alert is raised.

To reset the alarm, the cached (old) data can be removed by hand (but I am too
lazy for this), by calling the plugin with option -r, with the host address and
without community string (does the same), or via a trick by acknowledging the
problem.

Installation
============

Just copy it to your plugin directory. It is not recommended to mix your own and
third-party plugins with the standard ones. Check the Perl path at the head of
the script and change the plugin cache directory path to your needs.

Here is how it works
====================

1. Options
----------

check_snmp_cisco_wlc_1.4 --help

    This monitoring plugin is free software, and comes with ABSOLUTELY NO WARRANTY.
    It may be used, redistributed and/or modified under the terms of the GNU
    General Public Licence (see http://www.fsf.org/licensing/licenses/gpl.txt).

    Usage: check_snmp_cisco_wlc_1.4 [ -H ] [ -t ] [ -r] [ -C|--community= ]
           [ -v|--snmpversion=<1|2c|3> ] [ -u|--username ] [ -a|--authpassword ]
           [ -p|--privpassword ] [ -A|--authprotocol= ] [ -P|--privprotocol= ]
           [--port=] [--showerror] [--multiline]

    This plugin checks the status of the Access Points for a Cisco Wireless Lan
    Controller (WLC).

     -h, --help
        Print detailed help screen
     -V, --version
        Print version information
     -H, --hostaddress=STRING
        Host address/IP address to use for the check.
     -C, --community=STRING
        SNMP community that should be used to access the switch.
     -v, --snmpversion=STRING
        Possible values are 1 or 2c or 3.
     -u, --username=STRING
        SNMPv3 username.
     -a, --authpassword=STRING
        SNMPv3 authpassword.
     -p, --privpassword=STRING
        SNMPv3 privpassword.
     -A, --authprotocol=STRING
        SNMPv3 authprotocol SHA|MD5.
     -P, --privprotocol=STRING
        SNMPv3 privprotocol AES|DES.
         --port=INTEGER
        If other than 161 (default) is used.
     -t, --timeout=INTEGER
        Seconds before plugin times out (default: 15)
     -r
        Recover - It kicks out old collected data so that the next check is ok.
     -B, --exclude=
        Blacklist Access Points. This means a comma-separated list.
        BEWARE! Blacklist is case sensitive.
         --isregexp
        Treat blacklist as regexp
         --multiline
        Multiline output in overview. Technically this means that multiline
        output uses an HTML <br> for the GUI instead of \n.
        Be aware that your messaging connections (email, SMS...) must use a
        filter to filter out the <br>. A sed one-liner will do the job.
         --showerror
        Multiline output in overview in case of an error. Uses <br>. See above.
         --showerror_only
        This will only show WLCs causing trouble.

2. Command definition
---------------------

define command{
    command_name    check_cisco_wlc
    command_line    /usr/lib/nagios/my_plugins/check_snmp_cisco_wlc -H $HOSTADDRESS$ -C $ARG1$ --showerror
    }

3. Service check definition
---------------------------

define service{
    active_checks_enabled           1
    passive_checks_enabled          1
    parallelize_check               1
    obsess_over_service             1
    check_freshness                 0
    notifications_enabled           1
    event_handler_enabled           1
    flap_detection_enabled          1
    process_perf_data               1
    retain_status_information       1
    retain_nonstatus_information    1
    host_name                       cisco-wlc
    service_description             AccessPoints
    is_volatile                     0
    check_period                    24x7
    max_check_attempts              5
    normal_check_interval           5
    retry_check_interval            2
    contact_groups                  network-adm,wlc-recover
    notification_interval           1440
    notification_period             24x7
    notification_options            c,w,r
    check_command                   check_cisco_wlc!public
    }

The contactgroup wlc-recover is important. A direct contact is also possible.

4. contactgroup wlc-recover
---------------------------

This contactgroup contains only one member:

define contactgroup{
    contactgroup_name    wlc-recover
    alias                Removes WLC historic data
    members              wlc-recover
    }

5. contact wlc-recover
----------------------

Look at the service_notification_commands. From service_notification_options we
only need option r, but unfortunately sending a notification only for recovery
is not possible.

define contact{
    contact_name                     wlc-recover
    alias                            wlc-recover
    service_notification_period      24x7
    host_notification_period         24x7
    service_notification_options     c,w,r
    host_notification_options        n
    service_notification_commands    recover-cisco-wlc
    host_notification_commands       host-notify-by-email
    email                            dummy@dummy.com
    }

6. Definition of recover-cisco-wlc
----------------------------------

This definition calls a wrapper shell script because we must filter out
notifications of all types other than r:

define command{
    command_name    recover-cisco-wlc
    command_line    /usr/lib/nagios/my_plugins/wlc-recover "$NOTIFICATIONTYPE$" "$HOSTADDRESS$"
    }

7. The little wrapper script
----------------------------

#!/bin/bash
# Called for every service notification; only recoveries are acted upon.
NOTIFICATIONTYPE=$1
HOSTADDRESS=$2

if [ "$NOTIFICATIONTYPE" = "RECOVERY" ]
then
    # Remove the cached AP data so that the next check is ok.
    /usr/lib/nagios/my_plugins/check_snmp_cisco_wlc -H "$HOSTADDRESS" -r
fi

With this trick a member of the contactgroup network-adm can acknowledge the
problem (which means "Yeah - I kicked out the AP. It's ok") and reset it to
green.
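
The cache comparison described under "General" could be sketched in shell
roughly as follows. This is only an illustration and not the plugin's own Perl
code: the function name compare_ap_lists, the file format (one AP name per
line) and the output texts are assumptions made for this example.

```shell
#!/bin/sh
# Sketch only: compare_ap_lists and its file format are hypothetical and
# are not taken from the plugin itself.
# Usage: compare_ap_lists CACHE_FILE CURRENT_FILE
# Both files hold one AP name per line.
# Return code follows the Nagios convention: 0 = OK, 2 = CRITICAL.
compare_ap_lists() {
    cache_file=$1
    current_file=$2

    # First check: no old data yet, so store the current list for the next run.
    if [ ! -f "$cache_file" ]; then
        cp "$current_file" "$cache_file"
        echo "OK - cache created"
        return 0
    fi

    # APs that are in the old data but not in the current data
    # (whole-line, fixed-string comparison), joined with spaces.
    missing=$(grep -Fxv -f "$current_file" "$cache_file" | paste -s -d ' ' -)

    if [ -n "$missing" ]; then
        # The cache is left untouched, so the alert stays until it is reset.
        echo "CRITICAL - missing APs: $missing"
        return 2
    fi

    # Old data is a subset of the current data: overwrite the cache.
    cp "$current_file" "$cache_file"
    echo "OK - no AP missing"
    return 0
}
```

Because the cache is not overwritten while an AP is missing, the critical
state persists until the cache file is removed, which is the job option -r
does in the real plugin.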