box293_check_vmware


Rating: 6 votes
Favoured: 1
Current Version: 2014-12-13
Last Release Date: 2014-12-13
Compatible With:
  • Nagios 3.x
  • Nagios 4.x
  • Nagios XI
License: GPL
Hits: 15566
Files:
  • box293_check_vmware.zip: Plugin and Manual
  • Manual.pdf: Manual

This Plugin allows you to monitor a VMware vCenter / ESX(i) environment using your Nagios monitoring solution.

IMPORTANT:
This Plugin is NOT designed to be run on your Nagios host. Instead, it is offloaded to the VMware vSphere Management Assistant (vMA). This is because the VMware SDK has performance issues that can easily overload your Nagios host.

How all of this works is explained in the manual including full instructions to get you up and running as quickly as possible.

The manual is included with the plugin.

The plugin allows you to monitor the following:
Cluster_CPU_Usage
Cluster_DRS_Status
Cluster_EVC_Status
Cluster_HA_Status
Cluster_Memory_Usage
Cluster_Resource_Info
Cluster_Swapfile_Status
Cluster_vMotion_Info
Datastore_Cluster_Status
Datastore_Cluster_Usage
Datastore_Performance
Datastore_Performance_Overall
Datastore_Usage
Guest_CPU_Info
Guest_CPU_Usage
Guest_Disk_Performance
Guest_Disk_Usage
Guest_Memory_Info
Guest_Memory_Usage
Guest_NIC_Usage
Guest_Snapshot
Guest_Status
Host_CPU_Info
Host_CPU_Usage
Host_License_Status
Host_Memory_Usage
Host_OS_Name_Version
Host_pNIC_Status
Host_pNIC_Usage
Host_Status
Host_Storage_Adapter_Info
Host_Storage_Adapter_Performance
Host_Switch_Status
Host_vNIC_Status
vCenter_License_Status
vCenter_Name_Version
To Do / Wish List
Here is a list of items that are going to be addressed sometime in the future:
* Guest_Snapshot > Option to exclude some snapshots (primarily for backup products that create/remove snapshots)
* Choose which metrics are checked. For example, Guest_Disk_Performance returns Rate, Latency and Averaged. The goal is to let you choose only the ones you want.
* For certain checks, being able to manipulate the object being targeted. For example, your Nagios host objects have the address serverxx.box293.local but they are named in the vCenter inventory as serverxx. The goal is to allow you to manipulate parts of a value that the plugin receives. This will allow for the use of more generic service definitions in Nagios, which means fewer configurations are required.
* Being able to report which host a guest is running on, and then use event handlers to update a Nagios host's parent. This helps for VMs that are members of a cluster.
* A request was made to output usage checks with percentage values as well, for checks like Cluster_CPU_Usage, Cluster_Memory_Usage, Datastore_Usage, Host_CPU_Usage
* For any host related checks where the host is in standby mode, return the status as OK instead of critical
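
As a rough illustration of the name-manipulation item above (it is a wish-list item, so nothing like this is described as shipping with the plugin), reducing serverxx.box293.local to serverxx could look something like the following Python sketch. The function name strip_domain_suffix and its parameters are hypothetical; the plugin itself is written in Perl and may end up handling this quite differently.

```python
def strip_domain_suffix(nagios_address: str, suffix: str) -> str:
    """Return nagios_address with the given domain suffix removed, if present.

    Hypothetical helper: shows how an address used by Nagios could be
    mapped to the shorter name used in the vCenter inventory.
    """
    # Accept the suffix with or without a leading dot.
    if not suffix.startswith("."):
        suffix = "." + suffix
    # Compare case-insensitively, since DNS names are not case sensitive.
    if nagios_address.lower().endswith(suffix.lower()):
        return nagios_address[: -len(suffix)]
    # Leave the value untouched when the suffix does not match.
    return nagios_address


print(strip_domain_suffix("serverxx.box293.local", "box293.local"))  # serverxx
```

With a transformation like this, one generic service definition could pass the Nagios address straight through and still match the vCenter inventory name.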


I have a mailing list that I will send an email to when I update this plugin. This way you can find out as soon as a new version of this plugin is available.
To Subscribe:
* Send an email to updates+subscribe@box293.com
* You will receive an email with a link you need to follow to create a subscription request
* Click the link to open it in a web browser
* You will need to type your email address and click submit
* You will receive another email with a link you need to follow to complete your subscription
* You will now be subscribed!
* Check your spam folder if the emails are not received


Twitter: @Box293

Version Notes:
2014-04-15
* Official release version

2014-05-07
* Fixed bug where hosts were incorrectly reporting they are in Maintenance Mode (reported by Marvin Holze and Steven Miller)
* Added functions for upcoming Nagios XI Wizard

2014-05-09
* Fixed bug in Cluster_Memory_Usage check where the Memory Used was not being correctly reported (reported by Vitaly Burshteyn). This also affected the Cluster_Resource_Info check.

2014-05-10
* Fixed bug in Cluster_CPU_Usage check where the CPU Used was not being correctly reported (reported by Vitaly Burshteyn). This also affected the Cluster_Resource_Info check.

2014-08-24
* Improved debugging, creates a debugging file when in debug mode
* All checks that output performance data now have the name of the check appended to the end of the performance data surrounded by square brackets. This makes the use of templates in PNP easy
* Fixed bug in Host_pNIC_Status where an incorrect number of pNICs was being calculated when specifying which pNICs to check
* Fixed bug in Host_pNIC_Status where --nic_state was not correctly triggering a CRITICAL state
* Fixed bug in Host_pNIC_Status where the phrase "NOT Connected" was appearing twice on a disconnected pNIC
* Fixed bug with Host_Switch_Status check, only the first switch was being reported and would not find more than one switch if the host had more than one
* Fixed bug with Guest_Disk_Usage where the "Disk Usage" was reported as 0 when the guest had snapshots
* Added a Version argument to report the plugin version
* Added check Guest_Status which reports on Power State, Uptime, VMware Tools Version and Status, IP Address, Hostname, ESX(i) Host Guest Is Running On, Consolidation State and Guest Version

2014-12-13
* Added option AlwaysOK for drs_automation_level so the check will always return an OK state (requested by Willem D’Haese)
* Added option AlwaysOK for drs_dpm_level so the check will always return an OK state (requested by Willem D’Haese)
* Added option AlwaysOK for ha_host_monitoring so the check will always return an OK state
* Added option AlwaysOK for ha_admission_control so the check will always return an OK state
* Added check Datastore_Performance_Overall which will return the Datastore Performance for ALL connected hosts to the datastore (requested by Willem D’Haese)
* Added check Datastore_Cluster_Usage (requested by snapon_admin)
* Added check Datastore_Cluster_Status (requested by snapon_admin)
* Updated the Nagios XI Wizard checks List_Datastores, List_Guest, List_Hosts and List_vCenter_Objects with improved encoding to allow UTF-8 characters (reported by DingGuo Xiao)
* Fixed bug in Datastore_Usage to limit the number of decimal places returned for the Used Space value
* Fixed bug in certain checks like Guest_Snapshot where guests have special characters like a backslash (reported by Dennis Peere)
* Updated Host_Status checks to report Triggered Alarms and trigger warning and critical states if the alarms have not been acknowledged in vCenter (requested by Pierre-François Gallic, Ian Bergeron, Jacob Estrin, Brice Courault)
* Added argument --perfdata_option which allows you to disable the check name being appended to the end of the performance data string in square brackets, as some monitoring systems like Centreon do not like this (reported/requested by Bruno Guerpillon)
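
To make the performance data change from 2014-08-24 and the --perfdata_option argument from 2014-12-13 more concrete, here is a minimal Python sketch of the idea. It is not the plugin's code (the plugin is Perl). The function name format_perfdata, the boolean append_check_name, the sample label cpu_usage and the exact spacing before the bracket are all assumptions; the accepted values of --perfdata_option are documented in the manual, not here.

```python
def format_perfdata(perfdata: str, check_name: str, append_check_name: bool = True) -> str:
    """Optionally append the check name in square brackets to a perfdata string.

    Hypothetical helper: append_check_name stands in for whatever
    --perfdata_option controls; the separator (a single space) is assumed.
    """
    if append_check_name:
        return f"{perfdata} [{check_name}]"
    return perfdata


# Default behaviour: the check name is appended, which PNP templates can key on.
print(format_perfdata("cpu_usage=42%", "Host_CPU_Usage"))
# Suffix disabled, for monitoring systems that reject the bracketed name.
print(format_perfdata("cpu_usage=42%", "Host_CPU_Usage", append_check_name=False))
```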
Reviews (4)
This plugin was simple to install and I had it running checks against two vCenters and several ESXi hosts in no time. What questions I did have, the developer answered almost immediately. If you are looking to use Nagios to give an eye into VMware you have found the right tool.
Obviously a lot of work has gone into the implementation of this service check and the accompanying documentation. A qualified VMware and Nagios systems administrator will have no issue getting things up and running with relative ease.

In my case I am using a single Nagios server and a single VMA appliance (new for this purpose) to monitor two vCenter systems and the underlying clusters, hosts and guests.

I did initially have an issue in which some instances of the service check failed on one of the vCenter servers, but this was traced to a problem with the vCenter embedded database, and the issue "magically" resolved itself after a database rebuild.
by fusfeld, June 27, 2014
Plugin had a couple hiccups to get installed but the author was incredibly helpful and we were able to work past it. Still by far the best documented plugin I've seen, and works as advertised. Very very helpful.
by ssmiller_gfsu, May 6, 2014
1 of 1 people found this review helpful
This plugin seems far more complete than most others. I found a few minor bugs with Cluster_Memory_Usage, Cluster_CPU_Usage, and Host_Status. The diff below fixes them (as of 5/6/2014):

vi-admin@mpvmat:~> diff -u orig/box293_check_vmware.pl box293_check_vmware.pl
--- orig/box293_check_vmware.pl 2014-05-06 10:09:44.000000000 -0400
+++ box293_check_vmware.pl 2014-05-06 14:55:18.000000000 -0400
@@ -714,7 +714,7 @@
my $host_maintenance_mode = $cluster_current_host->get_property('summary.runtime.inMaintenanceMode');

# See if the host is in maintenance mode
- if ($host_maintenance_mode eq 'true') {
+ if ($host_maintenance_mode eq 'false') {
# Get the overall CPU used by the current host
my $cluster_cpu_usage_current_host = $cluster_current_host->get_property('summary.quickStats.overallCpuUsage');
# Convert the $cluster_cpu_usage_current_host to SI
@@ -726,7 +726,7 @@
# Get how many cores this host has
$cpu_cores_available = $cpu_cores_available + $cluster_current_host->get_property('summary.hardware.numCpuCores');

- } # End if ($host_maintenance_mode eq 'true') {
+ } # End if ($host_maintenance_mode eq 'false') {
} # End if ($host_uptime_state_flag == 0) {
} # End if ($host_connection_state_flag == 0) {
} # End foreach (@{$cluster_hosts}) {
@@ -1181,7 +1181,7 @@
my $host_maintenance_mode = $cluster_current_host->get_property('summary.runtime.inMaintenanceMode');

# See if the host is in maintenance mode
- if ($host_maintenance_mode eq 'true') {
+ if ($host_maintenance_mode eq 'false') {
# Get the overall memory used by the current host
my $cluster_memory_usage_current_host = $cluster_current_host->get_property('summary.quickStats.overallMemoryUsage');
# Convert the $cluster_memory_usage_current_host to SI
@@ -1189,7 +1189,7 @@

# Add this to cluster_memory_usage
$cluster_memory_usage = $cluster_memory_usage + $cluster_memory_usage_current_host;
- } # End if ($host_maintenance_mode eq 'true') {
+ } # End if ($host_maintenance_mode eq 'false') {
} # End if ($host_uptime_state_flag == 0) {
} # End if ($host_connection_state_flag == 0) {
} # End foreach (@{$cluster_hosts}) {
@@ -4167,7 +4167,7 @@
} # End if (defined($host_issues)) {

# See if the host is in maintenance mode
- if ($host_maintenance_mode eq 'false') {
+ if ($host_maintenance_mode eq 'true') {
$exit_message = Build_Exit_Message('Exit', $exit_message, 'Host in Maintenance Mode');
$exit_state = Build_Exit_State($exit_state, 'OK');
$host_status_flag = 1;
Owner's reply

Thanks for that ssmiller_gfsu, these have all been fixed in release 2014-05-10.
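
For readers skimming the diff in the review above, the corrected cluster logic amounts to counting usage only from hosts that are not in maintenance mode. The following Python sketch shows that reading. The names Host, in_maintenance_mode, cpu_usage_mhz and cluster_cpu_usage are hypothetical; the plugin itself is Perl and reads these values with get_property, and the sketch ignores the connection and uptime checks that also appear in the diff.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Host:
    """Hypothetical stand-in for a cluster host as seen by a check."""
    name: str
    in_maintenance_mode: bool
    cpu_usage_mhz: int


def cluster_cpu_usage(hosts: Iterable[Host]) -> int:
    """Sum CPU usage over hosts that are NOT in maintenance mode."""
    return sum(h.cpu_usage_mhz for h in hosts if not h.in_maintenance_mode)


hosts = [
    Host("esx1", in_maintenance_mode=False, cpu_usage_mhz=1200),
    Host("esx2", in_maintenance_mode=True, cpu_usage_mhz=800),   # skipped
    Host("esx3", in_maintenance_mode=False, cpu_usage_mhz=300),
]
print(cluster_cpu_usage(hosts))  # 1500
```

The pre-fix behaviour corresponds to the inverted condition, which would have counted only esx2.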