box293_check_vmware
- Nagios 3.x
- Nagios 4.x
- Nagios XI
File | Description |
---|---|
box293_check_vmware.zip | Plugin |
Manual.zip | Manual |
Here is a list of items that are going to be addressed sometime in the future:
* Look at the viability of checking the internal disks of the guest operating systems
* A host swap usage check
* For Guest_Snapshot check, only show snapshots that exceed the defined thresholds (instead of all snapshots)
* For Guest_Snapshot check, the snapshot generating a critical should be at the beginning of the service output
* Allow the Datastore_Usage check to work for all datastores instead of specifically needing to define them
* Allow Cluster_DRS_Status to return a warning instead of a critical for things like DPM state?
* CPU ready (and CPU load) of ESXi servers
* vCenter top-level alarms
* Performance data output without units. Instead of '23GHz' output '23'.
* For the Host_Status check see if the name of the problem can be included in the status
* Host_pNIC_Status / Host_pNIC_Usage - Option to skip pNICs not added to any vSwitch
* Make it possible to split the Guest_Status in different "subchecks" (Tools, GuestIp, etc.)
* New check(s) "Cluster_Datastore_*": list all datastores in a specific cluster, like the Datastore_* checks today, but without having to specify datastore names manually
Version Notes:
2014-04-15
* Official release version
2014-05-07
* Fixed bug where hosts were incorrectly reporting they are in Maintenance Mode (reported by Marvin Holze and Steven Miller)
* Added functions for upcoming Nagios XI Wizard
2014-05-09
* Fixed bug in Cluster_Memory_Usage check where the Memory Used was not being correctly reported (reported by Vitaly Burshteyn). This also affected the Cluster_Resource_Info check.
2014-05-10
* Fixed bug in Cluster_CPU_Usage check where the CPU Used was not being correctly reported (reported by Vitaly Burshteyn). This also affected the Cluster_Resource_Info check.
2014-08-24
* Improved debugging, creates a debugging file when in debug mode
* All checks that output performance data now have the name of the check appended to the end of the performance data surrounded by square brackets. This makes the use of templates in PNP easy
* Fixed bug in Host_pNIC_Status where an incorrect number of pNICs was being calculated when specifying which pNICs to check
* Fixed bug in Host_pNIC_Status where --nic_state was not correctly triggering a CRITICAL state
* Fixed bug in Host_pNIC_Status where the phrase "NOT Connected" was appearing twice on a disconnected pNIC
* Fixed bug with Host_Switch_Status check, only the first switch was being reported and would not find more than one switch if the host had more than one
* Fixed bug with Guest_Disk_Usage where the "Disk Usage" was reported as 0 when the guest had snapshots
* Added a Version argument to report the plugin version
* Added check Guest_Status which reports on Power State, Uptime, VMware Tools Version and Status, IP Address, Hostname, ESX(i) Host Guest Is Running On, Consolidation State and Guest Version
2014-12-13
* Added option AlwaysOK for drs_automation_level so the check will always return an OK state (requested by Willem D’Haese)
* Added option AlwaysOK for drs_dpm_level so the check will always return an OK state (requested by Willem D’Haese)
* Added option AlwaysOK for ha_host_monitoring so the check will always return an OK state
* Added option AlwaysOK for ha_admission_control so the check will always return an OK state
* Added check Datastore_Performance_Overall which will return the Datastore Performance for ALL connected hosts to the datastore (requested by Willem D’Haese)
* Added check Datastore_Cluster_Usage (requested by snapon_admin)
* Added check Datastore_Cluster_Status (requested by snapon_admin)
* Updated the Nagios XI Wizard checks List_Datastores, List_Guest, List_Hosts and List_vCenter_Objects with improved encoding to allow UTF-8 characters (reported by DingGuo Xiao)
* Fixed bug in Datastore_Usage to limit the amount of decimal places returned for the Used Space value
* Fixed bug in certain checks like Guest_Snapshot where guests have special characters like a backslash (reported by Dennis Peere)
* Updated Host_Status checks to report Triggered Alarms and trigger warning and critical states if the alarms have not been acknowledged in vCenter (requested by Pierre-François Gallic, Ian Bergeron, Jacob Estrin, Brice Courault)
* Added argument --perfdata_option which allows you to disable the check name being appended to the end of the performance data string in square brackets, as some monitoring systems like Centreon do not like this (reported/requested by Bruno Guerpillon)
2015-01-29
* Fixed bug in Guest_CPU_Usage where high CPU usage could result in a negative free value
* Added --modifier argument to allow request and response data to be modified for Host and Guest checks (requested by Willem D’Haese). An example of how this is used: your Nagios host objects have the address serverxx.box293.local but they are named in the vCenter inventory as serverxx. The --modifier argument will allow you to remove the '.box293.local'. This allows for the use of more generic service definitions in Nagios, which means less configuration is required. Detailed examples are provided in the manual
* Added Guest_Host check for determining if the ESX(i) host the guest is running on matches the parent_hosts defined in Nagios (requested by Virgil Hoover and other attendees at the Nagios World Conference 2014). This check will work in conjunction with the upcoming box293_event_handler plugin to run on the Nagios host ... stay tuned!
* Added the --query_url, --query_username, --query_password and --service_status_info arguments to allow the plugin to query Nagios for checks like Guest_Host to determine Nagios parent object directive
* Added more debugging to the Nagios XI Wizard List_xxx checks
* --debug option will now show how long the plugin ran for
* All cluster checks now report the name of the cluster at the beginning of the status output (requested by Willem D’Haese)
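The --modifier entry above describes removing a domain suffix so that a Nagios address matches the vCenter inventory name. The actual --modifier syntax is documented in the manual and is not reproduced here; the following is only a shell sketch of the transformation itself, and `strip_domain` is a hypothetical helper name, not a plugin option.

```shell
# Hypothetical helper (not part of the plugin): remove a trailing domain
# suffix from a host name, mirroring the serverxx.box293.local example.
strip_domain() {
    # $1 = full host name, $2 = suffix to remove.
    # ${1%"$2"} strips the suffix only if the name actually ends with it.
    printf '%s\n' "${1%"$2"}"
}

strip_domain "serverxx.box293.local" ".box293.local"   # prints: serverxx
```

A name that does not end with the suffix is returned unchanged, which matches the idea of one generic service definition serving hosts with and without the domain part.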
2015-03-03
* Added argument --exclude_snapshot to be used with the Guest_Snapshot check. This allows you to exclude snapshots that contain specific text in the NAME of the snapshot (requested by Pierre-François Gallic)
* Changed --perfdata_option to allow you to specify what metrics you want the specific check to use / report on, applies to all checks that return performance data. See manual for full details for each check (requested by Bruno Guerpillon)
* Fixed bug in Cluster_HA_Status that caused check to fail when the Slot Size had been defined using vSphere Web Interface (reported by Daniel Vleeshakker)
* Re-fixed bug in Datastore_Usage to limit the amount of decimal places returned for the Used Space value
* Fixed bug in Datastore_Cluster_Usage to limit the amount of decimal places returned for the Used Space value
* Manual now recommends using the 'nice' command to execute box293_check_vmware. This makes the plugin execute at lower process schedule and makes the vMA more stable
2015-05-21
* Updated all host related checks to return an OK status IF the host is in Standby Mode. Specifically applies to the checks Datastore_Performance, Datastore_Performance_Overall, Host_CPU_Info, Host_CPU_Usage, Host_License_Status, Host_Memory_Usage, Host_OS_Name_Version, Host_pNIC_Status, Host_pNIC_Usage, Host_Status, Host_Storage_Adapter_Info, Host_Storage_Adapter_Performance, Host_Switch_Status, Host_vNIC_Status
* Created the check Host_Up_Down_State to be used as a host object check, helpful for hosts that are in Standby Mode and you don't want to be alerted about this as Standby Mode is normal behaviour. This check also introduced the argument --standby_exit_state which allows you to report a DOWN state if the host is in standby mode
* Standby checks requested by Willem D'Haese and Hans Bos
* Fixed bug in Datastore_Cluster_Status check where it was not returning any output, reported by Luc Lesouef
2015-08-03
* Updated all checks to work with vSphere API versions v 4.0 onwards. Some features get introduced by VMware in different API releases and the plugin was not allowing for these differences. API problem reported by Andrea Setti. Specific checks updated are:
** Cluster_CPU_Usage
** Cluster_Memory_Usage
** Datastore_Cluster_Status (only valid in vSphere 5.0 onwards)
** Datastore_Cluster_Usage (only valid in vSphere 5.0 onwards)
** Guest_CPU_Info (# of cores only reported in vSphere 5.0 onwards, CPU Reservation only reported on directly connected ESXi hosts v 5.0 onwards ... via vCenter works for 4.0 onwards)
** Guest_CPU_Usage
** Guest_Disk_Performance
** Guest_Disk_Usage
** Guest_Memory_Info (Memory Reservation only reported on directly connected ESXi hosts v 5.0 onwards ... via vCenter works for 4.0 onwards)
** Guest_Memory_Usage
** Guest_NIC_Usage (Packets only reported for VMs running on ESXi hosts 5.0 onwards)
** Guest_Status (Uptime only reported for guests running on ESXi hosts 4.1 onwards, consolidation state only reported for VMs running on ESXi hosts 5.0 onwards)
** Host_CPU_Info
** Host_CPU_Usage
** Host_License_Status
** Host_Memory_Usage
** Host_OS_Name_Version
** Host_pNIC_Status
** Host_pNIC_Usage
** Host_Status
** Host_Storage_Adapter_Info
** Host_Storage_Adapter_Performance (will not work on hosts less than 4.1)
** Host_Switch_Status
** Host_Up_Down_State (no uptime or perfdata on hosts less than 4.1)
** Host_vNIC_Status
* Fixed List_Hosts check used by Nagios XI wizard so that it correctly detects if a host has storage adapters or datastores. Reported by maddev
* Fixed some issues with guest consolidation detection
* Updated Guest_Status to alert if guestToolsNotRunning is detected, critical by default. Reported by Olivier Cheron
2016-03-24
* Fixed RAW disk mapping for Guest_Disk_Usage, reported by Wibo Lammerts.
* Fixed bug with Guest_Status check where the IP Address objects were not being accessed correctly. Reported by Sebastian Hutter and Peter Stavanja.
* Fixed bug where Guest Consolidation was not returning the correct exit state. Reported by Olivier Cheron, Richard Temple and Jonathan Young.
* Fixed a bug in the --debug option that was overwriting the debug log when the plugin reached the end.
* Updated --debug option so it would create the debug log file in the directory which the plugin is run from.
* Added percentages as a metric to checks, including warning and critical thresholds. Optional and not included by default. Requested by Jeroen van Schelt. Applies to:
** Cluster_CPU_Usage
** Cluster_Memory_Usage
** Cluster_Resource_Info
** Datastore_Cluster_Usage
** Datastore_Usage
** Guest_CPU_Usage
** Guest_Disk_Usage
** Guest_Memory_Usage
** Host_CPU_Usage
** Host_Memory_Usage
* Added the ability to use a config file for storing plugin preferences, currently --concurrent_checks, --server, --timeout can now be defined in config file ~/.visdkrc as a way of reducing check command complexity. Refer to the manual on how to use the config file.
* Added check Tasks_Events to allow you to search the Tasks and Events and match/nomatch a string. Requested by Andrew Haynes.
* Fixed bug in List_Datastores check which was not correctly detecting if a datastore has offline hosts, causing the Nagios XI wizard to report "No datastores found!". Reported by dlukinski.
* Updated Perfdata_Process function to better detect if the timestamp value exists. Reported by Alexander Golikov.
* Added error checking to correctly report username/password issues instead of 'returned status 1'.
2016-05-10
* Major performance improvements to the script due to switch statements being replaced with if/else statements.
** Expect a twofold decrease in CPU usage of the plugin and significantly reduced execution times.
** Plugin is now compatible with the Centreon Perl Connector (the reason behind the plugin overhaul). NOTE: Centreon Perl Connector is not officially supported by me, end user was responsible for overhauling the plugin to allow it to work with the Centreon Perl Connector.
** Plugin overhaul undertaken by CPF-Informatique.
* Datastore_Performance and Datastore_Performance_Overall checks have been improved, they now work with NFS datastores. Requested by Branislav7, Nicola Bianchi, David Beck, Christoph Leitl. Improvements performed by CPF-Informatique.
* Script now retries when communication with the VMware API fails and properly exits when it does not succeed. Default number of retries is 2. This helps prevent empty script output. Improvements performed by CPF-Informatique.
* Uncommented some code I had commented out for testing and forgot about, guest checks are now correctly optimized.
* Added new check vSphere_Desktop_License to query the Desktop Host Licenses and allow thresholds to be triggered for used or free. Requested by Jason Dunn.
2016-10-02
* Corrected a bug where UP status was returned instead of WARNING/CRITICAL, and corrected some undefined variables. Corrections performed by CPF-Informatique.
* Plugin now checks for pipe symbols before the performance data string and removes them if present, reported by Pavel Novotný when using the Host_Storage_Adapter_Performance check (fix applies to any check with performance data).
* Resolved some issues with Tasks_Events checks that ended up failing with "communication with the VMware API failed after 2 retries", reported by Yann Renard.
* Fixed bug where the plugin was not reporting that it could not find an object (like in the case where the end user incorrectly typed the object), it was instead reporting "communication with the VMware API failed after 2 retries".
* Fixed bug in Cluster_Memory_Usage if there were no hosts in the cluster.
* Replaced typographical quotes with normal quotes in pod help to stop wide character error, reported by Kent Johannessen and Sebastian Schneider.
* Fixed bug in Guest_CPU_Usage where status output was displaying the value for each core as the total value. This only occurred in the status output and NOT the performance data string, hence all existing collected performance data is valid.
* Fixed a bug in the --debug option that was overwriting the debug log when the plugin reached the end for the checks List_Datastore_Clusters, List_Datastores, List_Guests, List_Hosts, List_vCenter_Objects.
* Guest_Snapshot performance improvements when querying a larger amount of guests, code improvements supplied by Aaron Cheeseman.
* Added check Host_Service to check the services running on a Host, the startup policy or if they are running, requested by John Chivian.
* Added check Cluster_Time_Drift to check for a NTP time drift for all the hosts in a cluster, requested by John Chivian.
* Added some extra object testing in the Host_vNIC_Status check to prevent check from stalling and consuming 100% CPU, reported by Willem D’Haese.
For those on vCenter 6.7 that replaced the VMA with MSW using https://github.com/T-M-D/MSW/blob/master/Deploy.md
and ran into error:
UNKNOWN: Server version unavailable at 'https://xxx:443/sdk/vimService.wsdl' at /usr/lib/perl5/5.10.0/VMware/VICommon.pm line 734
Security has been tightened in vCenter and the VMA/MSW needs the vCenter root certificates to accept the SSL certificate from your vCenter server.
(On a side note I used Ubuntu 20 for the MSW, the CentOS I had issues getting all Perl modules to work/install).
On the vCenter server's main web page (enter only the FQDN, without any /), there is a "Download trusted root CA certificates" link; save the download as a ZIP.
Extract, put contents (a 'certs' folder) on your MSW. I chose /home/vi-admin as location.
Then on the MSW server, set 2 environment variables (HTTPS_CA_FILE and HTTPS_CA_DIR). Not sure both are needed but it works for me.
My vi-admin user is set to a /bin/bash shell (using: usermod --shell /bin/bash vi-admin) so my approach is putting the vars in the user's home directory .bashrc file:
$ nano ~/.bashrc
export HTTPS_CA_FILE='/home/vi-admin/certs/lin/96e3bd6f.0'
export HTTPS_CA_DIR='/home/vi-admin/certs/lin'
And a test to see it works:
vi-admin@monitoring-msw1:~$ ~/box293_check_vmware.pl --server YOUR_VCENTER_FQDN_HERE --check vCenter_Name_Version
OK: VMware vCenter Server 6.7.0 build-17137327
Then exit and go back to the main monitoring server:
nagios@monitoring1:~$ /usr/lib/nagios/plugins/check_by_ssh -E 1 -l vi-admin -H YOUR_MSW_IP_HERE -C "~/box293_check_vmware.pl --server YOUR_VCENTER_FQDN_HERE --check vCenter_Name_Version"
OK: VMware vCenter Server 6.7.0 build-17137327
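If the test still fails with "Server version unavailable", it may help to confirm that both variables are really set in the shell that runs the plugin (for example, a check_by_ssh session may not read the same startup files as an interactive login). The following is a minimal sketch, not part of the plugin or the SDK: `check_ca_env` is a hypothetical helper name, and it only checks that the paths exist, not that the certificates are the right ones.

```shell
# Hypothetical helper (not part of the plugin or the vSphere SDK):
# verify that the two CA variables point at something usable.
check_ca_env() {
    # HTTPS_CA_FILE must name a readable certificate file.
    if [ -z "${HTTPS_CA_FILE:-}" ] || [ ! -r "$HTTPS_CA_FILE" ]; then
        echo "HTTPS_CA_FILE is unset or not readable"
        return 1
    fi
    # HTTPS_CA_DIR must name an existing directory.
    if [ -z "${HTTPS_CA_DIR:-}" ] || [ ! -d "$HTTPS_CA_DIR" ]; then
        echo "HTTPS_CA_DIR is unset or not a directory"
        return 1
    fi
    echo "CA settings look usable"
}
```

Calling `check_ca_env` as the vi-admin user, both interactively and through check_by_ssh, should show whether the .bashrc exports are being picked up in each case.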
More info:
https://code.vmware.com/docs/6530/vsphere-sdk-for-perl-installation-guide/doc/GUID-0F962FBE-A919-4997-9152-CA8D21AAC0DE.html
https://kb.vmware.com/s/article/2108416
MSW setup (Ubuntu 20):
cd /tmp
tar -xvzf VMware-vSphere-Perl-SDK-6.7.0-8156551.x86_64.tar.gz
sudo apt-get update
sudo apt-get install lib32z1 lib32ncurses6 build-essential uuid uuid-dev libssl-dev perl-doc libxml-libxml-perl libcrypt-ssleay-perl libsoap-lite-perl libmodule-build-perl
cd vmware-vsphere-cli-distrib/
cpan -i CPAN
cpan -i UUID
cpan -i Date::Format
./vmware-install.pl EULA_AGREED=yes
useradd vi-admin
passwd vi-admin
cd /home/
mkdir vi-admin
chown -R vi-admin:vi-admin vi-admin
usermod --shell /bin/bash vi-admin
chown vi-admin:vi-admin box293_check_vmware.pl
#fixing .ssh/authorized_keys - access & allowed key type:
chmod 700 .ssh
nano /etc/ssh/sshd_config #and insert: PubkeyAcceptedKeyTypes=+ssh-dss
Please contact me via email plugins@box293.com. I don't get emails from reviews here. To answer your question, yes it does work with 6.7. I haven't done much development recently as it seems to work OK. I will release a newer version sometime soon to fix some minor issues.
We recently migrated our vCenter server to a vCenter Server Appliance 6.7U1. Will this plugin work with vCenter Server Appliance 6.7U1?
This is the output from the check:
vi-admin@vma:~> ~/box293_check_vmware.pl --server 192.168.103.210 --check vCenter_Name_Version
Server version unavailable at 'https://192.168.103.210:443/sdk/vimService.wsdl' at /usr/lib/perl5/5.10.0/VMware/VICommon.pm line 726.
vi-admin@vma:~>
vSphere 6.7 requires the updated SDK; however, it's not easy to install this on the old vMA appliance. VMware have also deprecated the vMA appliance and hence you will not be able to download an updated vMA with the new SDK.
I've created some instructions here on how you can deploy a replacement vMA on CentOS 7:
https://github.com/T-M-D/MSW/blob/master/Deploy.md
Can anyone help me work out how to collect information about 'Used Reservation' of Memory / CPU?
This is what I found:
my $epoch_oldest = $Host_Epoch_Hash{$Host_Epoch_Array[0]}{'epoch'};
my $epoch_newest = $Host_Epoch_Hash{$Host_Epoch_Array[-1]}{'epoch'};
This is what I changed:
my $epoch_oldest = $Host_Epoch_Hash{$Host_Epoch_Array_Sorted[0]}{'epoch'};
my $epoch_newest = $Host_Epoch_Hash{$Host_Epoch_Array_Sorted[-1]}{'epoch'};
Checks work fine now!
Using with ESXi v6.0
Only Guest_Status,Guest_Snapshot,Guest_Disk_Usage working.
All other Guest_* command output is UNKNOWN: Guest is powered OFF or is not accesible, cannot collect data!
Did you resolve this issue? Please email plugins@box293.com.
One note: in the manual the SSH key generation is documented using DSA, but DSA is deprecated and RSA should be used instead.
- On Page 9 / 83 in the manual:
The line "Type ssh-keygen -t dsa and press Enter" -> this should be "Type ssh-keygen -t rsa and press Enter"
Hope this helps others, it took me some time to figure this out :-)
Thanks for the good job :)
Nagios Core allows up to 32 arguments.
https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/4/en/macrolist.html#arg
$ARGn$ The nth argument passed to the command (notification, event handler, service check, etc.). Nagios supports up to 32 argument macros ($ARG1$ through $ARG32$).
If you are using CCM in Nagios XI or something like NagiosQL, the GUI limits you to 8 arguments, but I think you can add the extra ones as custom object variables.
Updated to the newer version today without any problems.
@Martijn Dirkx, after unzipping the zip file, please rename the extracted file to .zip and unzip it again. The Perl script is actually inside another zip :)
Thanks DanielV_, not sure why this is happening.
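For anyone hitting the same "cannot execute binary file" message, one quick way to tell whether the extracted file is still an archive is to look at its first two bytes, since ZIP files start with the signature "PK". This is only a sketch: `is_zip` is a hypothetical helper name, and the path is the one used in the question below.

```shell
# Hypothetical helper: succeeds if the file starts with the ZIP signature "PK".
is_zip() {
    [ "$(head -c 2 "$1" 2>/dev/null)" = "PK" ]
}

# Path taken from the report below; adjust to wherever the file was extracted.
plugin="$HOME/box293_check_vmware"
if is_zip "$plugin"; then
    echo "$plugin is still a ZIP archive; rename it to .zip and unzip it again"
fi
```

A Perl script would instead begin with a `#!` line, so the check prints nothing once the real plugin file is in place.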
I am currently running the SDK on Nagios Core 4.1.1 (latest version). We have ESXi 6.0 running with a working vMA. Everything in the manual has the same output on my vMA and Nagios server except the part where I can test the plugin from the vMA. I run this command:
~/box293_check_vmware --server xxx.xxx.xxx.xxx --check vCenter_Name_Version
And then I get the following error message:
-bash: /home/vi-admin/box293_check_vmware: cannot execute binary file
Have you seen this error before and know how to solve it? Thanks!
Kind regards,
Martijn Dirkx
Please email me on the address above if you need further help.
Troy is a great developer and support from him is awesome.
First of all Great plugin. Works like a charm.
But it doesn't return the alerts to Operations Center. I wasn't able to find the service alerts on the Operations Center screen.
Why?
Can you confirm whether it will be integrated with Ops Center?
Thanks and Regards
Srini
Please send an email to the email address above and I can look into this.
I highly recommend it if you need to monitor your VMware infrastructure.
Had some issues with the check once. Found out it was a locked user account. Great support from Troy!
All in all: keep up the good work! :)
Troy is a great developer who tries to take into consideration any idea that could improve his work.
Thanks !