check_ganglia

Files:
File: check_ganglia.pl
Description: check_ganglia.pl 0.9 release
check_ganglia.pl is a Nagios plugin that checks entries in the XML published by Ganglia (gmetad). It supports arbitrary XML data, standard Ganglia metrics, gmetric-introduced data points, etc.
estair@monitor02 libexec$ ./check_ganglia.pl --help
Unknown option: help
UNKNOWN: HOST not defined.

-H hostname/IP: of host to connect to gmetad/gmond on
-P Port: to connect to and retrieve XML
-O Output method: ('cluster' to dump all info | 'hostcheck' to grab for)
-T Targethost: when 'hostcheck', the host to pull data for
-M Metric: the 'gmetric' defined value to return exclusively
-w warn: int value above which the check will exit in a WARN state
-c crit: int value above which the check will exit in a CRITICAL state
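
The -T and -M options together pick one metric for one host out of the XML dump. The lookup can be sketched as follows. This is a minimal illustration in Python, not the plugin's Perl code. It assumes the usual Ganglia XML layout (HOST elements carrying METRIC children with NAME and VAL attributes); the function name find_metric and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, shaped like the XML that gmetad/gmond publish.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmetad">
 <GRID NAME="demo">
  <CLUSTER NAME="web">
   <HOST NAME="web01.example.com" IP="10.0.0.1" REPORTED="1363900000">
    <METRIC NAME="load_one" VAL="0.42" TYPE="float" UNITS=""/>
    <METRIC NAME="disk_free" VAL="120.5" TYPE="double" UNITS="GB"/>
   </HOST>
  </CLUSTER>
 </GRID>
</GANGLIA_XML>"""


def find_metric(xml_text, target_host, metric):
    """Return the VAL string of `metric` on `target_host`, or None if absent."""
    root = ET.fromstring(xml_text)
    for host in root.iter("HOST"):
        if host.get("NAME") != target_host:
            continue
        for entry in host.iter("METRIC"):
            if entry.get("NAME") == metric:
                return entry.get("VAL")
    return None


value = find_metric(SAMPLE_XML, "web01.example.com", "load_one")
print(value)
```

A None result corresponds to the plugin's "not found in host XML" case, which it reports as UNKNOWN.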




Use the plugin as a check-host command to determine whether a host has checked in to the Ganglia cluster recently:

define command{
    command_name check-cluster-host-alive
    command_line $USER1$/check_ganglia.pl -H $HOSTADDRESS$ -P 8600 -O hostcheck -T localhost -M host_state
}



.........


# Pull data directly from the host via an XML call, using a specified query string (more than 100x faster)
define command{
    command_name check_ganglia_host_query
    command_line $USER1$/check_ganglia.pl -host=localhost -port=8652 -output=hostcheck -cluster=$ARG1$ -targethost=$HOSTALIAS$ -M $ARG2$
}
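
The speed-up in the command above comes from asking gmetad for a single host instead of parsing the whole cluster dump. The sketch below shows the idea in Python, not in the plugin's Perl. It assumes gmetad's interactive port (8652 in the command above) accepts a path-style query such as /cluster/host terminated by a newline and replies with XML for that subtree only. The function names build_query and fetch_xml are hypothetical.

```python
import socket


def build_query(cluster, host=None):
    """Build a path-style query: '/cluster' or '/cluster/host', newline-terminated."""
    path = "/" + cluster
    if host:
        path += "/" + host
    return path + "\n"


def fetch_xml(server, port, query, timeout=10.0):
    """Send the query to the (assumed) interactive port and read until EOF."""
    chunks = []
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(query.encode("ascii"))
        while True:
            data = sock.recv(65536)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Example call (requires a reachable gmetad, so it is left commented out):
# xml_text = fetch_xml("localhost", 8652, build_query("web", "web01.example.com"))
print(repr(build_query("web", "web01.example.com")))
```

Because the reply then contains only one host, the metric lookup has far less XML to walk.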




define command{
    command_name check_ganglia
    command_line $USER1$/check_rrd_eli.pl "/var/ganglia/rrds/$HOSTGROUPNAME$/$HOSTNAME$.lucasfilm.com/$ARG1$" sum $ARG2$ $ARG3$
}

Reviews (1)
Two things are needed. First, run "cpan DateTime::Format::Epoch::Unix" to install that module. If anyone knows of an RPM for RHEL/CentOS, please leave a comment.
Second, and the biggest problem: the plugin does not return the correct exit status when a threshold is reached, so I fixed it. I'm not a Perl programmer, so please excuse me if the fix looks somewhat crude.
Copy and paste the following code and apply it as a patch:
--- /tmp/check_ganglia.pl 2013-03-21 12:43:48.000000000 +0100
+++ check_ganglia.pl 2013-03-22 08:38:48.574700147 +0100
@@ -14,7 +14,6 @@
# TODO: call $cluster{host} hash directly instead of seeking within it.
# TODO: Fix some clusters that don't match host checks...
# TODO: add retval matching (range, string, etc)
-# TODO: fix warn/crit to measure returned metric
# TODO: !!! NEXT !!! call $cluster{host} hash directly instead of seeking within it.
# TODO: !!! NEXT !!! better, pass in cluster:host context for direct passing of XML
# TODO: !!! NEXT !!! use syntax localhost:8652 TCP
@@ -22,6 +21,7 @@
# TODO: !!! NEXT !!! Or, since that requires knowing the CLUSTER, make it option and check the hostname as key
# TODO: !!! NEXT !!! for the hash of each cluster found. Still reduces cpu/time drastically
#
+# 2013-03-22: Tom Kerremans: fixed warn/crit to measure returned metric, removed some obsolete notifications
###########

# core modules needed:
@@ -72,6 +72,7 @@
exit $ERRORS{'CRITICAL'};
}

+
sub isnumeric()
{
my ($x) = @_;
@@ -128,14 +129,14 @@
} #/ if hostcheck

if (defined($warn)) {
- print "WARN defined\n";
+ #print "WARN defined\n";
#if ( ! isnumeric($warn) ) { die "NOT NUMERIC \n"; }
#die "NOT NUMERIC \n" if ( ! isnumeric($warn) ) ;
die "## $warn is NOT NUMERIC \n" if $_ =~ s/[a-z]//;
}

if (defined($crit)) {
- print "CRIT defined\n";
+ #print "CRIT defined\n";
}

} #/ sub processargs
@@ -301,9 +302,9 @@
###: ELI: WTF did I put this in here for??
### DELETEME

-### FUNC: output_match
-#sub output_match {
-#my $output = shift;
+## FUNC: output_match
+sub output_match {
+my $output = shift;

# perform string regex match on retval:
# if ( "$output" =~ /.*$match.*/ ) {
@@ -314,17 +315,20 @@
# exit 2;
# }

-## perform range check for warn/crit values:
-# if ( "$output" >= "$crit" ) {
-# exit 1;
-# } elsif ( "$output" >= "$warn" ) {
-# exit 2;
-# } else {
-# exit 0;
-# }
+# perform range check for warn/crit values:
+ if ( "$output" >= "$crit" ) {
+ print "CRITICAL: $metric = $output higher than threshold of $crit\n";
+ exit 2;
+ } elsif ( "$output" >= "$warn" ) {
+ print "WARNING: $metric = $output higher than threshold of $warn\n";
+ exit 1;
+ } else {
+ print "OK: $metric = $output\n";
+ exit 0;
+ }

-#} #/sub
-### /FUNC: output_match
+} #/sub
+## /FUNC: output_match

#^^^^ ##: ELI: WTF did I put this in here for??
#^^^^ ## DELETEME
@@ -386,14 +390,14 @@
print "UNKNOWN: ($metric) not found in host XML! ","\n";
exit $ERRORS{'UNKNOWN'}
} else {
- print "OK: $metric = $host_metrics{$metric} \n";
- exit $ERRORS{'OK'};
+ &output_match ($host_metrics{$metric});
}
} # /unless ($metric)

} else {# /if ($hostname eq)

} # /if hostname loop through hash. We've exhausted input data, exit now:
+
} # /foreach $hostkey

# don't exit here, create exit at end of all arrays to be searched (after function exits searching the last hash)
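
The patched output_match tests the critical threshold first, then the warning threshold, and otherwise reports OK. A minimal Python sketch of the same ordering follows. It is not the reviewer's Perl. It uses the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The handling of unset thresholds and non-numeric values is an assumption added for illustration, and the function name evaluate is hypothetical.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios plugin exit codes


def evaluate(metric, value, warn=None, crit=None):
    """Return (exit_code, message), checking crit before warn as in the patch."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        # Assumption: a non-numeric value is reported as UNKNOWN.
        return UNKNOWN, f"UNKNOWN: {metric} = {value!r} is not numeric"
    # The patch compares with >=, so a value equal to the threshold also trips it.
    if crit is not None and number >= crit:
        return CRITICAL, f"CRITICAL: {metric} = {value} (threshold {crit})"
    if warn is not None and number >= warn:
        return WARNING, f"WARNING: {metric} = {value} (threshold {warn})"
    return OK, f"OK: {metric} = {value}"


code, message = evaluate("load_one", "5.0", warn=2, crit=4)
print(code, message)
```

A real plugin would print the message and then pass the code to sys.exit so that Nagios sees the state.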