check_hadoop_dfs.pl (Advanced Nagios Plugins Collection)

Compatible With
  • Nagios 1.x
  • Nagios 2.x
  • Nagios 3.x
  • Nagios XI

Various HDFS checks based on dfsadmin -report. This is a unified version of the various HDFS plugins I've written over the years.
It is part of the Advanced Nagios Plugins Collection. Download it here:

https://github.com/harisekhon/nagios-plugins

./check_hadoop_dfs.pl --help

Nagios Hadoop Plugin to check various health aspects of HDFS via the Namenode's dfsadmin -report

- checks % HDFS space used. Based on an earlier plugin I wrote in 2010 that we used in production for over 2 years. This heavily leverages HariSekhonUtils, so the code in this file is very short but still much more tightly validated
- checks HDFS replication of blocks, again based on another plugin I wrote in 2010 around the same time as the one above and ran in production for 2 years. This code unifies/dedupes and improves on both of those plugins
- checks HDFS % Used Balance is within thresholds
- checks number of available datanodes and if there are any dead datanodes
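
As a rough illustration of what these checks work from, here is a minimal Python sketch that extracts the relevant summary fields from dfsadmin -report style text. It is not the plugin's code (the plugin is written in Perl): the sample report, the FIELDS table and parse_report are hypothetical, and the field labels are assumed from typical dfsadmin -report output, which varies between Hadoop versions.

```python
import re

# Illustrative sample only (an assumption): labels follow the usual
# `dfsadmin -report` summary, but exact wording differs between versions.
SAMPLE_REPORT = """\
Configured Capacity: 1000000000 (953.67 MB)
DFS Used: 250000000 (238.42 MB)
DFS Used%: 25.00%
Under replicated blocks: 3
Blocks with corrupt replicas: 0
Missing blocks: 0
"""

# Hypothetical field table: result key -> (regex, converter).
FIELDS = {
    "used_pct": (r"^DFS Used%:\s*([\d.]+)%", float),
    "under_replicated": (r"^Under replicated blocks:\s*(\d+)", int),
    "corrupt": (r"^Blocks with corrupt replicas:\s*(\d+)", int),
    "missing": (r"^Missing blocks:\s*(\d+)", int),
}


def parse_report(text):
    """Return the cluster-wide summary fields found in the report text.

    re.search returns the first match, so if per-datanode sections repeat
    a label further down, only the leading cluster summary is used.
    """
    result = {}
    for name, (pattern, convert) in FIELDS.items():
        match = re.search(pattern, text, re.MULTILINE)
        if match is None:
            # Fail loudly rather than report a healthy cluster by accident.
            raise ValueError(f"field not found in report: {name}")
        result[name] = convert(match.group(1))
    return result


if __name__ == "__main__":
    print(parse_report(SAMPLE_REPORT))
```

Raising on a missing field is a deliberate choice in this sketch: a monitoring check that silently skips an unparsed value would hide exactly the failures it is meant to catch.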

Originally written for old vanilla Apache Hadoop 0.20.x, updated for CDH 4.3 (2.0.0-cdh4.3.0)

I recommend you also investigate check_hadoop_cloudera_manager_metrics.pl (disclaimer: I work for Cloudera, but seriously, it's good; it gives you access to a wealth of information)

usage: check_hadoop_dfs.pl [ options ]

-s --hdfs-space        Checks % HDFS Space used against given warning/critical thresholds
-r --replication       Checks replication state: under replicated blocks, corrupt blocks, missing blocks. Warning/critical thresholds apply to under replicated blocks. Corrupt and missing blocks if any raise critical since this means there is potentially data loss
-b --balance           Checks Balance of HDFS Space used % across datanodes is within thresholds. Lists the nodes out of balance in verbose mode
-n --nodes-available   Checks the number of available datanodes against the given warning/critical thresholds as the lower limits (inclusive). Any dead datanodes raises warning
-w --warning           Warning threshold or ran:ge (inclusive)
-c --critical          Critical threshold or ran:ge (inclusive)
   --hadoop-bin        Path to 'hdfs' or 'hadoop' command if not in $PATH
   --hadoop-user       Checks that this plugin is being run by the hadoop user (defaults to 'hdfs', falls back to trying 'hadoop' unless specified)
-h --help              Print description and usage options
-t --timeout           Timeout in secs (default: 10)
-v --verbose           Verbose mode
-V --version           Print version and exit
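
The threshold behaviour described in the options above can be sketched roughly as follows in Python. This is a simplified reading of the help text, not the plugin's actual implementation. All function names are hypothetical. The range parsing covers only plain "max" and "min:max" forms in the style of the general Nagios plugin guidelines. "Inclusive" is assumed to mean that a value equal to the threshold is still OK.

```python
# Standard Nagios exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def parse_range(spec):
    """Parse 'max' or 'min:max' into an inclusive (low, high) pair.

    Simplified assumption: a bare number means 0..max, and an empty
    side of 'min:max' means unbounded on that side.
    """
    if ":" in spec:
        low, high = spec.split(":", 1)
        return (float(low) if low else float("-inf"),
                float(high) if high else float("inf"))
    return (0.0, float(spec))


def in_range(value, spec):
    low, high = parse_range(spec)
    return low <= value <= high  # inclusive at both ends


def check_range(value, warning, critical):
    """Upper-limit style check, as for --hdfs-space."""
    if not in_range(value, critical):
        return CRITICAL
    if not in_range(value, warning):
        return WARNING
    return OK


def check_replication(under_replicated, corrupt, missing, warning, critical):
    """Thresholds apply to under-replicated blocks only; any corrupt or
    missing block is critical regardless of thresholds."""
    if corrupt > 0 or missing > 0:
        return CRITICAL
    return check_range(under_replicated, warning, critical)


def check_nodes(available, dead, warning, critical):
    """Thresholds are inclusive lower limits; any dead datanode warns."""
    if available < critical:
        return CRITICAL
    if available < warning or dead > 0:
        return WARNING
    return OK


if __name__ == "__main__":
    print(check_range(85.0, "80", "90"))           # space used 85%
    print(check_replication(0, 1, 0, "10", "100"))  # one corrupt block
    print(check_nodes(5, 1, 5, 3))                  # one dead datanode
```

Checking the critical condition before the warning condition ensures the more severe state always wins when both thresholds are breached.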