This document describes the design, development, and improvement of the Nagios monitoring system at Cineca, which is used for the Tier-1 systems participating in the PRACE projects. Starting from the issues raised by the complexity of HPC systems and the related monitoring activities, it explains the targeted solutions and their implementation. The most important aspects of the implementation and the issues specific to HPC are described, with particular attention to exascale clusters.