check_prtdiag is a Perl script designed to parse the prtdiag output of Solaris
systems and to raise errors whenever certain criteria are not met.

It depends on a configuration file, check_prtdiag.conf, whose content defines
what is checked and how it is checked.


I. Configuration file basics

This configuration file is made of "sections" defining systems and their
associated checks. These sections contain "declarations".

Empty lines and comments (from the '#' sign to the end of the line) are
ignored.

Sections are defined by a line beginning with one or more words inside
brackets. Spaces after the opening bracket and before the closing bracket
are ignored :

    [this_is_a_section]
    [ this is another section ]

Declarations are defined by "key = value" pairs. Spaces preceding and
following the '=' character are ignored. The value part of a declaration can
contain a '=' character; the key part cannot :

    this is a key = this is its value
    another_key=another_value
    the Key part = and the Value part with a '=' character inside

Declarations found outside of a section are ignored.
Everything not matching a "section" or a "declaration" is ignored.
Spaces at both the beginning and the end of lines are ignored.


II. The 'commands' section

The special 'commands' section is used by the check_prtdiag script to locate
the prtdiag command on the system :

    [commands]
    platform = /sbin/uname -i
    prtdiag  = /usr/platform/CMD(platform)/sbin/prtdiag -v

This defines a "platform" command and a "prtdiag" command. The CMD(platform)
macro tells check_prtdiag to replace CMD(platform) with the output of the
'/sbin/uname -i' command. The expansion fails if that command does not exit
with a zero status.


III. Other sections

Every other section found in the configuration file is treated as a "system
definition". Example :

    [SunFire 280R]

This defines a "SunFire 280R" system. Note that this name is not really used
by the script; you could just as well have called it "foobar".
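
To make the rules of sections I and II concrete, here is a minimal sketch in
Python. It is not the code of check_prtdiag (which is written in Perl): the
parse_config and expand_macros helpers, the fake_run stand-in and the sample
'uname -i' output are all assumptions made for illustration.

```python
import re

def parse_config(text):
    """Parse configuration text into {section: {key: value}} (illustrative)."""
    sections = {}
    current = None
    for raw in text.splitlines():
        # Comments run from '#' to the end of the line; outer spaces are ignored.
        # (Assumption: a '#' inside a value is treated as a comment too.)
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        section = re.match(r"^\[\s*(.+?)\s*\]$", line)
        if section:
            current = sections.setdefault(section.group(1), {})
            continue
        # Split on the first '=' only, so the value may itself contain '='.
        key, sep, value = line.partition("=")
        if sep and key.strip() and current is not None:
            current[key.strip()] = value.strip()
        # Anything else, including declarations outside a section, is ignored.
    return sections

def expand_macros(command, commands, run):
    """Replace every CMD(name) with the output of the command called name."""
    return re.sub(r"CMD\((\w+)\)", lambda m: run(commands[m.group(1)]), command)

def fake_run(command_line):
    # Stand-in for really executing the command. A real runner would have to
    # fail on a non-zero exit status, e.g. with subprocess.run(..., check=True).
    return {"/sbin/uname -i": "SUNW,Sun-Fire-280R"}[command_line]

sample = """
orphan = ignored            # declaration outside of any section
[ commands ]
platform = /sbin/uname -i
prtdiag  = /usr/platform/CMD(platform)/sbin/prtdiag -v
[SunFire 280R]
the Key part = and the Value part with a '=' character inside
this line is neither a section nor a declaration
"""

config = parse_config(sample)
prtdiag_command = expand_macros(config["commands"]["prtdiag"],
                                config["commands"], fake_run)
print(config)
print(prtdiag_command)
```

The command runner is passed in as a parameter only so that the sketch can be
tried on a machine that has no '/sbin/uname -i'.
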

A valid system definition requires at least two declarations :

    - a "system.match" declaration
    - a "system.checks" declaration


1. The "system.match" declaration

The content of this declaration is a Perl regular expression, which is tested
against prtdiag output. A positive match instructs check_prtdiag to use the
"system.checks" declaration of the section where this declaration was found.
Example :

    system.match = ^System Configuration:.*Sun Fire 280R

If check_prtdiag finds a line matching the "begins with the 'System
Configuration:' string, followed by anything, followed by the 'Sun Fire 280R'
string" pattern, it uses the current section (here the "SunFire 280R"
section) as its base for every other declaration it needs.


2. The "system.checks" declaration

The content of this declaration is a comma separated list of checks to use.
Example :

    system.checks = Leds,Fans,Disks,PSU

This instructs check_prtdiag to use the "Leds", "Fans", "Disks" and "PSU"
checks defined later in the section.


3. The "checks" declarations

In the declaration names below, NAME stands for the name of a check, as
listed in the "system.checks" declaration.

A valid check definition requires at least seven declarations :

    - a "checks.NAME.description" declaration
    - a "checks.NAME.begin_match" declaration
    - a "checks.NAME.end_match" declaration
    - a "checks.NAME.data_match" declaration
    - a "checks.NAME.data_labels" declaration
    - a "checks.NAME.ok_condition" declaration
    - a "checks.NAME.output_string" declaration

Optionally, a check definition may also contain :

    - a "checks.NAME.skip_match" declaration
    - a "checks.NAME.fetch_mode" declaration
    - a "checks.NAME.data_match_regsep" declaration


3.1 The "checks.NAME.description" declaration

This is a string used as the description of the test in verbose mode.
Example :

    checks.Boards.description = IO cards status

This produces a "Checking IO cards status:" message when "check_prtdiag -v"
runs the "Boards" check.
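
Taken together, the "system.match" and "system.checks" declarations described
above amount to a lookup from prtdiag output to a list of check names. The
Python sketch below illustrates the idea only. The select_system helper, the
"first matching section wins" rule, the line-by-line test and the sample
prtdiag header are assumptions, not a description of the actual Perl code.

```python
import re

def select_system(sections, prtdiag_output):
    """Return (section name, list of check names) for the first matching system."""
    lines = prtdiag_output.splitlines()
    for name, declarations in sections.items():
        pattern = declarations.get("system.match")
        checks = declarations.get("system.checks")
        if pattern is None or checks is None:
            continue  # not a valid system definition (e.g. the 'commands' section)
        if any(re.search(pattern, line) for line in lines):
            return name, [check.strip() for check in checks.split(",")]
    return None, []

# Sections as they could look once the configuration file has been parsed.
sections = {
    "commands": {"platform": "/sbin/uname -i"},
    "SunFire 280R": {
        "system.match": r"^System Configuration:.*Sun Fire 280R",
        "system.checks": "Leds,Fans,Disks,PSU",
    },
}

# Made-up prtdiag header, used for illustration only.
output = (
    "System Configuration:  Sun Microsystems  sun4u Sun Fire 280R\n"
    "System clock frequency: 150 MHz\n"
)

name, checks = select_system(sections, output)
print(name, checks)
```
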


3.2 The "checks.NAME.begin_match" declaration

This is a Perl regular expression. As soon as it matches prtdiag output,
check_prtdiag starts trying to collect data for the specified check.
Example :

    checks.Boards.begin_match = ^=+\sIO Cards

This instructs check_prtdiag to wait for a line matching the "starts with one
or more '=' signs, followed by a space character and the 'IO Cards' string"
pattern before trying to collect and analyze data for the 'Boards' check.
The prtdiag output following this pattern is used to collect data, until the
checks.Boards.end_match pattern is matched.

Note that data collection begins immediately : the part of the line matched
by the pattern is removed, and the remaining data is tested against the
data_match pattern. If you don't want this to happen, use a whole line match,
like this :

    checks.Boards.begin_match = ^Disk LED Status*$


3.3 The "checks.NAME.end_match" declaration

This is a Perl regular expression. As soon as it matches prtdiag output,
check_prtdiag stops collecting data for the specified check.
Example :

    checks.PowerSupplies.end_match = ^=

This instructs check_prtdiag to stop data collection for the 'PowerSupplies'
check as soon as prtdiag output matches the "line beginning with a '=' sign"
pattern.


3.4 The "checks.NAME.data_match" declaration

This is a Perl regular expression used by check_prtdiag to extract data from
prtdiag output for the specified check.
Example :

    checks.Memory.data_match = ^\s*(\d+)\s+\S+\s+(.*?)\s+.*?(\S+)$

This instructs check_prtdiag to collect data when prtdiag output is a line
matching the "starts with zero or more spaces, followed by one or more
digits, followed by one or more spaces, followed by one or more non-space
characters, followed by one or more spaces, followed by anything, followed by
one or more spaces, followed by anything, and ending with one or more
non-space characters" pattern.
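
As an illustration, the pattern above can be tried on a sample line. The
sketch uses Python's re module, which treats this particular pattern the same
way Perl does. The sample memory line is made up for the example.

```python
import re

# The data_match pattern of the 'Memory' check, copied from the example above.
data_match = r"^\s*(\d+)\s+\S+\s+(.*?)\s+.*?(\S+)$"

# Hypothetical line of the prtdiag memory table.
line = "   0        0       1901     256     OK"

match = re.match(data_match, line)
values = match.groups()
print(values)  # ('0', '1901', 'OK')
```
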

The data collected, according to Perl regular expression syntax, will be :

    - the digit(s) found at the beginning of the line : "(\d+)"
    - the "anything" part between spaces, after the first run of non-space
      characters : "(.*?)"
    - the non-space characters found at the end of the line : "(\S+)"

ATTENTION : when the optional "checks.NAME.fetch_mode" declaration is set to
"linear", the "checks.NAME.data_match" declaration is a list of regular
expressions. Unless the optional "checks.NAME.data_match_regsep" declaration
is set, the separator used is the comma.


3.5 The "checks.NAME.data_labels" declaration

This is a comma separated list of labels used to name the data collected.
Example :

    checks.Fans.data_labels = Bank,Status

Assuming :

    checks.Fans.data_match = ^(\S+)\s+\[\s*(\S+)\s*\]

The collected values will be named 'Bank' and 'Status' respectively.


3.6 The "checks.NAME.ok_condition" declaration

This is a Perl expression that is eval()'ed to test for an OK condition.
To refer to the values collected, use their labels surrounded by '%'
characters.
Example :

    checks.Leds.ok_condition = not( ( "%Location%" =~ m/FAULT/i ) and ("%Status%" eq "ON") )

The condition is considered OK unless the 'Location' field matches the word
'FAULT' (case insensitive) and the corresponding 'Status' field is the "ON"
string.


3.7 The "checks.NAME.output_string" declaration

This is the string used by check_prtdiag to output the results associated
with the check. To refer to the values collected, use their labels surrounded
by '%' characters.
Example :

    checks.FRU.output_string = FRU '%Location%' status is '%Status%'

When checking the FRU statuses, this outputs results like these :

    FRU 'PS0' status is 'OK'
    FRU 'PS1' status is 'FAULT'


3.8 The "checks.NAME.skip_match" declaration

This is a Perl regular expression. For the specified check, check_prtdiag
does not process data collection on prtdiag output lines that match it.
Example :

    checks.FRU.skip_match = ^Location

This tells check_prtdiag to ignore any prtdiag output line matching the
"begins with the 'Location' string" pattern. It is useful to skip header
lines such as column labels.


3.9 The "checks.NAME.fetch_mode" declaration

This value tells check_prtdiag how to fetch data. For now, the only
recognized value is "linear". Any other value is ignored.

Most of prtdiag output can be treated as a regular table :

               Interlv.  Socket   Size
        Bank    Group     Name    (MB)   Status
        ----    -----    ------   ----   ------
          0       0       1901     256     OK
          0       0       1902     256     OK
          0       0       1903     256     OK
          0       0       1904     256     OK
          1       0       1801     256     OK

But some output cannot :

    System LED Status:    DISK ERROR       POWER
                            [OFF]          [ ON]
                     POWER SUPPLY ERROR   ACTIVITY
                            [OFF]          [ ON]
                        GENERAL ERROR   THERMAL ERROR
                            [OFF]          [OFF]

Setting the "checks.NAME.fetch_mode" declaration to "linear" tells
check_prtdiag that data collection must be done line after line.

In this mode, the "checks.NAME.data_match" declaration is a list of regular
expressions. Unless you specify another one through the
"checks.NAME.data_match_regsep" declaration, the separator is the comma
character (','). Note that you must define as many regular expressions as
data labels.
Example :

    checks.Leds.begin_match = ^System LED Status:\s+
    checks.Leds.fetch_mode  = linear
    checks.Leds.data_match  = ((?:\S+\s)*\S+),\[\s*(.*?)\s*\]
    checks.Leds.data_labels = Location,Status

When fetching data in "linear" mode, the values matching the first pattern of
the data_match declaration are read from the first line of data and
associated with the first label of the data_labels declaration. The values
matching the second pattern are then read from the second line of data and
associated with the second label, and so on.


3.10 The "checks.NAME.data_match_regsep" declaration

This is a Perl regular expression used to specify an alternate separator for
the list of patterns in "checks.NAME.data_match". Defaults to "\s*,\s*".
Example :

    checks.Leds.fetch_mode        = linear
    checks.Leds.data_match        = ((?:\S+\s)*\S+);\[\s*(.*?)\s*\]
    checks.Leds.data_match_regsep = ;


IV. Notes about Perl regular expressions

4.1 Special characters

The following characters have a special meaning in Perl regular
expressions :

    { } ( ) . + * [ ] $ ^ ? | / \

If you want to match one of them literally, you have to escape it with a
backslash ('\') character.
Example :

    checks.Fans.data_match = ^(\S+)\s+\[\s*(\S+)\s*\]

In this case, the second value to collect is surrounded by literal brackets
in prtdiag output, so the brackets are escaped in the pattern.


4.2 Restricting pattern

Perl regular expressions default to "hungry" (greedy) mode, which means that
a given pattern tries to include as much data as it can. This is why you may
want to tell Perl to restrict the match to its minimum, by adding a '?'
character after the quantifier.
Example :

    checks.Disks.data_match = ^(.*?\d+).*?\[\s*(\S+)\s*\]\s*$

In this case the first ".*?" match stops as soon as it encounters a digit.


4.3 Grouping patterns

In certain cases, you will want to use parentheses to group patterns without
having Perl consider the grouped data as another value to collect. You can
use the '?:' operator to tell Perl that you are not interested in this part.
Example :

    checks.Boards.data_match   = ^(No failures found in System|(?:No|Detected) System Faults)
    checks.Boards.ok_condition = "%Diagnosis%" =~ m/^(No) /

With the '?:' operator, %Diagnosis% will contain either :

    "No failures found in System"
    "No System Faults"
    "Detected System Faults"

Without the '?:' operator, %Diagnosis% would have contained either :

    "No failures found in System"
    " System Faults"

In this case, the ok_condition would pass even if prtdiag output matches a
"Detected System Faults" message !

For more information about Perl regular expressions, please have a look at
http://perldoc.perl.org/perlretut.html.
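
The effects described in 4.2 and 4.3 can be observed with any Perl-style
regular expression engine. The sketch below uses Python's re module, which
handles these constructs the same way. The sample lines are made up, and the
sketch only shows which groups the regular expression itself captures. How
check_prtdiag maps an extra captured group onto its data labels is not shown
here.

```python
import re

# 4.2: greedy versus restricted (lazy) matching on a made-up disk line.
disk_line = "c0t1d0 [OK]"
greedy = re.match(r"^(.*\d+)", disk_line).group(1)
lazy = re.match(r"^(.*?\d+)", disk_line).group(1)
print(greedy)  # c0t1d0
print(lazy)    # c0

# 4.3: plain parentheses capture an extra value, '?:' does not.
capturing = r"^(No failures found in System|(No|Detected) System Faults)"
non_capturing = r"^(No failures found in System|(?:No|Detected) System Faults)"
fault_line = "Detected System Faults"

with_extra = re.match(capturing, fault_line).groups()
without_extra = re.match(non_capturing, fault_line).groups()
print(with_extra)     # ('Detected System Faults', 'Detected')
print(without_extra)  # ('Detected System Faults',)
```

With plain parentheses the pattern yields two values for a single data label,
which is the kind of mismatch the '?:' operator avoids.
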