check_megaraid_sas Nagios plugin explanation


Can anyone explain the following piece of code in https://github.com/simondeziel/custom-nagios-plugins/blob/master/plugins/check_megaraid_sas (lines 220-223)? Why is this code there?

        } elsif ( $slotnumber != 255 ) {
            $pdbad++;
            $status = 'CRITICAL';
        }

There is 1 best solution below


It makes sense to look at the complete section:

PDISKS: while (<PDLIST>) {
        if ( m/Slot Number\s*:\s*(\d+)/ ) {
            $slotnumber = $1;
            $pdcount++;
        } elsif ( m/(\w+) Error Count\s*:\s*(\d+)/ ) {
            if ( $1 eq 'Media') {
                $mediaerrors += $2;
            } else {
                $othererrors += $2;
            }
        } elsif ( m/Predictive Failure Count\s*:\s*(\d+)/ ) {
            $prederrors += $1;
        } elsif ( m/Firmware state\s*:\s*(\w+)/ ) {
            $fwstate = $1;
            if ( $fwstate eq 'Hotspare' ) {
                $hotsparecount++;
            } elsif ( $fwstate eq 'Online' ) {
                # Do nothing
            } elsif ( $fwstate eq 'Unconfigured' ) {
                # A drive not in anything, or a non drive device
                $pdcount--;
            } elsif ( $slotnumber != 255 ) {
                $pdbad++;
                $status = 'CRITICAL';
            }
        }
} #PDISKS

That section loops over a list of PDs (physical drives, as opposed to logical drives), and I assume that this file or program output contains a human-readable status for every attached device. The code looks at every line and acts depending on the content of that line:

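$slotnumber is assigned whenever a line of the form Slot Number : ... appears in PDLIST, so it always holds the slot of the drive currently being parsed. Following the logic: if a Firmware state line reports anything other than Hotspare, Online or Unconfigured, and $slotnumber is not 255, then something went horribly wrong with that drive. The number of bad PDs ($pdbad) is increased by one and the status is set to CRITICAL. In other words, the lines you ask about are the catch-all branch for every unexpected firmware state, with slot 255 explicitly exempted. The snippet itself does not say why 255 is exempt; my guess is that this value marks an entry that is not a real drive slot.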
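To see the branches in action, here is a minimal, self-contained sketch that runs the same loop body over a hard-coded sample instead of the PDLIST file handle. The sample text is made up for illustration (real tool output will contain more fields and possibly different wording), and the state names Failed and Unknown are assumptions used only to hit the catch-all branch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input, loosely modelled on a physical-drive listing.
# This is NOT captured from a real controller; it is an assumption.
my $sample = <<'END';
Slot Number: 0
Media Error Count: 0
Other Error Count: 2
Predictive Failure Count: 0
Firmware state: Online
Slot Number: 1
Media Error Count: 3
Other Error Count: 0
Predictive Failure Count: 1
Firmware state: Failed
Slot Number: 2
Firmware state: Hotspare
Slot Number: 255
Firmware state: Unknown
END

my ( $pdcount, $pdbad, $hotsparecount ) = ( 0, 0, 0 );
my ( $mediaerrors, $othererrors, $prederrors ) = ( 0, 0, 0 );
my $status = 'OK';
my ( $slotnumber, $fwstate );

# Same branching as the plugin section quoted above, fed from a string.
for ( split /\n/, $sample ) {
    if ( m/Slot Number\s*:\s*(\d+)/ ) {
        $slotnumber = $1;    # remember which slot the next lines belong to
        $pdcount++;
    } elsif ( m/(\w+) Error Count\s*:\s*(\d+)/ ) {
        if ( $1 eq 'Media' ) {
            $mediaerrors += $2;
        } else {
            $othererrors += $2;
        }
    } elsif ( m/Predictive Failure Count\s*:\s*(\d+)/ ) {
        $prederrors += $1;
    } elsif ( m/Firmware state\s*:\s*(\w+)/ ) {
        $fwstate = $1;
        if ( $fwstate eq 'Hotspare' ) {
            $hotsparecount++;
        } elsif ( $fwstate eq 'Online' ) {
            # Healthy drive, nothing to do
        } elsif ( $fwstate eq 'Unconfigured' ) {
            $pdcount--;      # not part of any array, do not count it
        } elsif ( $slotnumber != 255 ) {
            # Catch-all: any other state on a slot other than 255 is a failure
            $pdbad++;
            $status = 'CRITICAL';
        }
    }
}

print "$status: $pdbad bad of $pdcount drives, $hotsparecount hot spare(s), ",
    "media=$mediaerrors other=$othererrors predictive=$prederrors\n";
```

With this sample, only the drive in slot 1 is counted as bad. The entry in slot 255 also has an unexpected state, but the `$slotnumber != 255` test skips it, which is exactly the effect of the lines in question.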