Using check_by_ssh in Nagios Yielding Strange Behavior (and Remote Execution Failure)


I'm new to Nagios, and I've been trying to get Nagios to handle a few simple check_by_ssh commands. I'm at the point where I'm successfully able to run the command from the command line like so:

#/usr/local/nagios/libexec/check_by_ssh -H HERP.DERP.COM -C "/home/derrp/bin/...
 check_disk -w 50 -c 10 -A"

which prints:

DISK OK - free space: blah blah blah

So that's good: it works fine from the command line. However, when I put the same command into my commands.cfg file (using the $USER1$ and $HOSTADDRESS$ macros at first, though hard-coding the literal values yields the same result) and check Nagios' web interface to verify, it reports:

Remote command execution failed: ssh_askpass: exec(/usr/bin/ssh-askpass): No such file or directory

I've ensured ssh-askpass is installed. What gives?


There are 3 best solutions below


You have to set up ssh keys so that the 'nagios' user can ssh to the remote host(s) without a passphrase on the private key, or it cannot run checks using check_by_ssh.

Until you can run (as the nagios user) something like this...

ssh [nagios@]remotehost.example.com /path/to/your/plugins/check_procs

... you do not have keys set up correctly.
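The test described above can be wrapped in a small helper. This is only a sketch: the function name try_as_nagios is made up for illustration, and it assumes sudo is available and that you run it as root or as a user allowed to sudo to nagios. BatchMode=yes and ConnectTimeout are standard OpenSSH options; with BatchMode=yes, ssh fails immediately instead of asking for a password or passphrase, which is roughly the situation Nagios is in when it runs the check.

```shell
# Hypothetical helper: run a remote plugin as the nagios user, non-interactively.
# $1 = remote host, $2 = full path to the plugin on the remote host.
try_as_nagios() {
    # BatchMode=yes: never prompt; fail if key-based login does not work.
    # ConnectTimeout=15: give up after 15 seconds instead of hanging.
    sudo -u nagios ssh -o BatchMode=yes -o ConnectTimeout=15 "$1" "$2"
}
```

For example, `try_as_nagios remotehost.example.com /path/to/your/plugins/check_procs` should print the plugin's output without asking for anything. If it fails with a permission or authentication error, the keys are not set up for the nagios user yet.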


For some reason, adding the "-E" flag fixed it. According to the check_by_ssh man page, this is the ignore STDERR flag. Now I get the output from check_raid.

Final command:

$USER1$/check_by_ssh -i ~nagios/.ssh/id_dsa -H $HOSTADDRESS$ -t 60 -l root -o StrictHostKeyChecking=no -o ConnectTimeout=15 -o BatchMode=yes -o ServerAliveCountMax=3 -o ServerAliveInterval=10 -C "/usr/local/libexec/nagios/check_raid" -E


I had a similar problem - running

sudo -u nagios check_by_ssh .... 

helped: I had the wrong permissions on the private key. But sudo was essential here: what works when run as root doesn't necessarily work as nagios.
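A quick way to check for this kind of permissions problem is sketched below. The helper name key_perms_ok is hypothetical, and the rule it applies (no group or other permission bits on the private key) is my reading of why ssh rejects a key as "too open", not something taken from the answer itself. It also does not check who owns the file, which matters too.

```shell
# Hypothetical helper: succeed only if the key file exists and has
# no group/other permission bits set (e.g. mode 600 or 400).
key_perms_ok() {
    [ -f "$1" ] || return 2
    # find prints the path if any group or other permission bit is set.
    loose=$(find "$1" -prune \( -perm -040 -o -perm -020 -o -perm -010 \
                               -o -perm -004 -o -perm -002 -o -perm -001 \))
    [ -z "$loose" ]
}
```

Run as root, something like `key_perms_ok ~nagios/.ssh/id_dsa || chmod 600 ~nagios/.ssh/id_dsa` would tighten the mode if needed (the key path here is borrowed from the other answer and may differ on your system). After that, repeat the check_by_ssh test with sudo -u nagios.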