Could not complete SSL handshake, Socket timeout after 10 seconds with Nagios and NSClient++ 0.4x

Adding a couple of new Windows hosts to my monitoring network this morning, my NRPE plugin checks against them were failing.

me@nagios:~$ /usr/lib/nagios/plugins/check_nrpe -H monitoredwindowsserver.vpn
CHECK_NRPE: Error - could not complete SSL handshake.

The usual culprits – password set but not being used, or hosts_allowed not including the host doing the checking – were already set correctly.

Turns out the NSClient++ folks changed up the default configs. In order to get it working again with a relatively vanilla Nagios server on the other end, I needed to set two new directives under [/settings/NRPE/server]:

verify mode = none
insecure = true

With those directives set and a restart to the nsclient service on the Windows end, manual tests to NRPE worked properly:

me@nagios:~$ /usr/lib/nagios/plugins/check_nrpe -H monitoredwindowsserver.vpn
I ( 2015-12-08) seem to be doing fine...

One of these days I should figure out how to get the default modes, which use peer-to-peer certificate checks, working… but for the moment, I’m only allowing traffic over a VPN tunnel anyway, so it was more important to get it working in its existing (secured by VPN) configuration than to blow a few hours untangling the new defaults.

Wasn’t out of the woods yet, though – that got NRPE working, but not NSClientServer, which is what I’m actually using to monitor these Windows hosts for the most part. So I was still seeing “CRITICAL – Socket timeout after 10 seconds” on a lot of tests against the new hosts in Nagios. Doing a netstat -an on the Windows hosts themselves showed that they were listening on 5666 – the NRPE port – but nothing was listening on 12489, the NSClientServer port. This required another fix in the nsclient.ini file. Just underneath [/modules], you’ll need to add (not just uncomment!) this line:

NSClientServer = enabled

And restart the NSClient++ service again, either from Services applet or with net stop nscp ; net start nscp. Now we test the check_nt plugin against the host…

me@nagios:~$ /usr/lib/nagios/plugins/check_nt -H monitoredwindowsserver.vpn -v UPTIME -p 12489
System Uptime - 0 day(s) 1 hour(s) 34 minute(s)

Now, finally, all of my tests are working. Hope this saves somebody else from having the kind of morning I had!

Published by

Jim Salter

Mercenary sysadmin, open source advocate, and frotzer of the jim-jam.

Leave a Reply

Your email address will not be published. Required fields are marked *