I have a number of servers that I regularly scan with ipmitool - that is, I run:
timeout 5 ipmitool -I lanplus -H some.host.name -U mickey -P mouse mc info
against them. However, I have a strange problem, or it seems strange to me: quite often some of them will fail to respond, either because timeout kills them, or because ipmitool itself is rejected by the BMC - but which servers fail changes every time. I have had as many as half fail, only to have them all succeed a few minutes later.
What can possibly explain this?
Under the covers, IPMI over LAN is a simple UDP-based protocol. UDP gives no delivery guarantee, so any message that is dropped, lost, or slow has to be detected with a timeout and retransmitted. From reading the source code, the default timeout in ipmitool appears to be 2 seconds, so just two or three lost or slow packets can push a run past your 5-second timeout.
I don't know how busy your network is, but it's not unusual to get the occasional timeout.
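If the occasional timeout is the cause, one workaround is to retry the scan a few times before declaring a host dead. Below is a minimal sketch of that idea, not a tested fix for your setup. The `retry` function and the `RETRY_DELAY` variable are names made up for this example; the 15-second limit and the 3 attempts are arbitrary values you would need to tune.

```sh
#!/bin/sh
# retry N CMD...: run CMD up to N times and stop at the first success.
# RETRY_DELAY (hypothetical knob, seconds) is the pause between attempts.
retry() {
    attempts=$1
    shift
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Success: no need to try again.
        "$@" && return 0
        # Pause only if another attempt is coming.
        if [ "$i" -lt "$attempts" ]; then
            sleep "${RETRY_DELAY:-1}"
        fi
        i=$((i + 1))
    done
    # Every attempt failed.
    return 1
}

# Example call (commented out; host and credentials are the ones from the question):
# retry 3 timeout 15 ipmitool -I lanplus -H some.host.name -U mickey -P mouse mc info
```

Giving each attempt more than 5 seconds leaves room for ipmitool's own retransmissions to finish. The ipmitool man page also documents `-R` (retry count) and `-N` (per-message timeout) for the lan and lanplus interfaces; check that your version supports them before relying on either.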