I am having an issue with TCP keepalive under Linux, tested with Ubuntu (Linux ubuntu 5.15.0) on a desktop as well as Debian Buster and Yocto Kirkstone on embedded systems.
While the connection between client and server is up, the keepalive mechanism seems to work fine. However, when I disconnect client and server, the system does not send the configured number of keepalive probes, but fewer.
If I reconnect client and server within the timeout, communication is re-established and the keepalive probes resume.
I could not find anything on the net regarding this specific issue. Has anyone else encountered this behaviour before?
In the Linux docs (https://man7.org/linux/man-pages/man7/tcp.7.html), it states:
TCP_KEEPCNT (since Linux 2.4) The maximum number of keepalive probes TCP should send before dropping the connection. This option should not be used in code intended to be portable.
Does this actually mean that I cannot set the exact number of probes, only the maximum, and that the system decides for itself how many probes to send?
I use the socket option SO_KEEPALIVE to enable keepalive, and TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT to set the parameters:
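    #include <cstdlib>        // exit, EXIT_FAILURE
    #include <list>
    #include <netinet/in.h>
    #include <netinet/tcp.h>  // SOL_TCP, TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT
    #include <sys/socket.h>   // setsockopt, SOL_SOCKET, SO_KEEPALIVE
    #include <unistd.h>       // close

    // idle and intvl are in seconds; cnt is the number of probes.
    void set_keep_alive(int socketfd, int idle, int intvl, int cnt)
    {
        // Timeout in ms; only used if the TCP_USER_TIMEOUT entry below is enabled.
        const int userTimeOut = static_cast<int>(1000 * (idle + intvl * (cnt - 0.5)));
        (void)userTimeOut;

        struct sockopt
        {
            int level;
            int name;
            int val;
        };
        const std::list<sockopt> sockOpts =
        {
            {SOL_SOCKET, SO_KEEPALIVE, 1},
            // {SOL_TCP, TCP_USER_TIMEOUT, userTimeOut},
            {SOL_TCP, TCP_KEEPIDLE, idle},
            {SOL_TCP, TCP_KEEPINTVL, intvl},
            {SOL_TCP, TCP_KEEPCNT, cnt}
        };
        for (const auto& opt : sockOpts)
        {
            // Any option that cannot be set is treated as fatal.
            if (setsockopt(socketfd, opt.level, opt.name, &opt.val, sizeof(opt.val)) < 0)
            {
                close(socketfd);
                exit(EXIT_FAILURE);
            }
        }
    }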
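To rule out the options silently not taking effect, the values can be read back from the kernel with getsockopt. The following is only a minimal sketch, not part of my original code: `KeepAliveSettings`, `read_int_option`, and `read_keep_alive` are hypothetical names introduced for illustration, and it uses IPPROTO_TCP as the option level, which has the same value as SOL_TCP on Linux.

```cpp
#include <cassert>
#include <netinet/in.h>   // IPPROTO_TCP
#include <netinet/tcp.h>  // TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT
#include <sys/socket.h>   // getsockopt, SOL_SOCKET, SO_KEEPALIVE

// Hypothetical container for the keepalive settings the kernel reports.
struct KeepAliveSettings
{
    int enabled;  // non-zero if SO_KEEPALIVE is on
    int idle;     // TCP_KEEPIDLE in seconds
    int intvl;    // TCP_KEEPINTVL in seconds
    int cnt;      // TCP_KEEPCNT
};

// Hypothetical helper: reads one int-valued option, returns -1 on failure.
static int read_int_option(int socketfd, int level, int name)
{
    int value = 0;
    socklen_t len = sizeof(value);
    if (getsockopt(socketfd, level, name, &value, &len) < 0)
        return -1;
    return value;
}

// Reads back the four keepalive-related options for one socket.
KeepAliveSettings read_keep_alive(int socketfd)
{
    assert(socketfd >= 0);
    return KeepAliveSettings{
        read_int_option(socketfd, SOL_SOCKET, SO_KEEPALIVE),
        read_int_option(socketfd, IPPROTO_TCP, TCP_KEEPIDLE),
        read_int_option(socketfd, IPPROTO_TCP, TCP_KEEPINTVL),
        read_int_option(socketfd, IPPROTO_TCP, TCP_KEEPCNT)
    };
}
```

Calling `read_keep_alive(socketfd)` right after `set_keep_alive(socketfd, 5, 3, 10)` should report idle=5, intvl=3, and cnt=10 if the options were accepted.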
To monitor the packets, I am using Wireshark.
Both TCP_KEEPIDLE and TCP_KEEPINTVL are honored, i.e.:
- When the connection is idle, both server and client send a probe after the TCP_KEEPIDLE time
- When client and server are disconnected (and no ACKs are received), the probes are spaced by the TCP_KEEPINTVL time
However, depending on the keepalive parameters, the system sometimes stops sending probes before the TCP_KEEPCNT value is reached. The connection itself remains open until the timeout is reached.
For example:
- TCP_KEEPIDLE=5, TCP_KEEPINTVL=3, TCP_KEEPCNT=10 -> Only 7 probes are sent after last ACK
- TCP_KEEPIDLE=5, TCP_KEEPINTVL=2, TCP_KEEPCNT=10 -> Only 5 probes are sent after last ACK
In both cases, I would expect 10 probes to be sent.
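For reference, the schedule I would expect can be written down as a small calculation. This is a sketch under the assumption that the first probe goes out TCP_KEEPIDLE seconds after the last activity, that every following probe is spaced by exactly TCP_KEEPINTVL seconds, and that none of them is answered; `nominal_probe_times` and `nominal_drop_time` are hypothetical helper names, not kernel or library functions.

```cpp
#include <cassert>
#include <vector>

// Nominal send times (seconds after the last activity) of the cnt probes,
// assuming strict TCP_KEEPINTVL spacing and no replies.
std::vector<int> nominal_probe_times(int idle, int intvl, int cnt)
{
    assert(idle > 0 && intvl > 0 && cnt > 0);
    std::vector<int> times;
    times.reserve(cnt);
    for (int k = 0; k < cnt; ++k)
        times.push_back(idle + k * intvl);  // k-th probe, counted from 0
    return times;
}

// Nominal time (seconds after the last activity) at which the connection
// would be dropped once all cnt probes have gone unanswered.
int nominal_drop_time(int idle, int intvl, int cnt)
{
    return idle + cnt * intvl;
}
```

Under these assumptions, TCP_KEEPIDLE=5, TCP_KEEPINTVL=3, TCP_KEEPCNT=10 gives probes at 5, 8, ..., 32 s and a drop at 35 s, while TCP_KEEPINTVL=2 gives probes at 5, 7, ..., 23 s and a drop at 25 s. The captures show fewer probes than this nominal schedule.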
The same happens when I set TCP_USER_TIMEOUT: after server and client are disconnected, the system stops sending probes prematurely, but the connection itself is kept open until the timeout is reached.