I have some clients and a server on Windows, both using Winsock. The server waits for clients to connect over TCP. About 5 minutes after a client connects, while the server is still waiting for other clients, the connection is closed from the server side: recv on the server returns error code 10053 (WSAECONNABORTED), and recv on the client returns 0. I tried setting the SO_KEEPALIVE option on both server and client, but it does not help. The server is hosted on a VM on AWS.
Here is the server code that turns on the socket option on the listening socket:
struct addrinfo hints1;
memset(&hints1, 0, sizeof(hints1));
hints1.ai_family = AF_INET;
hints1.ai_socktype = SOCK_STREAM;
hints1.ai_flags = AI_PASSIVE;
struct addrinfo* bind_addres;
int port2 = port + 3;//8081+3=8084 is the port of the tcp connection
char port2_str[10];
sprintf(port2_str, "%d", port2);
getaddrinfo(0, port2_str, &hints1, &bind_addres);//this server is running on port 8080
SOCKET tcp_socket = socket(bind_addres->ai_family, bind_addres->ai_socktype, bind_addres->ai_protocol);
if (!ISVALIDSOCKET(tcp_socket))
{
cout << "\n socket not created=>" << GETSOCKETERRNO();
}
int enableKeepAlive = 1;
setsockopt(tcp_socket, SOL_SOCKET, SO_KEEPALIVE, (const char*)&enableKeepAlive, sizeof(enableKeepAlive));
cout << "\n binding the socket==>";
if (bind(tcp_socket, (const sockaddr*)bind_addres->ai_addr, (int)bind_addres->ai_addrlen))
{
cout << "\n failed to bind the socket==>" << GETSOCKETERRNO();
}
Here is the server code that turns on the socket option on the connected sockets:
struct sockaddr_storage client_address;
socklen_t client_len = sizeof(client_address);
SOCKET client = accept(tcp_socket, (sockaddr*)&client_address, &client_len);
int enableKeepAlive = 1;
setsockopt(client, SOL_SOCKET, SO_KEEPALIVE, (const char*)&enableKeepAlive, sizeof(enableKeepAlive));
The socket code on the client side is:
struct addrinfo hints;
memset(&hints, 0, sizeof(hints));
hints.ai_family = AF_INET;
hints.ai_socktype = SOCK_STREAM;
struct addrinfo* bind_address;
int port2 = port + 3;
char port2_str[10];
sprintf(port2_str, "%d", port2);
//cout << "\n port of the server2 is==>" << port2_str;
getaddrinfo(ip_address.c_str(),port2_str, &hints, &bind_address);
cout << "\n tcp connection port is=>" << port2_str;
SOCKET tcp_socket = socket(bind_address->ai_family, bind_address->ai_socktype, bind_address->ai_protocol);
if (!ISVALIDSOCKET(tcp_socket))
{
fprintf(stderr, "socket() failed. (%d)\n", GETSOCKETERRNO());
//return 1;
}
int enableKeepAlive = 1;
setsockopt(tcp_socket, SOL_SOCKET, SO_KEEPALIVE, (const char*)&enableKeepAlive, sizeof(enableKeepAlive));
Even after turning on SO_KEEPALIVE, the connection is still broken after almost 5 minutes, and the disconnect happens on the server side.
The Windows firewall may be on, on the server's host machine, and the server is hosted on an AWS VM. How should I keep the connection alive?
There are two things to consider, at least.
First, SO_KEEPALIVE, with default timings, doesn't even kick in until 2 hours have passed with no activity. Yes, that's right -- it's two HOURS, not minutes, not seconds.
Oftentimes, other entities on the path (NAT devices, firewalls, load balancers) will drop an idle connection much sooner than that.
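If you want to stay with TCP keepalive, the timers can be shortened per socket instead of relying on the 2-hour default. Below is a minimal sketch, not a definitive implementation: `enable_fast_keepalive` is a hypothetical helper name, and the timing values you pass in are your own choice. On Windows it uses `WSAIoctl` with `SIO_KEEPALIVE_VALS`; the non-Windows branch (Linux `TCP_KEEPIDLE`/`TCP_KEEPINTVL`) is included only so the sketch also compiles off Windows.

```cpp
#ifdef _WIN32
#include <winsock2.h>
#include <mstcpip.h>      // tcp_keepalive, SIO_KEEPALIVE_VALS
#else
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>  // TCP_KEEPIDLE, TCP_KEEPINTVL (Linux)
typedef int SOCKET;
#endif

// Hypothetical helper: enable keepalive on one socket and shorten its timers.
//   idle_seconds     - idle time before the first keepalive probe is sent
//   interval_seconds - gap between probes while they go unanswered
// Returns true if every option was applied.
bool enable_fast_keepalive(SOCKET s, unsigned idle_seconds, unsigned interval_seconds)
{
#ifdef _WIN32
    tcp_keepalive ka;
    ka.onoff = 1;                                      // turn keepalive on
    ka.keepalivetime = idle_seconds * 1000UL;          // milliseconds
    ka.keepaliveinterval = interval_seconds * 1000UL;  // milliseconds
    DWORD bytes_returned = 0;
    return WSAIoctl(s, SIO_KEEPALIVE_VALS, &ka, sizeof(ka),
                    NULL, 0, &bytes_returned, NULL, NULL) == 0;
#else
    int on = 1;
    int idle = (int)idle_seconds;
    int interval = (int)interval_seconds;
    return setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == 0
        && setsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) == 0
        && setsockopt(s, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(interval)) == 0;
#endif
}
```

With the code from the question, this would be called on each connected socket, for example `enable_fast_keepalive(client, 60, 10)` right after `accept` on the server and `enable_fast_keepalive(tcp_socket, 60, 10)` on the client. The 60 and 10 are illustrative only; the idle time has to be comfortably shorter than whatever is cutting the connection at around 5 minutes. Whether a given device on the path counts keepalive probes as activity is not something you can rely on, so treat this as something to try rather than a guaranteed fix.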
Second, keepalive works on the TCP level. Applications usually have their own timeouts -- they don't want a connection to remain open when nothing is happening.
In general, protocols that have valid use cases for an open but seemingly idle connection often implement a "heartbeat" scheme. They still drop connections where nothing is happening (to catch network failures or crashed clients that still have an open connection), but they support a special "heartbeat"/"ping" packet or command that effectively does nothing. You can send it every few seconds or so to keep the connection from being "idle".
If such a heartbeat is not defined in the protocol, chances are the protocol designers didn't see a need for it.
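Since you control both the client and the server here, you can add such a heartbeat yourself. The bookkeeping for it might look roughly like the sketch below. `HeartbeatTimer` and its method names are hypothetical, and the sketch only tracks timing; actually sending and parsing the ping message is left to your own protocol.

```cpp
#include <chrono>

// Hypothetical helper that tracks when to send a heartbeat and when to
// give up on a silent peer. It does no I/O itself.
class HeartbeatTimer {
public:
    using Clock = std::chrono::steady_clock;

    HeartbeatTimer(Clock::duration send_interval,
                   Clock::duration peer_timeout,
                   Clock::time_point now)
        : send_interval_(send_interval),
          peer_timeout_(peer_timeout),
          last_sent_(now),
          last_received_(now) {}

    // Call after any successful send (application data or ping).
    void on_sent(Clock::time_point now) { last_sent_ = now; }

    // Call after any successful recv (application data or ping).
    void on_received(Clock::time_point now) { last_received_ = now; }

    // True once nothing has been sent for at least send_interval.
    bool should_send_ping(Clock::time_point now) const {
        return now - last_sent_ >= send_interval_;
    }

    // True once nothing has been received for at least peer_timeout.
    bool peer_timed_out(Clock::time_point now) const {
        return now - last_received_ >= peer_timeout_;
    }

private:
    Clock::duration send_interval_;
    Clock::duration peer_timeout_;
    Clock::time_point last_sent_;
    Clock::time_point last_received_;
};
```

One way to use it is to wait on the socket with `select` and a timeout of a few seconds, then on every wake-up check `should_send_ping(Clock::now())` and, if true, `send` a small message that your protocol defines as a no-op and call `on_sent`. The receiving side ignores that message apart from calling `on_received`. A send interval well under the 5 minutes you are seeing (say 30 to 60 seconds, as an illustrative choice) keeps real traffic flowing on the connection, and `peer_timed_out` lets each side close sockets whose peer has gone silent.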