I am using grpc-okhttp in an Android application for RPC calls to a backend.
This is the client-side keep-alive configuration:
.keepAliveTime(2, TimeUnit.SECONDS)
.keepAliveTimeout(5, TimeUnit.SECONDS)
.keepAliveWithoutCalls(true)
I have observed cases where the connection dies at some point and is never restored until the app is restarted.
I have not found a way to reproduce it consistently yet, but in production I'm seeing a lot of "UNAVAILABLE: Keepalive failed. The connection is likely gone" errors.
From my understanding, gRPC should reconnect automatically, but it looks like once the keep-alive has failed, it never even attempts to reconnect: subsequent requests fail immediately, without waiting for a deadline or the keep-alive timeout.
This is likely caused by delayed or failed detection of network state changes on Android devices. gRPC provides AndroidChannelBuilder, which is meant to address this problem specifically. It uses Android's ConnectivityManager to receive network state updates, so the channel can respond more quickly to network changes.
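As a rough sketch of what switching to AndroidChannelBuilder could look like, the snippet below builds the channel with the same keep-alive settings as in the question. It assumes the io.grpc:grpc-android artifact is on the classpath alongside grpc-okhttp and that the app declares the ACCESS_NETWORK_STATE permission. The class name BackendChannelFactory and the host and port parameters are hypothetical placeholders, not part of the original setup.

```java
import android.content.Context;
import io.grpc.ManagedChannel;
import io.grpc.android.AndroidChannelBuilder;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class; the name and parameters are illustrative only.
public final class BackendChannelFactory {

    private BackendChannelFactory() {}

    public static ManagedChannel create(Context context, String host, int port) {
        return AndroidChannelBuilder.forAddress(host, port)
                // Passing a Context lets the channel observe network state
                // changes through ConnectivityManager. The application
                // context is used to avoid holding on to an Activity.
                .context(context.getApplicationContext())
                // Keep-alive values copied unchanged from the question.
                .keepAliveTime(2, TimeUnit.SECONDS)
                .keepAliveTimeout(5, TimeUnit.SECONDS)
                .keepAliveWithoutCalls(true)
                .build();
    }
}
```

The channel would then be created once, for example with BackendChannelFactory.create(appContext, host, port), and reused for all stubs. Whether this removes the "Keepalive failed" errors in production is something to verify on real devices.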