Getting AllNodesFailedException after exposing K8ssandra with service type LoadBalancer

I am using K8ssandra to deploy Cassandra in an EKS cluster. I want to use Cassandra from outside the Kubernetes cluster. To do that, I exposed the ports with a Service of type LoadBalancer, using the following manifest:

apiVersion: v1
kind: Service
metadata:
  name: k8ssandra-lb
  namespace: databases
spec:
  type: LoadBalancer
  ports:
    - name: native
      port: 9042
      protocol: TCP
      targetPort: 9042
    - name: tls-native
      port: 9142
      protocol: TCP
      targetPort: 9142
    - name: mgmt-api
      port: 8080
      protocol: TCP
      targetPort: 8080
    - name: prometheus
      port: 9103
      protocol: TCP
      targetPort: 9103
    - name: thrift
      port: 9160
      protocol: TCP
      targetPort: 9160
  selector:
    cassandra.datastax.com/cluster: k8ssandra
    cassandra.datastax.com/datacenter: us-east-1

I am able to access the management API, but when I try to connect to port 9042 with the DataStax Cassandra driver I get an AllNodesFailedException and a protocol initialization request timeout exception. The full stack trace is given below.

com.datastax.oss.driver.api.core.AllNodesFailedException: Could not reach any contact point, make sure you've provided valid addresses (showing first 1 nodes, use getAllErrors() for more): 
Node(endPoint=***.us-east-1.elb.amazonaws.com/****:9042, hostId=null, hashCode=e9fc992): 
[com.datastax.oss.driver.api.core.connection.ConnectionInitException: [s0|control|id: 0xdf8ab203, L:/172.28.3.154:61842 - R:****.us-east-1.elb.amazonaws.com/****:9042]
 Protocol initialization request, step 2 (STARTUP {CQL_VERSION=3.0.0, DRIVER_NAME=DataStax Java driver for Apache Cassandra(R), DRIVER_VERSION=4.13.0, CLIENT_ID=86cb6c92-894c-48b9-a99d-439433cf0bfe}): 
 unexpected failure (com.datastax.oss.driver.api.core.connection.ClosedConnectionException: Unexpected error on channel)]
    at com.datastax.oss.driver.api.core.AllNodesFailedException.copy(AllNodesFailedException.java:141)
    at com.datastax.oss.driver.internal.core.util.concurrent.CompletableFutures.getUninterruptibly(CompletableFutures.java:149)
    at com.datastax.oss.driver.api.core.session.SessionBuilder.build(SessionBuilder.java:835)
    at ****.CassandraUtils.buildCqlSession(CassandraUtils.java:44)
    at ****.CassandraStore.createMainSession(CassandraStore.java:33)
    at ****.CassandraStore.<init>(CassandraStore.java:23)
    at ****.Main.main(Main.java:22)
    Suppressed: com.datastax.oss.driver.api.core.connection.ConnectionInitException: 
    [s0|control|id: 0xdf8ab203, L:/172.28.3.154:61842 - R:****.us-east-1.elb.amazonaws.com/****:9042] 
    Protocol initialization request, step 2 (STARTUP {CQL_VERSION=3.0.0, DRIVER_NAME=DataStax Java driver for Apache Cassandra(R), DRIVER_VERSION=4.13.0, CLIENT_ID=86cb6c92-894c-48b9-a99d-439433cf0bfe}): 
    unexpected failure (com.datastax.oss.driver.api.core.connection.ClosedConnectionException: Unexpected error on channel)
        at com.datastax.oss.driver.internal.core.channel.ProtocolInitHandler$InitRequest.fail(ProtocolInitHandler.java:356)
        at com.datastax.oss.driver.internal.core.channel.ChannelHandlerRequest.onFailure(ChannelHandlerRequest.java:104)
        at com.datastax.oss.driver.internal.core.channel.InFlightHandler.fail(InFlightHandler.java:381)
        at com.datastax.oss.driver.internal.core.channel.InFlightHandler.abortAllInFlight(InFlightHandler.java:371)
        at com.datastax.oss.driver.internal.core.channel.InFlightHandler.abortAllInFlight(InFlightHandler.java:353)
        at com.datastax.oss.driver.internal.core.channel.InFlightHandler.exceptionCaught(InFlightHandler.java:299)
        at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302)
        at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
        at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:273)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.exceptionCaught(DefaultChannelPipeline.java:1377)
        at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302)
        at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
        at io.netty.channel.DefaultChannelPipeline.fireExceptionCaught(DefaultChannelPipeline.java:907)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:125)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:177)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:829)
    Caused by: com.datastax.oss.driver.api.core.connection.ClosedConnectionException: Unexpected error on channel
    Caused by: java.io.IOException: Operation timed out
        at java.base/sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at java.base/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at java.base/sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:276)
        at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:233)
        at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:223)
        at java.base/sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:356)
        at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:253)
        at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1132)
        at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:350)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:151)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:829)

Thank you so much for your help.

I tried increasing the timeout to 60 seconds but still get the error when building the CqlSession. Since I am able to connect to the management port, I think the exposed ports should be working fine. Also, telnet is able to connect to the host on port 9042:

telnet ****.us-east-1.elb.amazonaws.com 9042
Trying ****.236.60...
Connected to ****.us-east-1.elb.amazonaws.com.

There is 1 best solution below

As the error states, the driver was not able to reach any of the contact points, so this is a connectivity issue between the client and the nodes in the cluster.

I noticed that you've exposed both CQL ports, 9042 and 9142 (the latter for encrypted connections). If you've enabled client-to-node encryption on the cluster, then you won't be able to connect to any nodes on port 9042. You will need to configure the driver to use port 9142. Cheers!
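
The telnet check in the question only shows that something accepts a TCP connection on port 9042; it does not show that a CQL server answers in plaintext behind it. As a rough way to tell the two apart, the sketch below sends a single native protocol OPTIONS frame and looks at the reply header. It uses only the Java standard library. The class name CqlPortProbe and the 5 second timeout are made up for illustration, and it assumes the server accepts native protocol v4; it is a diagnostic sketch, not part of the DataStax driver.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic helper; not part of the DataStax driver.
final class CqlPortProbe {

    // Native protocol v4 request header (9 bytes): version 0x04, flags,
    // stream id (2 bytes), opcode 0x05 (OPTIONS), body length (4 bytes, zero).
    static byte[] optionsFrame() {
        return new byte[] {0x04, 0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00, 0x00};
    }

    // Interpret the version and opcode bytes of a reply header.
    static String classify(int version, int opcode) {
        boolean isResponse = (version & 0x80) != 0; // responses set the high bit
        if (isResponse && opcode == 0x06) {
            return "SUPPORTED: port speaks plaintext CQL";
        }
        if (isResponse && opcode == 0x00) {
            return "ERROR frame: CQL server reachable, request rejected";
        }
        return "unexpected reply: port may expect TLS or is not CQL";
    }

    // Connect, send OPTIONS, and read the 9-byte reply header.
    // Throws IOException on timeout or if the peer closes the connection early.
    static String probe(String host, int port, int timeoutMillis) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            socket.setSoTimeout(timeoutMillis); // bound the wait for a reply
            OutputStream out = socket.getOutputStream();
            out.write(optionsFrame());
            out.flush();
            byte[] header = new byte[9];
            new DataInputStream(socket.getInputStream()).readFully(header);
            return classify(header[0] & 0xFF, header[4] & 0xFF);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java CqlPortProbe.java <host> <port>");
            return;
        }
        System.out.println(probe(args[0], Integer.parseInt(args[1]), 5000));
    }
}
```

Running it against the load balancer hostname with port 9042 and getting a timeout or an early close, rather than a SUPPORTED reply, would be consistent with the port expecting encrypted connections (or with the load balancer not reaching a healthy node), in which case pointing the driver at 9142 with SSL configured is the next thing to try.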