I'm surprised by how long Cassandra connections take (compared to Redis) when made from the Python client (cassandra-driver) to a single-node Cassandra cluster running on the same host. How can the connection time be improved?
More info
Even though the Cassandra server runs on the same host and I connect to it directly (through a k8s/OCP service port), the connection times out unless I set a relatively long, triple-digit connect_timeout in milliseconds (long relative to our standards, which are built on three years of experience with Redis in several clusters).
With Redis, the connection takes just a fraction of a millisecond, and a total (connection plus read) timeout of 10 ms has always been sufficient under these conditions, while for Cassandra the timeout has to be set more than an order of magnitude larger!
2022-09-18 11:08:16.899097 - connecting to Redis database...
Redis<ConnectionPool<Connection<host=<redacted>.svc.cluster.local,port=<redacted>,db=0>>>
2022-09-18 11:08:16.899423 - connected to Redis database in 0.32 ms
versus:
2022-09-18 12:29:13.229688 - connecting to Cassandra database...
<cassandra.cluster.Session object at 0x7fd0065e80d0>
2022-09-18 12:29:13.409084 - connected to Cassandra database in 190.856 ms
What's even more surprising, the read timeout for Cassandra can be much shorter than the connection timeout: only 16 ms for SELECT queries (with a cold start, i.e. the first query for a given key) vs. 128+ ms for the connection alone (every time; apparently there is no connection caching on the client side).
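For reference, timings like the ones in the logs above can be collected with a small wrapper around whatever call opens the connection. This is only a sketch: `timed_connect` and `fake_connect` are hypothetical names introduced here, and what actually gets measured depends on the callable you pass in (for example `lambda: cas_cluster.connect()`).

```python
import time
from datetime import datetime


def timed_connect(label, connect):
    """Call connect() and report how long it took, in milliseconds."""
    print(f"{datetime.now()} - connecting to {label} database...")
    start = time.perf_counter()
    handle = connect()  # e.g. lambda: cas_cluster.connect()
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(handle)
    print(f"{datetime.now()} - connected to {label} database in {elapsed_ms:.3f} ms")
    return handle, elapsed_ms


if __name__ == "__main__":
    # Stand-in for a real connect call: sleeps for roughly 50 ms.
    def fake_connect():
        time.sleep(0.05)
        return "fake-session"

    timed_connect("fake", fake_connect)
```

Using `time.perf_counter()` rather than subtracting wall-clock timestamps keeps the measurement monotonic, which matters at sub-millisecond resolution.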
A nearly reproducible example (just fill in your cluster and DB user details):
from cassandra.auth import PlainTextAuthProvider
from cassandra.cluster import Cluster

# connection timeout, in milliseconds
conn_timeout_ms = 64  # times out
# conn_timeout_ms = 128  # OK (best time)
cas_auth_provider = PlainTextAuthProvider(username=cas_user,
                                          password=cas_pass)
cas_cluster = Cluster(contact_points=[cas_uri],
                      port=cas_port,
                      auth_provider=cas_auth_provider,
                      connect_timeout=conn_timeout_ms / 1000,  # ms -> sec
                      )
cas_session = cas_cluster.connect()
I've also tried setting the low-level TCP_NODELAY socket option using the sockopts argument exposed by Cluster (to disable Nagle's algorithm, so that data is sent as soon as it's available), but it did not help:
import socket
sockopts = [(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)]
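To make it clear how that option was combined with the timeout, here is a minimal sketch of the full set of Cluster arguments. The helper names `build_cluster_kwargs` and `connect` are hypothetical and only for illustration; `contact_points`, `port`, `auth_provider`, `connect_timeout` and `sockopts` are the Cluster parameters already used above.

```python
import socket


def build_cluster_kwargs(host, port, timeout_ms, nodelay=True):
    """Collect the Cluster keyword arguments used in this question."""
    kwargs = {
        "contact_points": [host],
        "port": port,
        "connect_timeout": timeout_ms / 1000,  # ms -> sec
    }
    if nodelay:
        # Ask the driver to disable Nagle's algorithm on its sockets.
        kwargs["sockopts"] = [(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)]
    return kwargs


def connect(host, port, user, password, timeout_ms):
    """Open a session using the arguments built above."""
    # Imported lazily so build_cluster_kwargs() works without the driver.
    from cassandra.auth import PlainTextAuthProvider
    from cassandra.cluster import Cluster

    kwargs = build_cluster_kwargs(host, port, timeout_ms)
    kwargs["auth_provider"] = PlainTextAuthProvider(username=user,
                                                    password=password)
    return Cluster(**kwargs).connect()
```

With the placeholders from the example above, the call would be `connect(cas_uri, cas_port, cas_user, cas_pass, timeout_ms=128)`.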