I have a fairly simple AWS Lambda function that connects to an Amazon Keyspaces (for Apache Cassandra) database. The Python code works, but from time to time I get the error shown below. How do I fix this strange behavior? My assumption is that additional settings are needed when initializing the cluster, for example set_max_connections_per_host. I would appreciate any help.
ERROR:
('Unable to complete the operation against any hosts', {<Host: X.XXX.XX.XXX:XXXX eu-central-1>: ConnectionShutdown('Connection to X.XXX.XX.XXX:XXXX was closed')})
lambda_function.py:
    import sessions

    cassandra_db_session = None
    cassandra_db_username = 'your-username'
    cassandra_db_password = 'your-password'
    cassandra_db_endpoints = ['your-endpoint']
    cassandra_db_port = 9142

    def lambda_handler(event, context):
        global cassandra_db_session
        if not cassandra_db_session:
            cassandra_db_session = sessions.create_cassandra_session(
                cassandra_db_username,
                cassandra_db_password,
                cassandra_db_endpoints,
                cassandra_db_port
            )
        result = cassandra_db_session.execute('select * from "your-keyspace"."your-table";')
        return 'ok'
sessions.py:
    from ssl import SSLContext
    from ssl import CERT_REQUIRED
    from ssl import PROTOCOL_TLSv1_2
    from cassandra.cluster import Cluster
    from cassandra.auth import PlainTextAuthProvider
    from cassandra.policies import DCAwareRoundRobinPolicy

    def create_cassandra_session(db_username, db_password, db_endpoints, db_port):
        ssl_context = SSLContext(PROTOCOL_TLSv1_2)
        ssl_context.load_verify_locations('your-path/AmazonRootCA1.pem')
        ssl_context.verify_mode = CERT_REQUIRED
        auth_provider = PlainTextAuthProvider(username=db_username, password=db_password)
        cluster = Cluster(
            db_endpoints,
            ssl_context=ssl_context,
            auth_provider=auth_provider,
            port=db_port,
            load_balancing_policy=DCAwareRoundRobinPolicy(local_dc='eu-central-1'),
            protocol_version=4,
            connect_timeout=60
        )
        session = cluster.connect()
        return session
This is the biggest issue I see:
The code looks fine, but I don't see a WHERE clause. So if there's a lot of data, a single node (chosen as the coordinator) will have to build the result set while pulling data from all the other nodes. As this results in (un)predictably bad performance, that could explain why it works sometimes but not others.
Pro-tip: All queries in Cassandra should have a WHERE clause.
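To illustrate the point about the WHERE clause, here is a minimal sketch of how the query could be restricted to a single partition. It is not a drop-in fix: the partition key column name `id` and the helper name `fetch_by_partition_key` are assumptions made for the example, since the question does not show the table schema. It relies only on the driver's `session.prepare()` and `session.execute()` calls, and it caches the prepared statement at module level in the same way the question caches the session.

```python
# Assumption: the table has a partition key column called "id".
# Replace it with the real partition key of "your-table".
SELECT_BY_KEY_CQL = 'SELECT * FROM "your-keyspace"."your-table" WHERE id = ?'

# Cached across warm Lambda invocations, like cassandra_db_session in the question.
_select_stmt = None


def fetch_by_partition_key(session, key_value):
    """Read the rows of one partition instead of scanning the whole table."""
    global _select_stmt
    if _select_stmt is None:
        # Prepare once; later calls reuse the prepared statement.
        _select_stmt = session.prepare(SELECT_BY_KEY_CQL)
    # Bind the partition key value to the "?" placeholder.
    return list(session.execute(_select_stmt, [key_value]))
```

Inside `lambda_handler`, the unrestricted `select *` call could then be replaced by something like `rows = fetch_by_partition_key(cassandra_db_session, some_key)`, where `some_key` would come from the event. Whether this removes the intermittent `ConnectionShutdown` error depends on how much data the original query was scanning.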