How to handle timeouts with AWS SSM port forwarding sessions?


I have a Python script that inserts data into an RDS MySQL instance. The instance is not publicly accessible. I've followed this AWS article, "Securely connect to an Amazon RDS or Amazon EC2 database instance remotely with your preferred GUI", as a guide so that I can connect to the instance from my local machine using SSM to proxy the connection. I'm using a script rather than a GUI, but the goal is the same, and the approach in the article works for me.
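
For reference, the session I'm starting looks roughly like this (the region, instance ID, and RDS endpoint are placeholders); the script then connects to 127.0.0.1:3306 as if the database were local:

aws ssm start-session \
    --region us-east-2 \
    --target "<ec2 instance id>" \
    --document-name AWS-StartPortForwardingSessionToRemoteHost \
    --parameters '{"host":["<rds remote endpoint>"], "portNumber":["3306"], "localPortNumber":["3306"]}'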

That article works for my needs except for one significant snag: the SSM connection times out when using AWS-StartPortForwardingSessionToRemoteHost. The script needs hours to run; there are a lot of records to process and insert. It runs for a while, then the SSM connection dies and I have to restart things. This is the SSM message logged when the connection times out:

Session: <the session id> timed out.

I do not believe this is a matter of exceeding throughput limits; the application itself isn't hitting SSM endpoints, so I don't think I'd be hitting limits in that sense. I think the idle time before session termination is a more likely culprit...

Idle time before session termination       Default: 20 minutes
                                           Configurable to between 1 and 60 minutes.

...but I did try increasing the idle timeout to 60 minutes and that doesn't seem to change anything. I hit the timeout issue with AWS-StartPortForwardingSession as well. The only thing in my testing that seems immune to these connectivity timeouts is a basic start-session with no --document-name or additional parameters. That type of connection at least stays alive longer than the others (I haven't seen it drop yet), but it doesn't really solve my problem.
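
For what it's worth, here's roughly how I raised the idle timeout, by updating the Session Manager preferences document (a sketch; if your SSM-SessionManagerRunShell document defines other inputs, keep them in the content):

aws ssm update-document \
    --name SSM-SessionManagerRunShell \
    --document-version '$LATEST' \
    --content '{
        "schemaVersion": "1.0",
        "description": "Document to hold regional settings for Session Manager",
        "sessionType": "Standard_Stream",
        "inputs": {
            "idleSessionTimeout": "60"
        }
    }'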

Although I'm sure there are different approaches I could take here, I'm most curious about why these timeouts are occurring.

Assuming this is an issue with idle time, I don't understand how idle time is being defined. I would think the script I'm running is never idle. When I watch the script's logging, it isn't locking up or failing; it's running as expected. I even tried a hacky bash loop in another tab to cancel out any "idleness":

while true; do mysql -u <the user> -h '127.0.0.1' -p'<the password>' <the db> -e 'show databases;'; sleep 10; done

I feel like I'm missing something here. The article I linked to implies (to me at least) that SSM is well-suited to proxying from a local machine to RDS for arbitrary work, but maybe that's not the case. Is there any way to diagnose and fix these SSM timeouts?

1 Answer

This doesn't answer the question of why the timeouts are happening, but it may provide data for a better answer that goes beyond speculation. The following works as expected and has not hung up or timed out, at least in testing so far.

Start a port forwarding session

This is a plain port forwarding session (AWS-StartPortForwardingSession), not a port forwarding session to a remote host (AWS-StartPortForwardingSessionToRemoteHost).

aws ssm start-session \
    --region us-east-2 \
    --target "<ec2 instance id>' --output text)" \
    --document-name AWS-StartPortForwardingSession \
    --parameters '{"portNumber":["22"], "localPortNumber":["2222"]}'

Send an SSH public key to the EC2 host

aws ec2-instance-connect send-ssh-public-key \
    --region us-east-2 \
    --instance-id "<ec2 instance id>" \
    --availability-zone us-east-2a \
    --instance-os-user ec2-user \
    --ssh-public-key file://~/.ssh/id_ed25519.pub
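
Note that a key pushed with send-ssh-public-key is only usable for about 60 seconds, so start the SSH session promptly after this step.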

Start forwarding port 3306 traffic over SSH

ssh \
    -i ~/.ssh/id_ed25519 \
    -L 3306:<rds remote endpoint>:3306 \
    -p 2222 \
    -o ServerAliveInterval=1 \
    ec2-user@localhost
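
The ServerAliveInterval=1 option makes ssh send a keepalive probe every second over the tunnel, which presumably keeps the session from ever looking idle.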

Connect to the MySQL proxy

mysql -u <user> -h '127.0.0.1' -p'<password>' --port 3306 <the db>

This still does not explain why AWS-StartPortForwardingSessionToRemoteHost drops the connection, and it's the kind of workaround the original question was hoping to avoid.