Lost network connection from long running SQL statement in MySQL Docker container

771 Views Asked by At

I have a long running SQL statement that times out when I run it against an AWS RDS instance from inside a local Docker container.

I have a .env file that defines some environment variables needed to run the command.

I get into the Docker container as such:

docker run -i -t --env-file .env mysql:8.0.20 /bin/bash

Once inside the container, I run a long running SQL command that consistently times out after about 5 minutes.

mysql \
    mydatabase \
    --host=${MYSQL_RDS_HOST} \
    --port=3306 \
    --user=${MYSQL_USER} \
    --password=${MYSQL_PASSWORD} << EOF
        INSERT INTO copy_of_really_big_table (
            id, columna, columnb, columnc
        )
        SELECT id, columna, columnb, columnc FROM really_big_table;
EOF

The error I receive is ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query.

I can see the query running on the RDS instance using show processlist so I know the query is getting through to the RDS. Once the timeout occurs, the query continues to run through to completion.

I have tried:

  • Supplying --reconnect on the mysql command line
  • Bumping up net_read_timeout in MySQL
  • Bumping up net_write_timeout in MySQL
  • Bumping up connection_timeout in MySQL
  • Bumping up wait_timeout in MySQL
  • Bumping up interactive_timeout in MySQL

Although I suspect that was all for naught since I can run the command from my desktop (outside of the Docker container). It takes about an hour, but runs to completion no problem.

It also runs successfully if I run the SQL statement from inside a local Docker container against another local MySQL Docker container.

I have replicated this issue on both Windows and Mac. The MySQL RDS version is 8.0.20. The Docker container MySQL is also 8.0.20.

I am really hoping I just have something configured incorrectly on my local container. Is there a network timeout setting? Or maybe I don't have something routing correctly? Or perhaps there's an RDS setting I'm not seeing?

Any help would be appreciated, especially hints on how one might actually debug such an issue. Thank you!

1

There are 1 best solutions below

0
On BEST ANSWER

Tweaking net.ipv4.tcp_keepalive_time resolved this issue for me. The default is 7200 seconds, but the vpnKitMaxPortIdleTime in Docker for Mac is 300 seconds by default which allowed ports to time out.

docker run -i -t --env-file .env --sysctl net.ipv4.tcp_keepalive_time=150 mysql:8.0.20 /bin/bash

I used 150 seconds, but anything less than 300 should be sufficient.