I have an RDS MySQL instance running in private subnets, with a bastion host in a public subnet. I also have Retool, whose MySQL connection goes through the bastion using an SSH key (provided by Retool) and a set of Retool IP addresses. In Retool, the MySQL connection test succeeds and the application works for the first many queries of a session. I can read from and write to the database from Retool with no problems, and I see the corresponding activity on the database.
Seemingly at random, at some point in a session, Retool stops being able to reach the RDS instance. I haven't been able to pinpoint the root cause. Retool SQL queries simply start taking ages to run and appear completely locked up. When I run SHOW PROCESSLIST on the MySQL instance at that point, there are often no queries hitting the database at all.
My instinct tells me that something is not right in the network configuration of the AWS build, though I don't entirely trust Retool.
Further info on the build: the AWS infrastructure is generated through Terraform. I use the latest AWS VPC module, with no NAT gateway. The bastion server runs in an autoscaling group on a t3.nano instance. I can't see any high CPU or network usage in the monitoring. I've tested SSH from my local machine and it works, and I've also tested with a larger instance size. nslookup and dig from the bastion both resolve the RDS instance correctly, though I'm a bit light on knowledge in this area, so I'm not sure what else I need to check. I have used this recipe before as well, so I'm not sure this is the right place to be looking.
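For reference, the DNS check I did with nslookup and dig could also be scripted so it can be repeated while Retool is stuck. This is only a minimal sketch, assuming Python 3 is available on the bastion; `resolve_ipv4` and `classify_addresses` are hypothetical helper names of my own, and port 3306 is assumed to be the MySQL port on the instance:

```python
import ipaddress
import socket


def resolve_ipv4(hostname: str, port: int = 3306) -> list:
    """Return the distinct IPv4 addresses that the hostname resolves to."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr for IPv4 is (ip, port).
    return sorted({info[4][0] for info in infos})


def classify_addresses(addresses) -> dict:
    """Map each IP string to True if it falls in a private address range."""
    return {addr: ipaddress.ip_address(addr).is_private for addr in addresses}
```

Calling `classify_addresses(resolve_ipv4(endpoint))` on the bastion, where `endpoint` is the RDS host name, should report only private addresses if the lookup is going where I expect.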
I have had no problems connecting to the RDS instance through an AWS/OpenVPN VPN. When Retool is playing up, I can still connect locally without problems. The RDS instance is pretty much vanilla, and I haven't changed its configuration, so the transaction isolation level is the default.
DNS for the RDS instance also resolves through a private host name.
The bastion is whitelisted in the security group of the RDS instance.
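One way to confirm that this rule is still doing its job while Retool is hanging is a timed TCP connect from the bastion to the RDS endpoint. A minimal sketch, assuming Python 3 on the bastion; `probe_tcp` is a hypothetical helper of my own, not anything Retool or AWS provides:

```python
import socket
import time


def probe_tcp(host: str, port: int, timeout: float = 5.0) -> tuple:
    """Attempt one TCP connect; return (succeeded, seconds_elapsed)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            ok = True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        ok = False
    return ok, time.monotonic() - start
```

Running `probe_tcp(endpoint, 3306)` in a loop, with `endpoint` set to the RDS host name and 3306 assumed as the port, would show whether the bastion itself loses the path to the database at the moment Retool locks up, or whether the connect keeps succeeding quickly.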
I don't have access to Retool support for this, so I am limited in how I can debug from that side of things.
Any ideas on what I could test to resolve this? I've been going around in circles on it for weeks.