How do I clear an Aurora MySQL 5.6 Table Metadata Lock Wait when there are no locking transactions?

564 Views Asked by At

I'm currently experiencing a very strange issue in our development environment with our Aurora MySQL 5.6 DB running in RDS while trying to rename a table to perform a table swap. My SQL statement is

rename table
    oems.orderInformation to oems.orderInformation_bak,
    oems.orderInformation_tmp to oems.orderInformation;

While playing with this query, it results in inconsistent timing. It has taken anywhere from 9 seconds to over 30 minutes to complete. This will eventually be dealing with production data and the table becomes locked once I start the query, so I need it to take as little time as possible (< 5 seconds).

I have done this in the past with much larger tables on the same RDS instance and had no problems.

When I examine the processlist, I see no indication that any process is locking the table. On one of my trials, I ran the following query twice; once while the rename statement was running (and nearly completed), and once immediately after the rename completed:

SELECT ID, TIME, USER, HOST, DB, COMMAND, STATE, INFO
    FROM INFORMATION_SCHEMA.PROCESSLIST WHERE DB IS NOT NULL
    AND (`INFO` NOT LIKE '%INFORMATION_SCHEMA%' OR INFO IS NULL)
AND TIME > (select TIME from information_schema.PROCESSLIST where id = 1319084)
    ORDER BY `TIME`

I put the results in a spreadsheet and applied some filters to come to the conclusion that all process IDs that were running before the query completed were also running after the query completed. I was hoping to use this information to find the process that was causing the issue, but based on these results it appears that the query didn't become free to run due to another process completing.

It's also worth mentioning that many times when I looked at select * from information_schema.innodb_trx while the query was running there were no other transactions running on the DB instance.

Here is the dump of the show engine innodb status during a trial (about 15 minutes into the query execution). I've scrubbed out all proprietery information and replaced it with dummy text, as well as deleted several hundred transactions that were redundant and getting in the way of the character count. I replaced my (developer) user name with my_user_name. The only other application I can think of that uses this table has the username oem_order_svr:


=====================================
2022-04-28 21:19:17 2b2597f8d700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 10 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 7417656 srv_active, 0 srv_shutdown, 6080109 srv_idle
srv_master_thread log flush and writes: 0
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 435052730
OS WAIT ARRAY INFO: signal count 41957504
Mutex spin waits 504050260, rounds 1755463360, OS waits 47447551
RW-shared spins 411189816, rounds 12364551254, OS waits 384637039
RW-excl spins 383194, rounds 89818653, OS waits 2941602
Spin rounds per wait: 3.48 mutex, 30.07 RW-shared, 234.39 RW-excl
------------------------
LATEST FOREIGN KEY ERROR
------------------------
2022-04-12 11:35:59 2b25bca48700 Transaction:
TRANSACTION 13786221095, ACTIVE 0 sec updating or deleting
mysql tables in use 1, locked 1
7 lock struct(s), heap size 376, 5 row lock(s), undo log entries 1
MySQL thread id 882407, OS thread handle 0x2b25bca48700, query id 95050248 172.31.60.13 my_user_name updating
DELETE FROM `auth`.`roles` WHERE (`id` = '3')
Foreign key constraint fails for table `auth`.`user_role`:
,
  CONSTRAINT `user_role_ibfk_1` FOREIGN KEY (`role_id`) REFERENCES `roles` (`id`)
Trying to delete or update in parent table, in index `PRIMARY` tuple:
DATA TUPLE: 5 fields;
 0: len=4; bufptr=0x2b24e0d25380; hex= 80000003; asc     ;;
 1: len=6; bufptr=0x2b24e0d25384; hex= 000335b90a27; asc   5  ';;
 2: len=7; bufptr=0x2b24e0d2538a; hex= 5a00de000312ae; asc Z      ;;
 3: len=18; bufptr=0x2b24e0d25391; hex= 4c494e4b5f4649524d574152455f42455441; asc LINK_FIRMWARE_BETA;;
 4: len=39; bufptr=0x2b24e0d253a3; hex= 526f6c6520666f7220626574612072656c6561736573206f66206c696e6b206669726d77617265; asc <private_text>;;

But in child table `auth`.`user_role`, in index `role_id`, there is a record:
PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len=4; bufptr=0x2b23c81b42a8; hex= 80000003; asc     ;;
 1: len=30; bufptr=0x2b23c81b42ac; hex= 353265326537373037663432313165363963313861376163633030386139; asc 52e2e7707f4211e69c18a7acc008a9; (total 32 bytes);

------------------------
LATEST DETECTED DEADLOCK
------------------------
2022-04-28 14:11:23 2b24dc67e700
*** (1) TRANSACTION:
TRANSACTION 13793397046, ACTIVE 10 sec inserting
mysql tables in use 2, locked 2
LOCK WAIT 9 lock struct(s), heap size 376, 5 row lock(s), undo log entries 2
MySQL thread id 1312355, OS thread handle 0x2b2644ac9700, query id 105725111 172.31.179.183 application_user_3 Sending data
INSERT INTO unrelated_table_1
            SELECT * FROM unrelated_table_2 <more_query_info>
                
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 36796 page no 94183 n bits 1 index `private_idx_1` of table unrelated_table trx id 13793397046 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len=8; bufptr=0x2b23929e8070; hex= 73757072656d756d; asc supremum;;

*** (2) TRANSACTION:
TRANSACTION 13793397132, ACTIVE 4 sec inserting
mysql tables in use 2, locked 2
LOCK WAIT 9 lock struct(s), heap size 376, 5 row lock(s), undo log entries 2
MySQL thread id 1312360, OS thread handle 0x2b2644ed9700, query id 105725156 172.31.129.86 application_user_3 Sending data
INSERT INTO unrelated_table_1
            SELECT * FROM unrelated_table_2 <more_query_info>
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 36796 page no 94183 n bits 1 index `private_idx_1` of table unrelated_table trx id 13793397132 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len=8; bufptr=0x2b23929e8070; hex= 73757072656d756d; asc supremum;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 36796 page no 94183 n bits 1 index `private_idx_1` of table unrelated_table trx id 13793397132 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len=8; bufptr=0x2b23929e8070; hex= 73757072656d756d; asc supremum;;

*** WE ROLL BACK TRANSACTION (2)
------------
TRANSACTIONS
------------
Trx id counter 13793663781
Purge done for trx's n:o < 13793646510 undo n:o < 0 state: running but idle
History list length 5671
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 1318788, OS thread handle 0x2b2597f8d700, query id 105910463 172.31.60.13 my_user_name System lock
/* ApplicationName=DataGrip 2020.3 */ show engine innodb status
---TRANSACTION 13793659319, not started
MySQL thread id 1320528, OS thread handle 0xffffffffffffffff, query id 105904598 172.31.60.13 my_user_name cleaned up
---TRANSACTION 13793663024, not started
MySQL thread id 1319549, OS thread handle 0xffffffffffffffff, query id 105909526 172.31.131.230 application_user_1 cleaned up
---TRANSACTION 0, not started
MySQL thread id 1317938, OS thread handle 0xffffffffffffffff, query id 105894722 172.31.60.13 my_user_name cleaned up
---TRANSACTION 13793635356, not started
MySQL thread id 1317939, OS thread handle 0xffffffffffffffff, query id 105894723 172.31.60.13 my_user_name cleaned up
---TRANSACTION 13793663018, not started
MySQL thread id 1319126, OS thread handle 0xffffffffffffffff, query id 105909514 172.31.129.86 application_user_2 cleaned up
---TRANSACTION 13793639691, not started
MySQL thread id 1319084, OS thread handle 0x2b25c6144700, query id 105904967 172.31.60.13 my_user_name Waiting for table metadata lock
/* ApplicationName=DataGrip 2020.3 */ rename table
    oems.orderInformation to oems.orderInformation_bak,
    oems.orderInformation_tmp to oems.orderInformation
---TRANSACTION 13793658530, not started
MySQL thread id 1315364, OS thread handle 0xffffffffffffffff, query id 105903462 172.31.24.3 subapisrv cleaned up
---TRANSACTION 13793437058, not started
MySQL thread id 1303954, OS thread handle 0xffffffffffffffff, query id 105843081 172.31.18.131 application_user_2 cleaned up
---TRANSACTION 13793410121, not started
MySQL thread id 1312188, OS thread handle 0xffffffffffffffff, query id 105742790 172.31.24.3 subeng cleaned up
---TRANSACTION 13793568834, not started
MySQL thread id 1311972, OS thread handle 0xffffffffffffffff, query id 105831566 172.31.35.142 elveng cleaned up
---TRANSACTION 13793650303, not started
MySQL thread id 1303937, OS thread handle 0xffffffffffffffff, query id 105891948 172.31.18.131 application_user_2 cleaned up
---TRANSACTION 13793649304, not started
MySQL thread id 1303971, OS thread handle 0xffffffffffffffff, query id 105889106 172.31.18.131 application_user_2 cleaned up
---TRANSACTION 13793460245, not started
MySQL thread id 1303941, OS thread handle 0xffffffffffffffff, query id 105879077 172.31.18.131 application_user_2 cleaned up
---TRANSACTION 13793461640, not started
MySQL thread id 1303942, OS thread handle 0xffffffffffffffff, query id 105889585 172.31.18.131 application_user_2 cleaned up
---TRANSACTION 13793641619, not started
MySQL thread id 1303966, OS thread handle 0xffffffffffffffff, query id 105878574 172.31.18.131 application_user_2 cleaned up
---TRANSACTION 13793641612, not started
MySQL thread id 1303970, OS thread handle 0xffffffffffffffff, query id 105878568 172.31.18.131 application_user_2 cleaned up
---TRANSACTION 13793641547, not started
MySQL thread id 1303969, OS thread handle 0xffffffffffffffff, query id 105878486 172.31.18.131 application_user_2 cleaned up
---TRANSACTION 13793641516, not started
MySQL thread id 1303968, OS thread handle 0xffffffffffffffff, query id 105878437 172.31.18.131 application_user_2 cleaned up
---TRANSACTION 13793641542, not started
MySQL thread id 1303967, OS thread handle 0xffffffffffffffff, query id 105878483 172.31.18.131 application_user_2 cleaned up
---TRANSACTION 13793641507, not started
MySQL thread id 1303958, OS thread handle 0xffffffffffffffff, query id 105878379 172.31.18.131 application_user_2 cleaned up
---TRANSACTION 13793658538, not started
MySQL thread id 1309351, OS thread handle 0xffffffffffffffff, query id 105903466 172.31.24.3 subapisrv cleaned up
---TRANSACTION 13793661965, not started
MySQL thread id 1309352, OS thread handle 0xffffffffffffffff, query id 105908118 172.31.48.91 logsapisrv cleaned up
---TRANSACTION 13793648028, not started
MySQL thread id 1303964, OS thread handle 0xffffffffffffffff, query id 105887378 172.31.18.131 application_user_2 cleaned up
---TRANSACTION 13793641517, not started
MySQL thread id 1303963, OS thread handle 0xffffffffffffffff, query id 105878439 172.31.18.131 application_user_2 cleaned up
---TRANSACTION 13793658555, not started
MySQL thread id 1309353, OS thread handle 0xffffffffffffffff, query id 105903560 172.31.24.3 subapisrv cleaned up
---TRANSACTION 13793641505, not started
MySQL thread id 1303962, OS thread handle 0xffffffffffffffff, query id 105878377 172.31.18.131 application_user_2 cleaned up
---TRANSACTION 13793460243, not started
MySQL thread id 1303961, OS thread handle 0xffffffffffffffff, query id 105879071 172.31.18.131 application_user_2 cleaned up
---TRANSACTION 13793460223, not started
MySQL thread id 1303959, OS thread handle 0xffffffffffffffff, query id 105879065 172.31.18.131 application_user_2 cleaned up
---TRANSACTION 13793641524, not started
MySQL thread id 1303965, OS thread handle 0xffffffffffffffff, query id 105878447 172.31.18.131 application_user_2 cleaned up
---TRANSACTION 13793417106, not started
MySQL thread id 1303960, OS thread handle 0xffffffffffffffff, query id 105845672 172.31.18.131 application_user_2 cleaned up
---TRANSACTION 13793417088, not started
MySQL thread id 1287336, OS thread handle 0xffffffffffffffff, query id 105890542 172.31.59.136 application_user_2 cleaned up
---TRANSACTION 13793433855, not started
MySQL thread id 1284140, OS thread handle 0xffffffffffffffff, query id 105777662 172.31.35.67 application_user_4 cleaned up
---TRANSACTION 13793644843, not started
MySQL thread id 1281304, OS thread handle 0xffffffffffffffff, query id 105909207 172.31.60.13 my_user_name cleaned up
---TRANSACTION 13793636788, not started
MySQL thread id 1281302, OS thread handle 0xffffffffffffffff, query id 105909208 172.31.60.13 my_user_name cleaned up
---TRANSACTION 13793649655, not started
MySQL thread id 1279757, OS thread handle 0xffffffffffffffff, query id 105890034 172.31.23.96 application_user_6 cleaned up
---TRANSACTION 13793650312, not started
MySQL thread id 1274034, OS thread handle 0xffffffffffffffff, query id 105891977 172.31.59.136 application_user_2 cleaned up
---TRANSACTION 13793649667, not started
MySQL thread id 1274033, OS thread handle 0xffffffffffffffff, query id 105890048 172.31.59.136 application_user_2 cleaned up
---TRANSACTION 13793641628, not started
MySQL thread id 1274032, OS thread handle 0xffffffffffffffff, query id 105878583 172.31.59.136 application_user_2 cleaned up
---TRANSACTION 13793649727, not started
MySQL thread id 1239688, OS thread handle 0xffffffffffffffff, query id 105890223 172.31.59.136 application_user_2 cleaned up
---TRANSACTION 13793650735, not started
MySQL thread id 1238620, OS thread handle 0xffffffffffffffff, query id 105892710 172.31.62.245 application_user_6 cleaned up
---TRANSACTION 13793650715, not started
MySQL thread id 1238336, OS thread handle 0xffffffffffffffff, query id 105892644 172.31.62.245 application_user_6 cleaned up
---TRANSACTION 13793649658, not started
MySQL thread id 1219080, OS thread handle 0xffffffffffffffff, query id 105890040 172.31.23.96 application_user_6 cleaned up
---TRANSACTION 13793650685, not started
MySQL thread id 1218609, OS thread handle 0xffffffffffffffff, query id 105892528 172.31.59.136 application_user_2 cleaned up
---TRANSACTION 13793663601, not started
MySQL thread id 820491, OS thread handle 0xffffffffffffffff, query id 105910449 localhost rdsadmin cleaned up
---TRANSACTION 13793662719, not started
MySQL thread id 820503, OS thread handle 0xffffffffffffffff, query id 105909125 localhost rdsadmin cleaned up
---TRANSACTION 13793663699, not started
MySQL thread id 820494, OS thread handle 0xffffffffffffffff, query id 105910437 localhost rdsadmin cleaned up
---TRANSACTION 13793663778, not started
MySQL thread id 820492, OS thread handle 0xffffffffffffffff, query id 105910461 localhost rdsadmin cleaned up
---TRANSACTION 13793652812, ACTIVE 2346 sec
MySQL thread id 1295998, OS thread handle 0xffffffffffffffff, query id 105895734 172.31.22.195 oem_order_svr cleaned up
Trx read view will not see trx with id >= 13793652813, sees < 13793649481
---TRANSACTION 13793649495, ACTIVE 2938 sec
MySQL thread id 1296000, OS thread handle 0xffffffffffffffff, query id 105889772 172.31.22.195 oem_order_svr cleaned up
Trx read view will not see trx with id >= 13793649496, sees < 13793649476
---TRANSACTION 13793649481, ACTIVE 2942 sec
MySQL thread id 1296002, OS thread handle 0xffffffffffffffff, query id 105889754 172.31.22.195 oem_order_svr cleaned up
Trx read view will not see trx with id >= 13793649482, sees < 13793646688
--------
FILE I/O
--------
I/O thread 0 state: waiting for i/o request (insert buffer thread)
I/O thread 1 state: waiting for i/o request (log thread)
I/O thread 2 state: waiting for i/o request (read thread)
I/O thread 3 state: waiting for i/o request (read thread)
I/O thread 4 state: waiting for i/o request (read thread)
I/O thread 5 state: waiting for i/o request (read thread)
I/O thread 6 state: waiting for i/o request (write thread)
I/O thread 7 state: waiting for i/o request (write thread)
I/O thread 8 state: waiting for i/o request (write thread)
I/O thread 9 state: waiting for i/o request (write thread)
Pending normal aio reads: 0 [0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0] ,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
48 OS file reads, 48 OS file writes, 0 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 1998, seg size 2000, 0 merges
merged operations:
 insert 0, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
Adaptive hash table 0: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 1: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 2: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 3: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 4: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 5: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 6: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 7: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 8: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 9: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 10: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 11: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 12: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 13: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 14: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 15: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 16: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 17: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 18: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 19: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 20: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 21: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 22: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 23: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 24: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 25: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 26: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 27: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 28: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 29: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 30: Hash table size 575987, node heap has 0 buffer(s)
Adaptive hash table 31: Hash table size 575987, node heap has 0 buffer(s)
0.00 hash searches/s, 1.20 non-hash searches/s
---
LOG
---
Log sequence number 8204
Log flushed up to   8204
Pages flushed up to 8204
Last checkpoint at  8192
0 pending log writes, 0 pending chkp writes
0 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 0; in additional pool allocated 0
Dictionary memory allocated 15096204
Buffer pool size   510306
Free buffers       2048
Database pages     508258
Old database pages 187578
Modified db pages  0
Pending reads 0
Pending writes: LRU 0, flush list 0 single page 0
Pages made young 47251392, not young 6213768874
0.00 youngs/s, 0.00 non-youngs/s
Pages read 177666037, created 376620, written 0
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 508258, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
----------------------
INDIVIDUAL BUFFER POOL INFO
----------------------
---BUFFER POOL 0
Buffer pool size   255153
Free buffers       1024
Database pages     254129
Old database pages 93789
Modified db pages  0
Pending reads 0
Pending writes: LRU 0, flush list 0 single page 0
Pages made young 23655341, not young 3118539304
0.00 youngs/s, 0.00 non-youngs/s
Pages read 89218229, created 313166, written 0
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 254129, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
---BUFFER POOL 1
Buffer pool size   255153
Free buffers       1024
Database pages     254129
Old database pages 93789
Modified db pages  0
Pending reads 0
Pending writes: LRU 0, flush list 0 single page 0
Pages made young 23596051, not young 3095229570
0.00 youngs/s, 0.00 non-youngs/s
Pages read 88447808, created 63454, written 0
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 254129, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
Main thread process no. 28661, id 47437320816384, state: sleeping
Number of rows inserted 3276762, updated 33540556, deleted 2153606, read 56334872589
0.00 inserts/s, 2.80 updates/s, 0.00 deletes/s, 0.60 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================

At this point I'm at a loss; it doesn't feel this metadata lock is caused by any process, transaction, or anything else I can think of. What else should I diagnose? Where should I go from here?

0

There are 0 best solutions below