I have a large table (About 10 million rows) that I need to delete records that are "older" than 10 days (according to created_at column). I have a python script that I run to do this. created_at is a varchar(255) and has values like for e.g. 1594267202000
import mysql.connector
import sys
from mysql.connector import Error
table = sys.argv[1]
deleteDays = sys.argv[2]
sql_select_query = """SELECT COUNT(*) FROM {} WHERE created_at / 1000 < UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL %s DAY))""".format(table)
sql_delete_query = """DELETE FROM {} WHERE created_at / 1000 < UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL %s DAY)) LIMIT 100""".format(table)
try:
connection = mysql.connector.connect(host=localhost,
database=myDatabase,
user=admin123,
password=password123)
cursor = connection.cursor()
#initial count of rows before deletion
cursor.execute(sql_select_query, (deleteDays,))
records = cursor.fetchone()[0]
while records >= 1:
# stuck at following below line and time out happens....
cursor.execute(sql_delete_query, (deleteDays,))
connection.commit()
cursor.execute(sql_select_query, (deleteDays,))
records = cursor.fetchone()[0]
#final count of rows after deletion
cursor.execute(sql_select_query, (deleteDays,))
records = cursor.fetchone()[0]
if records == 0:
print("\nRows deleted")
else:
print("\nRows NOT deleted")
except mysql.connector.Error as error:
print("Failed to delete: {}".format(error))
finally:
if (connection.is_connected()):
cursor.close()
connection.close()
print("MySQL connection is closed")
When I run this script and it runs the DELETE QUERY however... it fails due to:
Failed to delete: 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
I know that the innodb_lock_wait_timeout is currently set to 50 seconds and I can increase it to overcome this problem, however i'd rather not touch the timeout and.... I want to basically delete in chunks maybe? Anyone know how I can do it here using my code as example?
One approach here might be to use a delete limit query, to batch your deletes at a certain size. Assuming batches of 100 records:
Note that strictly speaking you should always have an
ORDER BY
clause when usingLIMIT
. What I wrote above might delete any 100 records matching the criteria for deletion.