Why does MySql 8 choose a different index after running for a while?


We're running a MySQL server (8.0.32) which has a table with a few million records. This table acts like a queue with a timestamp column of when the record is due for processing. Every day a few million records are added and deleted, the size roughly stays the same. The table looks like:

create table task (
  id bigint not null auto_increment,
  dueTs bigint not null,
  -- other columns
  primary key (id)
);

This table also has an index on dueTs for fast lookup.

The application collects multiple records (about 100) at a time for parallel processing, any records in progress are excluded in the query like so:

SELECT * FROM task WHERE dueTs < UNIX_TIMESTAMP() AND id NOT IN (ids) LIMIT 100

This has worked fine for years until the database suddenly stopped using the correct index. It started using the primary key.

After running OPTIMIZE TABLE task;, the correct index is used again for a while. After about six hours the query becomes slow again because of the wrong index choice. We also tried ANALYZE TABLE task;, but this didn't have the desired effect.

We can, of course, force the correct key in the query, but that doesn't explain why this keeps happening and why now.
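For reference, forcing the index would look something like this (a sketch, assuming the index is named task_ix_dueTs as reported in the statistics query further down):

```sql
-- Stopgap: pin the optimizer to the dueTs index with an index hint.
SELECT * FROM task FORCE INDEX (task_ix_dueTs)
WHERE dueTs < UNIX_TIMESTAMP()
  AND id NOT IN (ids)   -- placeholder for the exclusion list
LIMIT 100;
```

FORCE INDEX makes a table scan very expensive for the optimizer, so it will use the named index whenever it can satisfy the query at all.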

What changed recently is that the database is purging old deleted data from another table which removed 500M+ records. I would not expect that cleaning another table could affect index selection on this table, but worth noting I think.

The Java application which collects the records uses Hibernate. In this application we use a trick when there are no records to exclude: because Hibernate cannot handle an empty list, we add Long.MIN_VALUE to the list. Does this affect cardinality?
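For illustration, with an empty exclusion list the generated SQL would presumably carry only the sentinel value (Long.MIN_VALUE is -9223372036854775808):

```sql
-- Hypothetical SQL emitted by Hibernate when the exclusion list is empty:
SELECT * FROM task
WHERE dueTs < UNIX_TIMESTAMP()
  AND id NOT IN (-9223372036854775808)  -- Long.MIN_VALUE sentinel
LIMIT 100;
```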

Using the schema statistics, I've collected the cardinality of the primary key and the desired index using:

select INDEX_NAME, COLUMN_NAME, CARDINALITY 
FROM INFORMATION_SCHEMA.STATISTICS 
where TABLE_SCHEMA='myschema' and table_name='task';

This showed no changes in cardinality when the slow query started occurring. The numbers were:

INDEX_NAME,COLUMN_NAME,CARDINALITY
task_ix_dueTs,dueTs,131284
PRIMARY,id,47257372

Is there a way to know how and why MySQL chooses an index (other than explain)? Why would MySQL change its decision after a few hours?
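One answer to the "how and why" question is MySQL's optimizer trace (available since 5.6), which records the cost comparison behind each index choice. A sketch of how to capture it for this query:

```sql
-- Record the optimizer's reasoning for the next statements in this session.
SET optimizer_trace = 'enabled=on';

SELECT * FROM task
WHERE dueTs < UNIX_TIMESTAMP()
  AND id NOT IN (ids)   -- placeholder for the exclusion list
LIMIT 100;

-- The JSON trace lists per-index cost estimates under
-- "rows_estimation" and "considered_execution_plans".
SELECT TRACE FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;

SET optimizer_trace = 'enabled=off';
```

Comparing a trace taken right after OPTIMIZE TABLE with one taken six hours later should show which cost estimate drifts.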

1 Answer


Change to

PRIMARY KEY(dueTs, id),  -- possibly faster lookup
INDEX(id)                -- to keep AUTO_INCREMENT happy

(And drop the current index on just dueTs.)
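Assuming the current schema, the suggested change could be applied in one statement (a sketch; note that dropping and re-adding a primary key rebuilds the table, so schedule it for a quiet period):

```sql
-- Rebuild the table around a composite primary key.
-- Assumes the existing secondary index is named task_ix_dueTs.
ALTER TABLE task
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (dueTs, id),
  ADD INDEX (id),              -- AUTO_INCREMENT columns must remain indexed
  DROP INDEX task_ix_dueTs;
```

With (dueTs, id) as the clustered key, the due-soonest rows are physically adjacent, so the range scan plus LIMIT touches far fewer pages.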

"A few million/day" = "a few dozen per second". Is it sometimes bursty? Does it sometimes get "behind"? How long does it take to process each task? How long for the "about 100"? (I ask these questions because "100" may need tuning.)

Keep the LIMIT 100 on your SELECT; it caps the work per fetch when the processing gets backlogged.