I created a test table with 100k rows in it. The execution time for this query:
SELECT id FROM table
is 0.032s.
If I add a GROUP BY clause on an integer column that has a normal BTREE index, the execution time for the query:
SELECT id FROM table GROUP BY indexedColumn
is 0.073s.
EXPLAIN output:

id | select_type | table   | type | possible_keys      | key  | key_len | ref  | rows   | Extra
1  | SIMPLE      | [table] | ALL  | [indexedColumnKey] | null | null    | null | 105416 | Using temporary; Using filesort
The execution time roughly doubled due to the GROUP BY, but I'm assuming that's normal? The question I have is: why does adding LIMIT to the query, like this:
SELECT id FROM table GROUP BY indexedColumn LIMIT 500
increase the execution time to 0.301s? That's a slowdown of over 4x.
EXPLAIN output:

id | select_type | table   | type  | possible_keys      | key                | key_len | ref  | rows | Extra
1  | SIMPLE      | [table] | index | [indexedColumnKey] | [indexedColumnKey] | 5       | null | 500  | null
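For reference, here is a minimal sketch of the kind of table described (my actual schema is simplified here, so the names and types below are placeholders; the key_len of 5 in the EXPLAIN is consistent with a nullable INT):

-- Hypothetical schema: an integer column with a normal (non-unique) BTREE index.
-- Names and types are guesses; only the shape matters here.
CREATE TABLE t (
  id            INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  indexedColumn INT NULL,
  KEY indexedColumnKey (indexedColumn)
);

-- After loading ~100k rows, compare the plan with and without LIMIT.
-- Actual plans depend on the storage engine and MySQL version.
EXPLAIN SELECT id FROM t GROUP BY indexedColumn;
EXPLAIN SELECT id FROM t GROUP BY indexedColumn LIMIT 500;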
I'm very inexperienced with SQL, so maybe this is totally normal, but it just seems counter-intuitive to me that limiting the number of rows returned slows the query down so much.
Questions:
- Is this normal?
- Is there any way to stop LIMIT from slowing down the query so much?
The query
SELECT id FROM table GROUP BY indexedColumn
does not use the index, as your EXPLAIN output shows (type is ALL, key is null, and the Extra column reports Using temporary; Using filesort). But when you add LIMIT, the index is used (type is index, key is indexedColumnKey), so the two timings are not comparing the same execution plan. Also, for experiments you should disable the query cache using SQL_NO_CACHE: SELECT SQL_NO_CACHE ....