I have two tables
- Export
Name Type Collation Attributes Null Default Extra
id int(10) utf8_unicode_ci UNSIGNED No None AUTO_INCREMENT
email varchar(150) utf8_unicode_ci No None
city_id int(11) utf8_unicode_ci Yes NULL
Indexes
Type Unique Packed Column Cardinality Collation Null
id BTREE Yes No id 769169 A No
email_index BTREE Yes No email 769169 A No
city_id_index BTREE No No city_id. 6356 A Yes
- Export history
Name Type Collation Attributes Null Default Extra
id int(10) utf8_unicode_ci UNSIGNED No None AUTO_INCREMENT
email varchar(255) utf8_unicode_ci No None
Indexes
Type Unique Packed Column Cardinality Collation Null
id BTREE Yes No id 113887 A No
email_index BTREE No No email 113887 A No
I need to get a top city ids which have the most emails (users). Also there is export_history table. I need to exclude emails from the results.
The end query looks like
Main query
SELECT COUNT(city_id) as city_count, city_id
FROM export e
WHERE NOT EXISTS (
SELECT * FROM export_history ehistory
WHERE e.email = ehistory.email
)
GROUP BY city_id
ORDER BY city_count DESC
LIMIT 5
Exec time is ~7 sec. The problem is that it takes about that much to execute.
Explain shows:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY e index NULL city_id_index 5 NULL 769169 Using where; Using temporary; Using filesort
2 DEPENDENT SUBQUERY ehistory ref email_index email_index 767 e.email 1 Using where; Using index
Please take into account that these two queries work very fast > 0.01 sec
Query 1
SELECT COUNT(city_id) as city_count, city_id
FROM export
GROUP BY city_id
ORDER BY city_count DESC
LIMIT 5
Exec time is ~0.1 sec
Query 2
SELECT *
FROM export e
WHERE NOT EXISTS (
SELECT * FROM export_history ehistory
WHERE e.email = ehistory.email
)
Exec time is ~0.02 sec
Could you please recommend any suggestions to improve performance of the main query?
You can simplify your query by using
LEFT JOIN ... IS NULL
rather thanNOT EXISTS
with your dependent subquery. It may (or may not: try it) speed things up for you by avoiding repetition of the dependent subquery.Try this compound index.
and if that doesn't help try an index with the columns in the opposite order:
Pro tip: Single-column indexes don't do the same thing as multi-column indexes created to match the filtering criteria in your queries.