Sub-query 1:
SELECT * from big_table
where category = 'fruits' and name = 'apple'
order by yyyymmdd desc
Explain:
table | key | extra
big_table | name_yyyymmdd | using where
Looks great!
Sub-query 2:
SELECT * from big_table
where category = 'fruits' and (taste = 'sweet' or wildcard = '*')
order by yyyymmdd desc
Explain:
table | key | extra
big_table | category_yyyymmdd | using where
Looks great!
Now if I combine those with UNION:
SELECT * from big_table
where category = 'fruits' and name = 'apple'
UNION
SELECT * from big_table
where category = 'fruits' and (taste = 'sweet' or wildcard = '*')
Order by yyyymmdd desc
Explain:
table | key | extra
big_table | name | using index condition, using where
big_table | category | using index condition
UNION RESULT| NULL | using temporary; using filesort
Not so good, it uses filesort.
This is a trimmed down version of a more complexed query, here are some facts about the big_table:
- big_table has 10M + rows
- There are 5 unique "category"s
- There are 5 unique "taste"s
- There are about 10,000 unique "name"s
- There are about 10,000 unique "yyyymmdd"s
- I have created single index on each of those fields, plus composite idx such as
yyyymmdd_category_taste_namebut Mysql is not using it.
And have
INDEX(catgory)or some index starting withcategory. However, if more than about 20% of the table iscategory = 'fruits'will probably decide to ignore the index and simply do a table scan. (Since you say there are only 5 categories, I suspect the optimizer will rightly eschew the index.)Or this might be beneficial:
INDEX(category, yyyymmdd), in this order.The
UNIONhad to do a sort (either in memory on disk, it is not clear) because it was unable to fetch the rows in the desired order.A composite index
INDEX(yyyymmdd, ...)might be used to avoid the 'filesort', but it won't use any columns afteryyyymmdd.When constructing a composite index, start with any
WHEREcolumns compared '='. After that you can add one range orgroup byororder by. More details.UNIONis often a good choice for avoiding a slowOR, but in this case it would need three indexesand adding yyyymmdd would not help unless you add a
LIMIT.And the query would be:
Adding a limit would be even messier. First tack
yyyymmddon the end of each of the three composite indexes, thenAdding an OFFSET would be even worse.
Two other techniques -- "covering" index and "lazy lookup" might help, but I doubt it.
Yet another technique is to put all the words in the same column and use a
FULLTEXTindex. But this may be problematical for several reasons.