Previously, I asked this question about compound indexes on polymorphic foreign keys in ActiveRecord. The basis of my question was my understanding that indexes should be based on the cardinality of your column, and there's generally pretty low cardinality on Rails's STI type and polymorphic _type columns.
Accepting that the answer to my question is right -- that there's value in indexing both the high cardinality _id columns and the low cardinality _type columns, because together they have a high cardinality -- my next question is: how should you order your compound indexes?
An index of [owner_id, owner_type] places the field with higher cardinality first, while [owner_type, owner_id] places the field with higher cardinality second. Is a query using the former key more performant than a query using the latter key, or are they equally performant?
I ask because this has particular bearing on how I would order the compound keys for tables serving STI models. STI Rails finders almost always query on the type column -- which again is a column of generally low cardinality. The type column is therefore queried much more often than other indexed columns. If that's the case, then maybe it makes sense to use the type-leading index, because less specific queries could take advantage of the first part of the index, yielding a performance boost. However, I wouldn't want that smaller perk to come at the expense of performance for highly specific queries that take advantage of the higher-cardinality portion of the index.
From my own research (but I'm no expert DBA) I've learned that there are two things to consider when deciding the order of a compound key index.
First, concerning the cardinality of columns, indexes are generally better at searching columns with high cardinality. So I would be inclined to place the column with the highest cardinality first in the index. For reference, there's an article titled MySQL Query Optimization that says:
In your case, the _id columns would clearly fit that definition better, so they're a better candidate for being the prefix of the key.
Another thing to consider is the reusability of these indexes. Most (if not all) database systems allow a prefix of a compound key to be reused. For example, a compound key on (owner_id, owner_type) could also be used by queries on owner_id, but not by queries on owner_type alone.
So from what you explained in your question, you might be better off with two indexes: a compound key index on (owner_id, owner_type) and another on (owner_type).
Finally, it really all comes down to your dataset and queries. Try out multiple scenarios and benchmark different compound key orderings to see which is optimal. Also, don't forget that indexes incur a write penalty on your tables.
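To see the prefix rule in action, here is a minimal sketch using Python's built-in sqlite3 module. It is only an illustration: SQLite stands in for whatever database you actually run, the comments table and the index names idx_owner and idx_type are made up for the example, and query planners differ between systems and versions, so check the plan output (EXPLAIN in MySQL or PostgreSQL) on your own database and data.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return the detail text of EXPLAIN QUERY PLAN for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the fourth column of each row.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")

# Hypothetical polymorphic table: owner_id / owner_type point at the owner.
conn.execute(
    "CREATE TABLE comments ("
    " id INTEGER PRIMARY KEY, owner_id INTEGER, owner_type TEXT, body TEXT)"
)

# Compound index with the high-cardinality column first.
conn.execute("CREATE INDEX idx_owner ON comments (owner_id, owner_type)")

by_both = "SELECT * FROM comments WHERE owner_id = ? AND owner_type = ?"
by_id = "SELECT * FROM comments WHERE owner_id = ?"
by_type = "SELECT * FROM comments WHERE owner_type = ?"

plan_both = plan(conn, by_both, (1, "Post"))  # both columns of the index
plan_id = plan(conn, by_id, (1,))             # leading column only (prefix)
plan_type = plan(conn, by_type, ("Post",))    # trailing column only

# Add the second, single-column index suggested above and re-check.
conn.execute("CREATE INDEX idx_type ON comments (owner_type)")
plan_type_after = plan(conn, by_type, ("Post",))

for label, text in [
    ("id + type", plan_both),
    ("id only", plan_id),
    ("type only", plan_type),
    ("type only, after idx_type", plan_type_after),
]:
    print(f"{label}: {text}")
```

In this sketch the queries on both columns and on owner_id alone search idx_owner, while the query on owner_type alone falls back to scanning the table until idx_type exists. It only shows which index the planner picks, not how fast each ordering is; for that you still need to benchmark against realistic data.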
Update: There's also another rather popular SO question about compound key indexes:
When should I use a composite index?