I have the following MySql table (only 845 rows):
CREATE TABLE `categories_nested_set` (
`lft` int(11) NOT NULL DEFAULT '0',
`rgt` int(11) DEFAULT NULL,
`id` int(11) DEFAULT NULL,
`category` varchar(128) DEFAULT NULL,
PRIMARY KEY (`lft`),
UNIQUE KEY `id` (`id`),
UNIQUE KEY `rgt` (`rgt`),
KEY `idx_lftrgtid` (`id`,`lft`,`rgt`),
KEY `idx_lft` (`lft`),
KEY `i1` (`lft`) USING BTREE,
KEY `i2` (`rgt`) USING BTREE,
KEY `i3` (`id`) USING BTREE,
CONSTRAINT `fk_categories_nested_set_id_category` FOREIGN KEY (`id`) REFERENCES `categories` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
(you can see I have a whole lot of indexes in there, just in case).
I perform the following self-join query:
SELECT *
FROM categories_nested_set AS H
LEFT JOIN categories_nested_set AS I ON (H.lft > I.lft)
Which generates the following EXPLAIN:
id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra
1,SIMPLE,H,ALL,NULL,NULL,NULL,NULL,845,NULL
1,SIMPLE,I,ALL,"PRIMARY,idx_lft,i1",NULL,NULL,NULL,845,"Range checked for each record (index map: 0x31)"
The EXPLAIN would suggest that MySql is choosing not to use an index, and I cannot understand why. The table definition shows that all the relevant columns are indexed.
In the scope of a far-larger query (5 million rows, 14x tables) this piece is proving to be a massive bottleneck. Any advice would be appreciated.
Thanks,
I think you should use this query :
The idea is not to use JOIN because it forces MySql to construct the result by matching one row at a time.
In the solution I propose, the cross-product table is constructed without a join, so it enables InnoDB to fetch rows from the first table (H) independently of the other table, hence allowing it to batch rows. The index on the second table can then be used by MySQL because no link is made from the H table.
On my computer, the proposed solution is about 5 times faster, with a table of 1000 records.
Here is the result of the explain :
Note that you may also improve index use of your request by restricting the columns you are retrieving (although that is not always the case).