Since MySQL version 8.0.17 there is Multi-Valued Indexes for JSON array column, is there also a simple possibility to create Multi-Valued Indexes for comma delimited string column (CHAR, VARCHAR, ...) to increase the performance of FIND_IN_SET function?
I know that there is an option to create a secondary table customers_zip and use the INNER JOIN or IN function, but this is a complicated option. I use 10k databases whit ~150 various tables -> ~1.5M different comma delimited columns.
Example table:
CREATE TABLE `customers` (
`id` BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
`zip_string` VARCHAR(100) NOT NULL,
`zip_json` JSON,
INDEX `zip_json` ((CAST(`zip_json` AS UNSIGNED ARRAY)))
);
Insert 1 000 000 rows:
INSERT INTO `customers` (`id`, `zip_string`, `zip_json`) VALUES
(1, '10001,10002,10003', '[10001,10002,10003]'),
(2, '10004,10005', '[10004,10005]'),
...;
Select for FIND_IN_SET load 120ms
SELECT * FROM `customers` WHERE FIND_IN_SET('10005', `zip_string`);
Select for JSON_CONTAINS or MEMBER OF load 1ms
SELECT * FROM `customers` WHERE JSON_CONTAINS(`zip_json`, '10005');
SELECT * FROM `customers` WHERE 10005 MEMBER OF(`zip_json`);
No, you can't make any kind of index to optimize searches for arbitrary elements in a comma-separated list.
The closest options are:
Convert the comma-separated list to a JSON array, and make a multi-valued index, however in current versions of MySQL this only works for integers. There is no support for making a multi-valued index of strings.
Use a functional index, but this only works for a fixed search criteria. That is, you could add index on the expression
(FIND_IN_SET('10005', zip_string)), but the value 10005 is hard-coded in the index definition. It's not useful to find any other value.Use a fulltext index, but this has limitations too.
It really is easier to optimize if you refactor to a normalized table design, storing individual values in their own row. Storing comma-separated lists working against the strengths of a relational database.