How can I index SQL Server string values for optimized wildcard searching from the left?


I have a string "key" which I need to store in a SQL Server database column. This key is a dot-delimited list of tokens, of any size. Examples:

keanu
2016.sucked
foo.bar.baz
the.cake.is.a.lie

I will need to search for rows by any sequence of tokens from the left. So, I might want to find all keys that start with:

foo
foo.bar
foo.bar.baz

(Obviously, that last one is the entire key.)

The only way I know how to do this in SQL is with a LIKE operator:

SELECT * FROM [table] WHERE [key] LIKE 'foo.%'

How bad is this? LIKE has a reputation for performance issues, but since I'm always searching from the left end of the string and leaving the right end open-ended -- does that help?

I had a fleeting idea to do something like this:

| key         | base1 | base2    | base3       |
|-------------|-------|----------|-------------|
| foo.bar.baz | foo.  | foo.bar. | foo.bar.baz |

Obviously, I'd have to have N columns for baseX, which is awful, but there's no doubt it would be fast.

Assuming I use a simple VARCHAR field, is there a way to index this field to make it acceptable when I get a million rows of data in here?

Best answer

No, it's not bad. Your pattern has no wildcard at the beginning of the search string, which is what would make the query non-sargable: with a leading wildcard, SQL Server has to scan through all the pages to find the result, because the first character of the key value is not known.
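To make the distinction concrete, here is a small sketch of the two pattern shapes. It reuses the `[table]` and `[key]` placeholders from the question and assumes an index exists on `[key]`; the `'%.baz'` pattern is only an illustration, not something the question asks for.

```sql
-- Sargable: the fixed prefix 'foo.' lets the optimizer seek into an index on [key]
-- and read only the contiguous range of keys that begin with that prefix.
SELECT * FROM [table] WHERE [key] LIKE 'foo.%';

-- Not sargable: with a leading wildcard the first characters are unknown,
-- so every key value has to be examined (a scan rather than a seek).
SELECT * FROM [table] WHERE [key] LIKE '%.baz';
```

Comparing the actual execution plans of the two statements should show the difference, typically an index seek for the first and a scan for the second.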

In your case the starting characters are known, so the matching key values can be located by navigating from the root of the B-tree through its intermediate nodes. Creating a nonclustered index on the key column should help. The query below is still sargable:

SELECT * FROM [table] WHERE [key] LIKE 'foo.%'
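A minimal sketch of the index itself, under a few assumptions: the index name `IX_table_key` is made up for illustration, and `[key]` is assumed to be declared with a bounded length (a `VARCHAR(MAX)` column cannot be an index key column, so "any size" needs a practical upper limit).

```sql
-- Hypothetical index name; [table] and [key] are the placeholders from the question.
-- Assumption: [key] is a bounded VARCHAR(n), not VARCHAR(MAX).
CREATE NONCLUSTERED INDEX IX_table_key
    ON [table] ([key]);

-- Prefix search: matches 'foo.bar', 'foo.bar.baz', and so on.
SELECT [key]
FROM [table]
WHERE [key] LIKE 'foo.%';

-- 'foo.%' does not match the bare key 'foo'; add an equality test if that row is wanted too.
SELECT [key]
FROM [table]
WHERE [key] = 'foo'
   OR [key] LIKE 'foo.%';
```

Selecting only `[key]` lets the index answer the query on its own. With `SELECT *`, each matching row may also need a lookup into the base table, which is usually still cheap when the prefix is selective.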