SQL: Inner join which its WHERE clause contains CHARINDEX() - Improve performence

1k Views Asked by At

I'm have two tables: A and B, each of them contains strings. I need to find all the matches between table A and table B when the string of table A contains the string of table B (B.stringColumn is substring of A.stringColumn).

I did it by the following code using CHARINDEX() function:

SELECT *
FROM A, B
WHERE CHARINDEX(B.stringColumn, A.stringColumn) > 0 

Is there more efficient way to do so?

This tables may contain massive amount of data, that why i'm asking this question.

Thanks in advance, Nuriel

2

There are 2 best solutions below

2
On

I would be more inclined to phrase this as:

SELECT *
FROM A JOIN
     B
     ON A.stringColumn LIKE '%' + B.stringColumn + '%';

Alas, that does not help performance at all. The problem is that looking for patterns in the middle of a string cannot make use of optimizations or any other techniques. You are stuck with nested loop join algorithms.

In some cases, you might be able to use a full text index. This can be a bit challenging when joining between two tables.

There is a technical solution to this problem. It involves indexes built on something called n-grams (say, 3-character combinations). However, SQL Server does not support this indexing type.

0
On

Hm, you can use LIKE operator. This will not maybe give you some significant performance upgrade, but it has a chance of using index (if there is an index on that column), while with functions in where clause, indexes will never be used.