I'm have two tables: A and B, each of them contains strings. I need to find all the matches between table A and table B when the string of table A contains the string of table B (B.stringColumn is substring of A.stringColumn).
I did it by the following code using CHARINDEX()
function:
SELECT *
FROM A, B
WHERE CHARINDEX(B.stringColumn, A.stringColumn) > 0
Is there more efficient way to do so?
This tables may contain massive amount of data, that why i'm asking this question.
Thanks in advance, Nuriel
I would be more inclined to phrase this as:
Alas, that does not help performance at all. The problem is that looking for patterns in the middle of a string cannot make use of optimizations or any other techniques. You are stuck with nested loop join algorithms.
In some cases, you might be able to use a full text index. This can be a bit challenging when joining between two tables.
There is a technical solution to this problem. It involves indexes built on something called n-grams (say, 3-character combinations). However, SQL Server does not support this indexing type.