How to handle noise words (NN) in Full text search

1.3k Views Asked by At

We are using Full-text search to check postcodes and addresses, we have found that NNx, where x is a number, is considered a noise word. We know the issue but can't think of a clean solution, besides checking if the search word contains NN and then using LIKE instead.

We are using this code.

SELECT  DISTINCT
    *
    FROM    CONTAINSTABLE(dbo.AddressData, Data, N'NN1', 1000) C
        INNER JOIN dbo.AddressData AD ON AD.Id = C.[KEY]
        INNER JOIN dbo.Address A ON A.Id = AD.AddressId

Does anyone have any ideas on how to escape or stop full text from seeing the NN as a noise word?

UPDATE: We added an if that checks for NN in the input string, and if NN is there we just use Like instead but this has a big effect on performance. went from less than a seconds to over 5 :/ any one have a better solution?

2

There are 2 best solutions below

0
On

This is a bug (?) in MS SQL server. "NN123" is how numbers are stored in the index. See this post at dba.stackexchange for more details:

https://dba.stackexchange.com/questions/316848/full-text-search-wont-find-post-codes-nnx-where-x-is-a-digit/316849#316849

0
On

You are using SQL server right?

From SQL Server 2005 noise words have been replaced by stopwords. You can edit the "stoplists" so that NNx is not a stopword anymore.

https://learn.microsoft.com/en-us/sql/relational-databases/search/configure-and-manage-stopwords-and-stoplists-for-full-text-search?view=sql-server-2017