Use wordforms or regexp in Sphinx to force a multiword term to be a "word"


Is there a way to force Sphinx to index a term such as iphone 5 as a single term? For various reasons I can't search for it as "iphone 5" or iphone NEAR/1 5; I need to search for it as plain iphone 5. Because of the way Sphinx works, this means it searches for both iphone and 5 anywhere in the document, when I want it to search for the exact term iphone 5. Can I somehow index iphone 5 as a single term to make this happen?

I still need to be able to apply wordforms/regexp and other mappings to the term, e.g.:

iphone 5>iphone5

This way, if someone searches for iphone5 it will find iphone 5, and vice versa. The issue is that a search for iphone 5 will find iphone5, but it will also find Selling 5 iphone 6Gs, whereas a search for "iphone 5" will not find iphone5. So my goal is to make iphone 5 a term that is treated as a phrase without needing the quotes, because being forced to search for an exact phrase breaks any additional wordform/regexp matching.
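To make the desired behaviour concrete, here is a rough Python sketch of it. This is not Sphinx code: `MULTIWORD_FORMS`, `normalize` and `matches` are hypothetical names, and the mapping table just stands in for a wordforms entry like the one above. The idea is that the same multiword mapping is applied to documents and to queries before tokenizing, so iphone 5 collapses into one token on both sides.

```python
import re

# Hypothetical mapping table, standing in for a wordforms entry
# such as "iphone 5 > iphone5".
MULTIWORD_FORMS = {"iphone 5": "iphone5"}


def normalize(text):
    """Lowercase, collapse mapped multiword terms, then split into tokens."""
    text = text.lower()
    for source, target in MULTIWORD_FORMS.items():
        # \b keeps "iphone 5" from matching inside "iphone 50".
        text = re.sub(rf"\b{re.escape(source)}\b", target, text)
    return re.findall(r"\w+", text)


def matches(query, document):
    """Plain AND matching: every query token must occur in the document."""
    doc_tokens = set(normalize(document))
    return all(token in doc_tokens for token in normalize(query))
```

With this, `matches("iphone 5", "New iphone5 for sale")` and `matches("iphone5", "New iPhone 5 for sale")` both succeed, while `matches("iphone 5", "Selling 5 iphone 6Gs")` does not, because the document never contains the two words next to each other.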


Do you control the configuration of the index? If so, you can configure the index to be created with the index_exact_words option.

From the documentation (http://sphinxsearch.com/docs/current.html#conf-index-exact-words):

12.2.42. index_exact_words

Whether to index the original keywords along with the stemmed/remapped versions. Optional, default is 0 (do not index). Introduced in version 0.9.9-rc1.

When enabled, index_exact_words forces indexer to put the raw keywords in the index along with the stemmed versions. That, in turn, enables exact form operator in the query language to work. This impacts the index size and the indexing time. However, searching performance is not impacted at all.

Example:

index_exact_words = 1
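As a rough illustration of what the quoted passage describes, here is a toy Python model of an indexer that stores the raw keyword next to the stemmed one. Everything in it is an assumption made for the sketch: `toy_stem` is a stand-in for a real stemmer, and storing the raw form under a `=` prefix is only a convenient way to model it, not Sphinx's actual index format. The leading `=` on a query term mirrors the exact form operator from the query language.

```python
def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips one trailing 's'."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def index_tokens(text, index_exact_words=False):
    """Return the set of keywords a toy indexer would store for `text`."""
    stored = set()
    for raw in text.lower().split():
        stored.add(toy_stem(raw))  # stemmed form, always indexed
        if index_exact_words:
            stored.add("=" + raw)  # raw form, kept under a marker prefix
    return stored


def term_matches(term, stored):
    """A leading '=' asks for the exact form; otherwise the term is stemmed."""
    term = term.lower()
    if term.startswith("="):
        return term in stored
    return toy_stem(term) in stored
```

For a document indexed as `index_tokens("raining cat", index_exact_words=True)`, the term `cats` matches through stemming, `=cat` matches the raw form, and `=cats` does not match. Without the option the raw forms are never stored, so no `=` term can match, which is why the exact form operator depends on this setting.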