I have a problem with Solr spellcheck suggestions for multi-word phrases. With this query for 'red chillies':
q=red+chillies&wt=xml&indent=true&spellcheck=true&spellcheck.extendedResults=true&spellcheck.collate=true
I get
<lst name="suggestions">
<lst name="chillies">
<int name="numFound">2</int>
<int name="startOffset">4</int>
<int name="endOffset">12</int>
<int name="origFreq">0</int>
<arr name="suggestion">
<lst><str name="word">chiller</str><int name="freq">4</int></lst>
<lst><str name="word">challis</str><int name="freq">2</int></lst>
</arr>
</lst>
<bool name="correctlySpelled">false</bool>
<str name="collation">red chiller</str>
</lst>
The problem is that even though 'chiller' has 4 results in the index, 'red chiller' has none. So we end up suggesting a phrase with 0 results.
What can I do to make spellcheck work on the whole phrase only? I tried using KeywordTokenizerFactory in the query analyzer:
<fieldType name="text_spell" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory" />
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory" />
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
<filter class="solr.LowerCaseFilterFactory" />
</analyzer>
</fieldType>
And I also tried adding
<str name="sp.query.extendedResults">false</str>
within
<lst name="spellchecker">
in solrconfig.xml.
But neither seems to make a difference.
What would be the best way to make spellcheck only return collations that have results for the whole phrase? Thanks!
The real issue here is that you need to specify
spellcheck.collateParam.q.op=AND
and also (optionally)
spellcheck.collateParam.mm=100%
These parameters force the queries that Solr uses to check collations to require all terms to match. You can read more about this in the Solr docs.
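As a rough sketch, the parameters above could be set as request handler defaults in solrconfig.xml instead of being passed on every request. The handler name /select and the value 10 for spellcheck.maxCollationTries are assumptions for illustration, not taken from the question. Note that Solr only tests collations against the index when spellcheck.maxCollationTries is greater than 0 (the default is 0), and the collateParam overrides apply only to those test queries. The mm override only matters if the query parser is dismax or edismax.

```xml
<!-- Sketch only: handler name and numeric values are assumptions -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.collate">true</str>
    <!-- Test up to 10 candidate collations against the index; 0 disables testing -->
    <str name="spellcheck.maxCollationTries">10</str>
    <!-- Return at most one collation that passed the test -->
    <str name="spellcheck.maxCollations">1</str>
    <!-- Overrides applied only to the collation test queries -->
    <str name="spellcheck.collateParam.q.op">AND</str>
    <str name="spellcheck.collateParam.mm">100%</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

The same settings can also be appended to the request URL, for example &spellcheck.maxCollationTries=10&spellcheck.collateParam.q.op=AND, which is handy for trying them out before changing the config.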