I'm working with name searching and, for some reason, when I query "sam", documents containing the query as a substring, such as "samara", "samir", or "samuel", are returned with seemingly equal weight.
Is this just built-in Solr behavior, matching words that contain the search term as a substring? Is there a way to give greater weight to an exact match on the query before moving on to alternatives?
I already have two separate fieldTypes to weight the original text more heavily than its synonyms, but I couldn't figure out a way around this substring problem, as it appears to be inherent to Solr.
Here is my fieldType definition:
<fieldType class="solr.TextField" name="fullTextName" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
Any help would be really appreciated.
This is possible in Solr:
1) Define 2 field types: fullTextName and fullTextNameExact. The difference between them is the index-time analysis; specifically, the exact field type should not have the edge n-gram token filter.
2) Create 2 fields, one per type.
3) Define a request handler that uses the dismax or edismax query parser [1].
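4) Use the "qf" request parameter. It lets you list the fields involved in the search and weight each with a different boost. In your case you could use something like qf=nameExact^10 name, where nameExact and name are hypothetical names for the exact and n-grammed fields and the boost of 10 is only an illustrative value to tune.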
This will boost exact-match results more strongly while still allowing autocompletion.
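The four steps above might be sketched as follows. This is only an illustration under assumptions, not a tested configuration: the field names name and nameExact, the handler path /nameSearch, the n-gram sizes, and the boost of 10 are all made up for the example, and the edge n-gram filter is assumed to sit in the index-time analyzer of fullTextName.

```xml
<!-- schema.xml (sketch): two field types that differ only in index-time analysis -->
<fieldType name="fullTextName" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Assumed: prefix grams are what make "sam" match "samuel"; sizes are illustrative -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Same analysis at index and query time, with no edge n-gram filter -->
<fieldType name="fullTextNameExact" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field names; copyField feeds the same source text to both fields -->
<field name="name" type="fullTextName" indexed="true" stored="true"/>
<field name="nameExact" type="fullTextNameExact" indexed="true" stored="false"/>
<copyField source="name" dest="nameExact"/>

<!-- solrconfig.xml (sketch): edismax handler that boosts the exact field -->
<requestHandler name="/nameSearch" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <!-- Illustrative boost: a whole-token match on nameExact outweighs a prefix-only match -->
    <str name="qf">nameExact^10 name</str>
  </lst>
</requestHandler>
```

With a setup along these lines, a query for "sam" should match a document named "sam" in both fields and a document named "samuel" only in the n-grammed field, so the former scores higher. The right boost value depends on your data and is worth tuning against real queries.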
[1] https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser