I am trying to figure out the correct analyzer configuration for my Solr/Lucidworks setup.
The results that I am seeing in Solr analysis seem to indicate that I should be getting matches, but when I do the Solr query (native or in the Lucidworks UI), no results are returned.
The relevant fragments from the schema are:
<field name="content" indexed="true" multiValued="false" required="false" stored="true" type="dlowe_text_en"/>
<dynamicField indexed="true" name="*_txt_en_dlowe_split_tight" stored="true" type="dlowe_text_en"/>
<fieldType autoGeneratePhraseQueries="true" class="solr.TextField" name="dlowe_text_en" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
I have indexed some content that contains the string:
Administrator's Guide
Now, when I use the Solr analysis tool, these are the results that I get:
My understanding is that if any of the results are highlighted, this represents a match, but when I search in Solr for "Administrator", no results are found:
If I search on:
Administrator's
I do get the expected result.
Am I totally misunderstanding how the analysis tool should work?
What I am trying to achieve is a search index that supports a lot of technical items, which should only match on exact values. For example:
- V-123-1231-1231
- WILL_NOT_CHANGE
- /mnt/abc/Drivers/
- 4040:5050
So the WhitespaceTokenizer seems to make the most sense, but I also need stemming on the non-technical strings. The technical strings would be indicated by periods (.), dashes (-), underscores (_), slashes (\ or /), etc.
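For illustration only, the requirement above (exact matching for technical tokens, stemming for everything else) could be sketched roughly as follows. The field type name `text_en_tech_sketch` and the regular expression are assumptions made for this example, not part of the schema above. The idea is to mark any token containing one of the listed punctuation characters or a digit as a keyword, so that the stemmer that follows leaves it untouched:

```xml
<!-- Hypothetical field type; the name and the pattern are illustrative assumptions. -->
<fieldType name="text_en_tech_sketch" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split on whitespace only, so tokens such as /mnt/abc/Drivers/ stay in one piece. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Mark tokens containing . - _ / \ : or a digit as keywords; the pattern must match the whole token. -->
    <filter class="solr.KeywordMarkerFilterFactory" pattern=".*[._\-/\\:0-9].*"/>
    <!-- The stemmer skips tokens that were marked as keywords. -->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
```

One caveat with this sketch: because the whitespace tokenizer keeps trailing punctuation, an ordinary word at the end of a sentence (for example "Guide.") would also be treated as a keyword and not stemmed, so the pattern would likely need tuning against real content.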
Any insight / suggestions would be greatly appreciated.
After further investigation, and after trying the latest version of Solr (8.7) versus the very old corporate version that we are using (6.4.2), plus the reinforcement from Abhijit above, I found out that the "full record" search of Solr doesn't work the way that I expected.
Instead, I needed to:
Once I did that, I started getting the results that I expected.
Probably obvious for those who use Solr/Lucene on a regular basis, but it wasn't clear to me. Switching to 8.7, which doesn't have a 'default field', led me down the path to this solution.
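As a rough illustration of the 'default field' point, and not necessarily the exact steps taken here, a search handler in solrconfig.xml can be told which field to search when the query does not name one. The handler name `/select` and the choice of `content` as the default are assumptions for this sketch:

```xml
<!-- Sketch only: makes unqualified queries such as q=Administrator search the content field. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- df = default field used when the query term has no field prefix. -->
    <str name="df">content</str>
  </lst>
</requestHandler>
```

Without such a default, the field can be named explicitly in the query instead, for example `q=content:Administrator`.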
Hopefully this will be of help to others in the future.