Solr search not returning documents


I am trying to use PorterStemFilterFactory in my index-time analyzer. But when I query for documents, the results no longer include documents that I got before adding this filter. How can I get documents with both stemming and the normal filters?


<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
  <analyzer type="index">
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[^a-zA-Z0-9]" replacement=" "/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

When I search for the query "agile" with the analyzer below (without the stemming filter), it returns the documents that contain the term.

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
  <analyzer type="index">
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[^a-zA-Z0-9]" replacement=" "/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Thanks in Advance


There are 2 best solutions below


PorterStemFilterFactory removes common suffixes from words.

In your case, the word agile is reduced to agil.

You can check here (search here for the word agile).

Now search here for the corresponding output after applying Porter Stemming.

You will see that you can't find the word agile, because it has been stemmed to agil.

That is why your search for agile finds nothing: the index holds the token agil, while your query analyzer has no stemming filter, so the query term agile stays unstemmed and does not match. Try searching for agil and you should see the results.
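
One way to make the query side agree with the index, sketched here as an assumption rather than a confirmed fix for your setup, is to append the same stemming filter to the query analyzer, so that the query term agile is also reduced to agil. The rest of the chain is copied from the question; only the last filter is new.

```xml
<!-- Sketch: the query analyzer from the question with the Porter stemmer appended -->
<analyzer type="query">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- Assumption: same stemmer as the index analyzer, so "agile" becomes "agil" at query time too -->
  <filter class="solr.PorterStemFilterFactory"/>
</analyzer>
```

Documents that were indexed before the index analyzer changed still hold unstemmed tokens, so they would need to be reindexed before a stemmed query can match them.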


Using solr.PorterStemFilterFactory will generate the token agil.

I suggest you use

<filter class="solr.EnglishMinimalStemFilterFactory"/>

After this filter, agile stays agile.

Choose filters according to your requirements.
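
A sketch of how this suggestion might look in the field type from the question. It assumes the minimal stemmer replaces the Porter stemmer at index time and is also added at query time, so that both sides produce the same tokens. Everything except the two EnglishMinimalStemFilterFactory lines is copied from the question.

```xml
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
  <analyzer type="index">
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[^a-zA-Z0-9]" replacement=" "/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Assumption: used in place of solr.PorterStemFilterFactory; mainly strips plural endings -->
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Assumption: mirrored at query time so query terms are stemmed the same way as indexed terms -->
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
  </analyzer>
</fieldType>
```

The Analysis screen in the Solr admin UI shows the tokens each chain produces for a given input, which is a quick way to confirm that the index and query sides agree.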