Node.js + Elasticsearch: analyzers are not working when search query applied

I am trying to build a text search using Elasticsearch. It's the first time I am using it, so I might be misunderstanding many of the concepts.

The search works fine when I type a full word that exists in any of the indexed fields. What I am trying to do is, for example, when I type sam, get the Samsung products. For that I am using an edge n-gram tokenizer, which breaks the term into prefixes: sa, sam, sams, etc. Note: I am using mongoosastic to work with the Elasticsearch server. Here is the products model, which I call Item:

var ItemSchema = new mongoose.Schema({
    title: {type: String, es_indexed:true, es_analyzer: 'edge_nGram_analyzer'},
    price: Number,
    description: {type: String, es_indexed:true},
    picture: String,
    vendor: {type: String, es_indexed:true},
    vendorId: {type:String, es_indexed:true}
});
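To make the prefix idea concrete, here is a rough sketch (illustrative only; the real n-gramming happens server-side in Elasticsearch) of what an edge n-gram filter with min_gram 2 and max_gram 5 produces:

```javascript
// Illustrative helper mirroring an edge n-gram filter with
// min_gram = 2 and max_gram = 5; Elasticsearch does this server-side.
function edgeNGrams(token, minGram, maxGram) {
    var grams = [];
    for (var len = minGram; len <= Math.min(maxGram, token.length); len++) {
        grams.push(token.slice(0, len));
    }
    return grams;
}

console.log(edgeNGrams('samsung', 2, 5));
// [ 'sa', 'sam', 'sams', 'samsu' ]
```

A search for sam can then match a document whose title produced the sam prefix at index time.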

And here is the rest of the model code where I am trying to use the analyzer and the tokenizer:

    ItemSchema.plugin(mongoosastic, {
        hosts: ['localhost:9200']
    });

    var Item = mongoose.model('Item', ItemSchema);

    Item.createMapping({
        "analysis": {
            "filter": {
                "edgeNGram_filter": {
                    "type": "edgeNGram",
                    "min_gram": 2,
                    "max_gram": 20,
                    "side": "front"
                }
            },
            "analyzer": {
                "edge_nGram_analyzer": {
                    "type": "custom",
                    "tokenizer": "edge_ngram_tokenizer",
                    "filter": [
                        "lowercase",
                        "asciifolding",
                        "edgeNGram_filter"
                    ]
                },
                "whitespace_analyzer": {
                    "type": "custom",
                    "tokenizer": "whitespace",
                    "filter": [
                        "lowercase",
                        "asciifolding"
                    ]
                }
            },
            "tokenizer": {
                "edge_ngram_tokenizer": {
                    "type": "edgeNGram",
                    "min_gram": "2",
                    "max_gram": "5",
                    "token_chars": ["letter", "digit"]
                }
            }
        }
    }, function(err, mapping) {
        if (err) {
            console.log(err);
        }
        console.log(mapping);
    });

    module.exports = Item;
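A version note worth flagging here (an assumption on my part, since the question does not state the Elasticsearch version): the camel-case edgeNGram names for the token filter and tokenizer were deprecated in Elasticsearch 6.4 in favour of edge_ngram and removed in 8.x, and the filter's side parameter has also long been deprecated. On a recent cluster the filter would look more like this sketch:

```javascript
// Version-dependent sketch: the same edge n-gram filter with the
// non-deprecated name (edge_ngram) and without the deprecated
// "side" parameter ("front" is the default behaviour anyway).
var edgeNGramFilter = {
    type: 'edge_ngram', // was: 'edgeNGram'
    min_gram: 2,
    max_gram: 20
};

console.log(edgeNGramFilter.type);
```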

I tested this with an Item (product) that has title: cupcake. If I type cup in the search box I get nothing, but if I type the full keyword I get the object.

Also, I don't want to analyze the vendor ID and the description. I tried vendorId: {type: String, index: 'not_analyzed'}, but then the field stops being indexed for search.
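If the intent is for vendorId to stay searchable by exact value, the mapping-level option has to go through mongoosastic's es_-prefixed field properties rather than the plain Mongoose index option. A sketch under that assumption (option names follow mongoosastic conventions; verify against your version):

```javascript
// Assumption: mongoosastic forwards es_index into the field mapping.
// 'not_analyzed' keeps the field indexed, but only as one exact token.
var vendorIdField = {
    type: String,
    es_indexed: true,
    es_index: 'not_analyzed'
};
```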

The code of the search endpoint:

    app.post('/api/search', function(req, res, next) {
        Item.search({
            query_string: {
                query: req.body.keyword
            }
        }, {hydrate: true}, function(err, results) {
            // results here
            res.send(results);
        });
    });
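For reference, query_string parses search operators and searches across fields by default; a field-scoped match query is a simpler shape for this kind of lookup. This is a hedged sketch of an alternative request body, not part of the original code:

```javascript
// Hypothetical helper: build a match query scoped to the title field,
// so only that field (and its analyzer chain) is consulted.
function buildTitleQuery(keyword) {
    return {
        match: {
            title: keyword
        }
    };
}

// e.g. Item.search(buildTitleQuery(req.body.keyword), {hydrate: true}, cb);
console.log(buildTitleQuery('sam'));
```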
1 Answer

You need to specify which analyzer to use for your title field. Right now you're simply indexing each field for searching, but you're not applying the edge_nGram_analyzer to the title field. You can achieve this using the mongoosastic es_analyzer property, as shown below:

var ItemSchema = new mongoose.Schema({
    title: {type: String, es_indexed:true, es_analyzer: 'edge_nGram_analyzer'},
    price: Number,
    description: {type: String, es_indexed:true},
    picture: String,
    vendor: {type: String, es_indexed:true},
    vendorId: {type:String, es_indexed:true}
});

There is another issue in your code, though: the edge_nGram_analyzer is not specified correctly. You should remove the content part and have it like this:

"analyzer":{
    "edge_nGram_analyzer": {
        "type":"custom",
        "tokenizer":"edge_ngram_tokenizer",
        "filter": [
           "lowercase",
           "asciifolding",
           "edgeNGram_filter"
        ]
     },
     ...
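Beyond the mapping fix, a common refinement with edge n-grams (an addition here, not part of the original answer) is to n-gram only at index time and analyze the search input with a plain analyzer, so that typing sam matches the stored sam prefix without being n-grammed again. mongoosastic exposes this through an es_search_analyzer field option (verify the option name against your mongoosastic version):

```javascript
// Hypothetical title field: index-time edge n-grams, search-time
// whitespace analysis, reusing the whitespace_analyzer already defined
// in the mapping. Option names assume mongoosastic conventions.
var titleField = {
    type: String,
    es_indexed: true,
    es_analyzer: 'edge_nGram_analyzer',        // applied when indexing
    es_search_analyzer: 'whitespace_analyzer'  // applied to queries
};
```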