Elasticsearch full string match not working


I am using the elastic-builder npm package.

Using esb.termQuery('Email', 'test')

Mapping:

"CompanyName": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}

Database fields:

"Email": "test@mycompany.com",
"CompanyName": "my company"

Query JSON: { term: { CompanyName: 'my' } } or { term: { Email: 'test' } }

Result:

"Email": "test@mycompany.com",
"CompanyName": "my company"

Expectation: no result. I need a full-string (exact) match, but the match here is acting like a 'like' query or queryStringQuery.

I have 3 filters: prefix, exact match, and include.

Best answer:

The standard analyzer is the default analyzer, used when none is specified. It provides grammar-based tokenization.

In your example, you are probably not specifying an analyzer explicitly in the index mapping, so text fields are analyzed with the standard analyzer by default. Refer to this SO answer for a detailed explanation.

With no analyzer defined, the following tokens are generated:

POST /_analyze

{
  "analyzer" : "standard",
  "text" : "test@mycompany.com"
}

Tokens are:

{
  "tokens": [
    {
      "token": "test",
      "start_offset": 0,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "mycompany.com",
      "start_offset": 5,
      "end_offset": 18,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}
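
To make the link between these tokens and the unexpected hits concrete, here is a rough sketch in plain JavaScript. The functions `roughStandardAnalyze` and `termMatches` are made up for illustration and are only a crude stand-in for what Elasticsearch does; the email address is the one implied by the token offsets above. The point is that a term query compares its input against individual indexed tokens, not against the whole original string.

```javascript
// Simplified stand-in for the standard analyzer. This is NOT the real
// Unicode segmentation algorithm; it only mimics the output shown above
// by splitting on whitespace and "@" and lowercasing each piece.
function roughStandardAnalyze(text) {
  return text
    .split(/[\s@]+/)
    .filter((piece) => piece.length > 0)
    .map((piece) => piece.toLowerCase());
}

// A term query does not analyze its input: it succeeds when one indexed
// token is exactly equal to the given term.
function termMatches(indexedTokens, term) {
  return indexedTokens.includes(term);
}

const emailTokens = roughStandardAnalyze("test@mycompany.com");
const companyTokens = roughStandardAnalyze("my company");

console.log(emailTokens);                               // [ 'test', 'mycompany.com' ]
console.log(termMatches(emailTokens, "test"));          // true
console.log(termMatches(companyTokens, "my"));          // true
console.log(termMatches(companyTokens, "my company"));  // false
```

So on an analyzed text field, the term "my" finds the document, while the full string "my company" does not, because no single token equals it.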

If you want a full-text search, you can define a custom analyzer with a lowercase filter. The lowercase filter ensures that all letters are converted to lowercase both before the document is indexed and before it is searched.

The normalizer property of keyword fields is similar to analyzer except that it guarantees that the analysis chain produces a single token.

The uax_url_email tokenizer is like the standard tokenizer except that it recognises URLs and email addresses as single tokens.

Index Mapping:

{
  "settings": {
    "analysis": {
      "normalizer": {
        "my_normalizer": {
          "type": "custom",
          "filter": [
            "lowercase"
          ]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "uax_url_email"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "CompanyName": {
        "type": "keyword",
        "normalizer": "my_normalizer"
      },
      "Email": {
        "type": "text",
        "analyzer": "my_analyzer"
      }
    }
  }
}
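
As a rough illustration of what this mapping changes, the sketch below mimics the two fields in plain JavaScript. The helper names (`normalizeKeyword`, `roughUaxUrlEmailTokenize`, `hasExactToken`) are hypothetical, and the tokenizer stand-in only splits on whitespace, which is far simpler than the real uax_url_email tokenizer. It is meant to show that each value now ends up as a single token, so partial inputs no longer match.

```javascript
// keyword field with a lowercase normalizer: the whole value stays one
// token, only lowercased.
function normalizeKeyword(value) {
  return [value.toLowerCase()];
}

// Very reduced stand-in for the uax_url_email tokenizer: splitting on
// whitespace is enough to keep a lone email address as one token.
// my_analyzer above has no lowercase filter, so no lowercasing is done here.
function roughUaxUrlEmailTokenize(text) {
  return text.split(/\s+/).filter((piece) => piece.length > 0);
}

// Exact comparison against the indexed tokens.
function hasExactToken(indexedTokens, term) {
  return indexedTokens.includes(term);
}

const companyIndexed = normalizeKeyword("my company");
const emailIndexed = roughUaxUrlEmailTokenize("test@mycompany.com");

// The query input for the keyword field is normalized the same way.
const companyInput = normalizeKeyword("My Company")[0];

console.log(hasExactToken(companyIndexed, "my"));               // false
console.log(hasExactToken(companyIndexed, companyInput));       // true
console.log(hasExactToken(emailIndexed, "test"));               // false
console.log(hasExactToken(emailIndexed, "test@mycompany.com")); // true
```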

Index Data:

{
  "Email": "test@mycompany.com",
  "CompanyName": "my company"
}

Search Query:

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "CompanyName": "My Company"
          }
        },
        {
          "match": {
            "Email": "test"
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}

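Only the CompanyName clause matches here. With the uax_url_email tokenizer the whole email address is indexed as one token, so the Email clause for "test" does not match on its own.

Search Result:

"hits": [
  {
    "_index": "stof_64220291",
    "_type": "_doc",
    "_id": "1",
    "_score": 0.2876821,
    "_source": {
      "Email": "test@mycompany.com",
      "CompanyName": "my company"
    }
  }
]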
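
Since the question mentions three filters (prefix, exact match, include), here is one possible way to build the three query bodies as plain JavaScript objects. This is a sketch under assumptions: it targets the CompanyName.keyword sub-field from the mapping in the question, and the helper names `exactMatch`, `startsWith`, and `containsText` are made up for illustration. elastic-builder's termQuery, prefixQuery, and wildcardQuery should produce equivalent bodies.

```javascript
// Sub-field taken from the mapping in the question. It has no normalizer,
// so all three filters below are case-sensitive on this field.
const FIELD = "CompanyName.keyword";

// Exact match: the whole stored value must equal the input.
function exactMatch(field, value) {
  return { term: { [field]: value } };
}

// Prefix: the stored value must start with the input.
function startsWith(field, value) {
  return { prefix: { [field]: value } };
}

// Include: the stored value must contain the input somewhere.
// The wildcard characters * and ? inside the input are not escaped here,
// and a leading wildcard can be slow on large indices.
function containsText(field, value) {
  return { wildcard: { [field]: { value: `*${value}*` } } };
}

const body = { query: exactMatch(FIELD, "my company") };
console.log(JSON.stringify(body));
// {"query":{"term":{"CompanyName.keyword":"my company"}}}
```

With this body, the input "my" returns nothing for the document above, because the keyword sub-field stores "my company" as a single unanalyzed value.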