Prioritizing original terms over synonyms in Elasticsearch

In my analyzer pipeline I have a synonym filter.

Let's say that I have the following synonyms

beverage, drink

Now, if a user searches for 'beverage', they get documents that contain 'beverage' or 'drink', with no preference between the two.

The thing is that I want to give a higher score to documents that contain the original search term ('beverage' in this case) and a lower score to its synonyms ('drink' in this case).

What's the best and cheapest way to do that?

There are 2 best solutions below

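You can add a subfield to the mapping that is searched without the synonyms.

In the query, match on both fields. With "most_fields" the scores from the two fields are combined, so a document that contains the original term matches both fields and ranks above a document that matches only through a synonym.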

Index

PUT synonym_index
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "synonym": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "synonym"
            ]
          }
        },
        "filter": {
          "synonym": {
            "type": "synonym",
            "synonyms": [
              "beverage, drink"
            ]
          }
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "search_analyzer": "synonym",
        "fields": {
          "no_synonyms": {
            "type": "text"
          }
        }
      }
    }
  }
}
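
As a quick sanity check (a sketch, assuming the index above was created as shown), the _analyze API can show what the "synonym" analyzer does to a query term. For "beverage" the response should list both "beverage" and "drink" as tokens, which is why a search on "name" alone cannot tell the two apart.

```console
GET synonym_index/_analyze
{
  "analyzer": "synonym",
  "text": "beverage"
}
```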

Documents

POST synonym_index/_doc/1
{
  "name": "beverage"
}

POST synonym_index/_doc/2
{
  "name": "drink"
}

Query

GET synonym_index/_search
{
  "query": {
    "multi_match": {
      "query": "beverage",
      "type": "most_fields",
      "fields": [
        "name",
        "name.no_synonyms"
      ]
    }
  }
}
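
To see how each field contributes to the final score, the same request can be run with "explain" enabled. This is only a debugging sketch: the exact numbers depend on the index contents and the Elasticsearch version, but the "beverage" document should show a contribution from both "name" and "name.no_synonyms", while the "drink" document should show one from "name" only.

```console
GET synonym_index/_search
{
  "explain": true,
  "query": {
    "multi_match": {
      "query": "beverage",
      "type": "most_fields",
      "fields": [
        "name",
        "name.no_synonyms"
      ]
    }
  }
}
```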

I think this will work if the query gives priority to the field "name.no_synonyms". Use a multi_match query that searches both "name" and "name.no_synonyms", with a higher boost on "name.no_synonyms".

Example query

{
  "query": {
    "multi_match": {
      "query": "drink",
      "fields": [
        "name.no_synonyms^10",
        "name"
      ]
    }
  }
}
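
The two ideas can also be combined: sum the scores of both fields and additionally boost the field without synonyms. This is a minimal sketch, assuming the "synonym_index" mapping shown earlier; the boost value of 2 is an arbitrary illustration and would need tuning against real data.

```console
GET synonym_index/_search
{
  "query": {
    "multi_match": {
      "query": "beverage",
      "type": "most_fields",
      "fields": [
        "name",
        "name.no_synonyms^2"
      ]
    }
  }
}
```

Because "most_fields" adds the per-field scores, the boost controls how far exact matches are pushed above matches that come only from a synonym.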