Elasticsearch query with should returns 10,000 results but nothing matches

I have an index with about 60 GB of data, and I want to run a query that retrieves one specific product by its reference number.

Here is my query:

GET myindex/_search
{
  "_source": [
    "product.ref",
    "product.urls.*",
    "product.i18ns.*.title",
    "product_sale_elements.quantity",
    "product_sale_elements.prices.*.price",
    "product_sale_elements.listen_price.*",
    "product.images.image_url",
    "product.image_count",
    "product.images.visible",
    "product.images.position"
  ],
  "size": "6",
  "from": "0",
  "query": {
    "function_score": {
      "functions": [
        {
          "field_value_factor": {
            "field": "product.sales_count",
            "missing": 0,
            "modifier": "log1p"
          }
        },
        {
          "field_value_factor": {
            "field": "product.image_count",
            "missing": 0,
            "modifier": "log1p"
          }
        },
        {
          "field_value_factor": {
            "field": "featureCount",
            "missing": 0,
            "modifier": "log1p"
          }
        }
      ],
      "query": {
        "bool": {
          "filter": [
            {
              "term": {
                "product.is_visible": true
              }
            }
          ],
          "should": [
            {
              "query_string": {
                "default_field": "product.ref",
                "query": "13141000",
                "boost": 2
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "by_categories": {
      "terms": {
        "field": "categories.i18ns.de_DE.title.raw",
        "size": 100
      }
    }
  }
}

My question is: why does this query return 10k results when I only want the single product with that reference number?

If I do:

GET my-index/_search
{
  "query": {
    "match": {
      "product.ref": "13141000"
    }
  }
}

it matches correctly. How is should different from a normal match query?

There are 2 best solutions below

If a bool query has must or filter clauses, as yours does, then a document that matches them does not also have to match your should clause. The should clause is considered optional in that case (minimum_should_match defaults to 0) and only influences the score.

You can either move the query_string from your should clause into filter, or set minimum_should_match to 1 like this:

...
"should": [
  {
    "query_string": {
      "default_field": "product.ref",
      "query": "13141000",
      "boost": 2
    }
  }
],
"minimum_should_match" : 1,
...
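
For the other option, moving the query_string into filter, the bool part of the query might look roughly like the sketch below. It reuses the field names from the question and leaves out the surrounding function_score, _source and aggs parts, which would stay as they are. Clauses in filter do not contribute to the score, so the boost is dropped here.

```json
{
  "bool": {
    "filter": [
      { "term": { "product.is_visible": true } },
      {
        "query_string": {
          "default_field": "product.ref",
          "query": "13141000"
        }
      }
    ]
  }
}
```

With this variant, only visible documents whose product.ref matches the query string are returned.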

Must - The condition must match, and it contributes to the score.

Should - The condition is optional when the bool query also has must or filter clauses and minimum_should_match is not declared explicitly. If it matches, it only improves the score (in a non-filter context).

Filter - The condition must match, like must, but it does not contribute to the score.

You can put this clause inside a new must clause:

{
  "query_string": {
    "default_field": "product.ref",
    "query": "13141000",
    "boost": 2
  }
}

Boost will not affect scoring if you put the above inside the filter clause.
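
Put together, the bool part of the query could then look something like this. It is only a sketch built from the fields in the question; the function_score wrapper, _source and aggs are omitted and would remain unchanged.

```json
{
  "bool": {
    "filter": [
      { "term": { "product.is_visible": true } }
    ],
    "must": [
      {
        "query_string": {
          "default_field": "product.ref",
          "query": "13141000",
          "boost": 2
        }
      }
    ]
  }
}
```

Here the reference number is required for a document to match and still takes part in scoring, while the visibility check stays a plain filter.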

The Elasticsearch documentation on bool queries has more detail.