Elasticsearch query does not return an exact match for an array


I have a question about querying an array in Elasticsearch. In my case, the custom attributes are an array of objects, each containing an inner_name and a value. The type of value is mixed (it could be a string, number, array, date, etc.), and one of the types is multi-checkbox, which takes an array as input. The mapping of the custom attributes is as follows:

"attributes" : {
              "properties" : {
                "_id" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "inner_name" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "value" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },

I use mongoosastic to index my MongoDB data into Elasticsearch, so the custom attributes are structured like this:

[
  {
    "customer_name" : "customerX",
    "custom_attributes" : [
      {
        "group" : "xyz",
        "attributes" : [
          {
            "inner_name" : "attr1",
            "value" : 123
          },
          {
            "inner_name" : "attr2",
            "value" : [
              "Val1",
              "Val2",
              "Val3",
              "Val4"
            ]
          }
        ]
      }
    ]
  },
  {
    "customer_name" : "customerY",
    "custom_attributes" : [
      {
        "group" : "xyz",
        "attributes" : [
          {
            "inner_name" : "attr2",
            "value" : [
              "Val1",
              "Val2"
            ]
          }
        ]
      }
    ]
  }
]

I want a query that matches only documents whose array contains all of the given values. The problem with the query below is that it returns a document whenever the array contains any of the values:

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "custom_attributes.attributes.inner_name": "attr2"
          }
        },
        {
          "terms": {
            "custom_attributes.attributes.value": [
              "val1",
              "val2",
              "val3",
              "val4"
            ]
          }
        }
      ]
    }
  }
}

For example, it returns both documents, when it should return only the first one. What is wrong with my query? Is there another way to write it?
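
To make the symptom concrete, here is a minimal Python sketch (plain Python, not Elasticsearch itself) that mimics "any value matches" versus "all values match" on the attr2 values of the two sample documents. The lowercased token sets and the helper names `match_any` and `match_all` are assumptions made for illustration only.

```python
# Lowercased tokens for the attr2 "value" arrays of the two sample documents.
# This is a simplification for illustration, not real Elasticsearch output.
docs = {
    "customerX": {"val1", "val2", "val3", "val4"},
    "customerY": {"val1", "val2"},
}
wanted = ["val1", "val2", "val3", "val4"]


def match_any(tokens, values):
    """Mimics the observed behaviour: at least one value is present."""
    return any(v in tokens for v in values)


def match_all(tokens, values):
    """The behaviour the question asks for: every value is present."""
    return all(v in tokens for v in values)


any_hits = [name for name, tokens in docs.items() if match_any(tokens, wanted)]
all_hits = [name for name, tokens in docs.items() if match_all(tokens, wanted)]

print(any_hits)  # ['customerX', 'customerY']
print(all_hits)  # ['customerX']
```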

Answered by Alkis Kalogeris

The Elasticsearch terms query matches a document if any of the given values is present, so it behaves like OR rather than the AND you want. There are two solutions:

  1. Use multiple term queries inside your bool must query, which provides the needed AND behaviour:
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "custom_attributes.attributes.inner_name": "attr2"
          }
        },
        {
          "term": {
            "custom_attributes.attributes.value": "val1"
          }
        },
        {
          "term": {
            "custom_attributes.attributes.value": "val2"
          }
        },
        {
          "term": {
            "custom_attributes.attributes.value": "val3"
          }
        },
        {
          "term": {
            "custom_attributes.attributes.value": "val4"
          }
        }
      ]
    }
  }
}
  2. Use the match query with the operator set to and and the whitespace analyzer. This will NOT work if your terms contain whitespace:
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "custom_attributes.attributes.inner_name": "attr2"
          }
        },
        {
          "match": {
            "custom_attributes.attributes.value": {
              "query": "val1 val2 val3 val4",
              "operator": "and",
              "analyzer": "whitespace"
            }
          }
        }
      ]
    }
  }
}
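
When the list of required values is not fixed, both query bodies above can be generated instead of written by hand. The following is a minimal Python sketch that only builds the request body as a dictionary; it does not talk to a cluster. The function names `build_term_query` and `build_match_query` are hypothetical, and lowercasing the values is an assumption based on the field being mapped as text with the default analyzer, as in the question.

```python
import json

FIELD_NAME = "custom_attributes.attributes.inner_name"
FIELD_VALUE = "custom_attributes.attributes.value"


def build_term_query(inner_name, values):
    """Solution 1: one `term` clause per required value inside bool/must."""
    must = [{"match": {FIELD_NAME: inner_name}}]
    for value in values:
        # Assumption: indexed tokens are lowercase (text field, default
        # analyzer), and `term` does not analyze its input, so lowercase here.
        must.append({"term": {FIELD_VALUE: str(value).lower()}})
    return {"query": {"bool": {"must": must}}}


def build_match_query(inner_name, values):
    """Solution 2: a single `match` with operator "and" and whitespace analyzer."""
    terms = [str(v).lower() for v in values]
    # The whitespace analyzer splits on whitespace, so a value that itself
    # contains whitespace would be broken into several tokens.
    if any(any(ch.isspace() for ch in t) for t in terms):
        raise ValueError("values must not contain whitespace for this approach")
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {FIELD_NAME: inner_name}},
                    {
                        "match": {
                            FIELD_VALUE: {
                                "query": " ".join(terms),
                                "operator": "and",
                                "analyzer": "whitespace",
                            }
                        }
                    },
                ]
            }
        }
    }


body = build_term_query("attr2", ["Val1", "Val2", "Val3", "Val4"])
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized with `json.dumps` and sent as the search request body with whichever client is already in use.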