How to significantly reduce the execution time of an Elasticsearch query that has an aggregation with a script

I have around 17 million documents (the number is gradually increasing) in an Elasticsearch index. The mapping of the labels property, which is used for aggregation, is:

{
   "mappings":{
      "labels":{
         "properties":{
            "label":{
               "type":"text",
               "fields":{
                  "raw":{
                     "type":"keyword"
                  }
               }
            },
            "count":{
               "type":"float"
            }
         }
      }
   }
}

Each document has more than 500 items in the labels attribute.

When I aggregate the documents with this query:

{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "type": "XYZ"
          }
        }
      ]
    }
  },
  "aggs": {
    "date": {
      "range": {
        "field": "date",
        "ranges": [
          {
            "from": 1577816100000,
            "to": 1609438500000
          },
          {
            "from": 1546280100000,
            "to": 1577816100000
          }
        ]
      },
      "aggs": {
        "field1": {
          "terms": {
            "field": "field1",
            "size": 100
          },
          "aggs": {
            "agg_label": {
              "terms": {
                "field": "labels.label.raw",
                "size": 250,
                "min_doc_count": 5
              },
              "aggs": {
                "sum1": {
                  "sum": {
                    "script": "_score"
                  }
                },
                "sum2": {
                  "sum": {
                    "field": "labels.count"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
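
For readers who want to see what the two date ranges cover, here is a small sketch that decodes the epoch-millisecond bounds from the query into UTC timestamps. The helper name `millis_to_utc` is made up for illustration. Note that in a range aggregation each `from` is inclusive and each `to` is exclusive, so the two ranges do not overlap at the shared boundary.

```python
from datetime import datetime, timezone

def millis_to_utc(ms):
    """Convert epoch milliseconds (as used in the "ranges" array) to a UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

# The (from, to) pairs copied from the range aggregation in the query.
ranges = [
    (1577816100000, 1609438500000),
    (1546280100000, 1577816100000),
]

for start, end in ranges:
    print(millis_to_utc(start).isoformat(), "to", millis_to_utc(end).isoformat())
```

Each range spans one year, starting at 18:15 UTC on 31 December.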

It takes around 20 seconds, and the more values the labels field holds, the longer the execution takes.

I know scripts in queries are expensive. Is there any way I can significantly reduce the execution time?
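
To give a rough sense of the size of the request, the sketch below multiplies out the bucket limits taken from the query. It is only an upper bound on the buckets the response can contain; the work done on each shard may be larger, since shards evaluate more candidate terms than the final `size`. The function and variable names are illustrative, not part of any Elasticsearch API.

```python
# Sizes copied from the query in the question.
date_ranges = 2         # entries in the "ranges" array
field1_size = 100       # "size" of the field1 terms aggregation
label_size = 250        # "size" of the agg_label terms aggregation
metrics_per_bucket = 2  # sum1 (script on _score) and sum2 (labels.count)

def max_leaf_buckets(ranges, outer_size, inner_size):
    """Upper bound on the innermost (agg_label) buckets in the response."""
    return ranges * outer_size * inner_size

leaf_buckets = max_leaf_buckets(date_ranges, field1_size, label_size)
metric_values = leaf_buckets * metrics_per_bucket

print(leaf_buckets)   # 50000
print(metric_values)  # 100000
```

So a single request can ask for up to 50,000 label buckets, each carrying two sums, one of which runs a script.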
