Asking for significant terms but Elasticsearch returns nothing


I am having an issue with Elasticsearch (version 2.0). I am trying to get the significant terms from a set of documents, but the aggregation always returns nothing.

Here is the schema of my index:

{
    "documents" : {
      "warmers" : {},
      "mappings" : {
         "document" : {
            "properties" : {
               "text" : {
                  "index" : "not_analyzed",
                  "type" : "string"
               },
               "entities": {
                   "properties": {
                       "text": {
                           "index": "not_analyzed",
                           "type": "string"
                       }
                   }
               }
            }
         }
      },
      "settings" : {
         "index" : {
            "creation_date" : "1447410095617",
            "uuid" : "h2m2J9sJQaCpxvGDI591zg",
            "number_of_replicas" : "1",
            "version" : {
               "created" : "2000099"
            },
            "number_of_shards" : "5"
         }
      },
      "aliases" : {}
   }
}

So it's a simple index that contains the field text, which is not analyzed, and an array entities that will contain dictionaries with a single field, text, which is not analyzed either.

What I want to do is match some of the documents and extract the most significant terms from the associated entities. For that, I use a wildcard query and then an aggregation.

Here is the request I am sending through curl:

curl -XGET 'http://localhost:9200/documents/_search' -d '{
    "query": {
        "bool": {
            "must": {"wildcard": {"text": "*test*"}}
        }
    },
    "aggregations": {
        "my_significant_terms": {
            "significant_terms": {"field": "entities.text"}
        }
    }
}'

Unfortunately, even though Elasticsearch matches some documents, the buckets of the significant terms aggregation are always empty.

I also tried analyzed instead of not_analyzed, but I got the same empty results.

So first, does it make sense to do it this way?

I am a complete beginner with Elasticsearch, so can you explain how the significant terms aggregation works?

And finally, if this approach makes sense, why isn't my query working?

EDIT: I just saw in the Elasticsearch documentation that the significant terms aggregation needs a certain amount of data to become effective, and I have only 163 documents in my index. Could that be the cause?


There are 2 best solutions below


Not sure if it will help, but try specifying this inside the significant_terms aggregation:

"min_doc_count" : 1

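To show where that setting would go, here is a minimal sketch that rebuilds the request body from the question with min_doc_count added to the significant_terms aggregation. The helper name build_search_body is hypothetical; the field names and the wildcard pattern are taken from the question. The documented default for min_doc_count on this aggregation is 3, so with only a few matching documents a lower value may surface buckets, although the resulting terms can be noisy.

```python
import json


def build_search_body(pattern="*test*", min_doc_count=1):
    """Build the search body from the question, adding min_doc_count.

    build_search_body is a hypothetical helper, not part of any
    Elasticsearch client library.
    """
    return {
        # Same wildcard query as in the question.
        "query": {"bool": {"must": {"wildcard": {"text": pattern}}}},
        "aggregations": {
            "my_significant_terms": {
                "significant_terms": {
                    "field": "entities.text",
                    # Keep terms that appear in at least this many
                    # matching documents.
                    "min_doc_count": min_doc_count,
                }
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON so it can be pasted into a curl -d payload.
    print(json.dumps(build_search_body(), indent=2))
```

The printed JSON can be used as the -d payload of the curl command from the question.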

"The significant terms aggregation needs a certain amount of data to become effective, and I have only 163 documents in my index. Could it be that?"

Using 1 shard instead of 5 will help if you have a small number of documents.
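
The number of primary shards is fixed when an index is created, so trying this means creating a new index and indexing the documents again. With 163 documents spread over 5 shards, each shard holds only about 33 documents on average, and the aggregation collects its candidate terms per shard. Below is a rough sketch of an index-creation body with a single shard and the mapping from the question. The helper name build_index_body and the index name documents_single_shard are assumptions made for illustration.

```python
import json


def build_index_body(number_of_shards=1):
    """Build a create-index body reusing the mapping from the question.

    build_index_body is a hypothetical helper, not part of any
    Elasticsearch client library.
    """
    # Elasticsearch 2.0 style field definition, as in the question.
    not_analyzed_string = {"type": "string", "index": "not_analyzed"}
    return {
        "settings": {
            "number_of_shards": number_of_shards,
            "number_of_replicas": 1,
        },
        "mappings": {
            "document": {
                "properties": {
                    "text": dict(not_analyzed_string),
                    "entities": {
                        "properties": {"text": dict(not_analyzed_string)}
                    },
                }
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical index name; the body would be sent with a PUT request, e.g.
    #   curl -XPUT 'http://localhost:9200/documents_single_shard' -d '<json>'
    print(json.dumps(build_index_body(), indent=2))
```

After the documents are indexed again into the new index, the same search request can be pointed at it to compare the buckets.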