Indexing documents with only numeric fields in Elasticsearch


I am trying to store objects in Elasticsearch that are represented only by numeric fields. In my case each object has 300 float fields and 1 id field. I have mapped the id field as not_analyzed. I am able to store the documents in ES:

 "_index": "smart_content5",
    "_type": "doc2vec",
    "_id": "AVtAGeaZjLL5cvd8z9y7",
    "_score": 1,
    "_source": {
      "feature_227": 0.0856793,
      "feature_5": -0.115823,
      "feature_119": -0.0379987,
      "feature_145": 0.17952,
      "feature_29": 0.0444945,

Now I want to run a query that has the same 300 fields but different numerical values, and find the document whose 300 fields are "most similar" to the query's fields. It is something like cosine similarity, but I am trying to do it in ES so that it is fast.
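For reference, the cosine similarity mentioned here can be computed outside Elasticsearch in a few lines. This is a minimal Python sketch, assuming each document and the query are plain dicts mapping field names such as feature_5 to floats. The helper name cosine_similarity and the query values are illustrative only and are not part of any Elasticsearch API.

```python
import math


def cosine_similarity(doc, query):
    """Cosine similarity between two dicts of field name -> float."""
    # Only fields present in both vectors contribute to the dot product.
    shared = doc.keys() & query.keys()
    dot = sum(doc[name] * query[name] for name in shared)
    norm_doc = math.sqrt(sum(v * v for v in doc.values()))
    norm_query = math.sqrt(sum(v * v for v in query.values()))
    if norm_doc == 0.0 or norm_query == 0.0:
        # Convention chosen for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_doc * norm_query)


# Three of the stored fields from the document above.
doc = {"feature_227": 0.0856793, "feature_5": -0.115823, "feature_119": -0.0379987}
# Made-up query values, for illustration only.
query = {"feature_227": 0.09, "feature_5": -0.10, "feature_119": -0.05}

similarity = cosine_similarity(doc, query)
print(similarity)
```

Doing this for every stored document is a linear scan, which is the cost the question hopes to avoid by pushing the comparison into ES.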

(1) First of all, is it even possible to do this?

(2) Second, I have explored the function_score feature of ES and tried using it, but the maximum match score it returns is 0.0.

Any comments on what I should use and what I might be doing wrong in (2)?

Best answer

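I think you still need function_score, but with one gauss decay function per field, like this (it worked for me). Each gauss function scores a document by how close that field's value is to origin, so origin should be set to your query's value for that field; the "0" used below is only an example value.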

{
  "query": {
    "function_score": {
      "query": { "match_all": {} },
      "functions": [
        {
          "gauss": {
            "feature_227": {
              "origin": "0",
              "scale": "0.5"
            }
          }
        },
        {
          "gauss": {
            "feature_5": {
              "origin": "0",
              "scale": "0.5"
            }
          }
        },
        {
          "gauss": {
            "feature_119": {
              "origin": "0",
              "scale": "0.5"
            }
          }
        },
        {
          "gauss": {
            "feature_145": {
              "origin": "0",
              "scale": "0.5"
            }
          }
        },
        {
          "gauss": {
            "feature_29": {
              "origin": "0",
              "scale": "0.5"
            }
          }
        }
      ],
      "score_mode": "sum"
    }
  }
}
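Writing 300 such clauses by hand is impractical, so the request body can be generated instead. The sketch below is one way to do that in Python, assuming the query vector is a dict of field name to float. The names build_similarity_query and gauss_score are hypothetical helpers, not Elasticsearch APIs. gauss_score only mirrors the documented Gaussian decay formula (assuming the default decay of 0.5 and an offset of 0) to show what each clause contributes to the sum. Numeric origin and scale values are used here in place of the strings above, on the assumption that both forms are accepted for numeric fields.

```python
import json
import math


def build_similarity_query(query_vector, scale=0.5):
    """Build a function_score body with one gauss decay clause per field."""
    functions = [
        # origin is the query's value, so nearby document values score highest.
        {"gauss": {field: {"origin": value, "scale": scale}}}
        for field, value in sorted(query_vector.items())
    ]
    return {
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                "functions": functions,
                "score_mode": "sum",
            }
        }
    }


def gauss_score(value, origin, scale, decay=0.5, offset=0.0):
    """Score of one gauss clause: 1.0 at origin, `decay` at distance `scale`."""
    sigma_sq = -scale ** 2 / (2.0 * math.log(decay))
    distance = max(0.0, abs(value - origin) - offset)
    return math.exp(-distance ** 2 / (2.0 * sigma_sq))


# Made-up query values, for illustration only.
query_vector = {"feature_227": 0.09, "feature_5": -0.10}
body = build_similarity_query(query_vector)
print(json.dumps(body, indent=2))
```

With score_mode set to sum, a document that matches the query exactly on every field gets 1.0 from each clause, so the best possible sum equals the number of fields. Note that this ranks by per-field closeness rather than by true cosine similarity, and that a scale of 0.5 is wide compared with feature values of around 0.1, so a smaller scale may separate documents better.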