Rails Searchkick has_many indexing and searching


I am able to search customers by id, name and lastname, and by their kids' id, name, lastname and birthdate.

Searching by id must be exact, and it is. Searching by name or lastname tolerates misspellings with an edit distance of 2, and that works too. But I also want to search by a kid's birthdate with an exact match (misspelling edit distance 0).

So far, whenever I search by birthdate, the results come back as if the misspelling distance were 2. I don't know how to search for exact dates.

Rails 5.1.0.rc1

elasticsearch-5.0.3

searchkick-2.2.0

class Customer < ActiveRecord::Base
  include Searchable

  def search_data
    attributes.merge avatar_url: avatar.url, kids: kids
  end

  has_many :kids
  ...
end

class Kid < ActiveRecord::Base
  belongs_to :customer

  def reindex_customer
    customer.reindex async: true
  end
  ...
end

module Searchable
  extend ActiveSupport::Concern

  included do
    SEARCH_RESULTS_PER_PAGE = 10

    def self.elastic_search(query, opts = { page: 1 })
      # Matches any query that contains digits, or a date in dd-mm-yyyy format
      regexp = /(\d+)|(^(0[1-9]|1\d|2\d|3[01])-(0[1-9]|1[0-2])-(19|20)\d{2}$)/
      # Edit distance for misspellings: 0 for digits and dates, 2 for other strings
      distance = query.match?(regexp) ? 0 : 2
      options = { load: false,
                  match: :word_middle,
                  misspellings: { edit_distance: distance },
                  per_page: SEARCH_RESULTS_PER_PAGE,
                  page: opts[:page] }
      search query, options
    end
  end
end
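To see what the distance check does on its own, here is a minimal sketch that runs outside Rails. The pattern is copied from the concern above; the helper name `edit_distance_for` is hypothetical and only exists for this illustration. Note that the `\d+` alternative is unanchored, so any query containing a digit gets distance 0, and that the date alternative expects dd-mm-yyyy while the indexed birthdates are stored as yyyy-mm-dd.

```ruby
# Same pattern as in Searchable.elastic_search, copied verbatim.
QUERY_REGEXP = /(\d+)|(^(0[1-9]|1\d|2\d|3[01])-(0[1-9]|1[0-2])-(19|20)\d{2}$)/

# Hypothetical helper: returns the edit distance the concern would choose.
def edit_distance_for(query)
  query.match?(QUERY_REGEXP) ? 0 : 2
end

p edit_distance_for("28388")      # => 0 (digits)
p edit_distance_for("22-03-2013") # => 0 (dd-mm-yyyy date)
p edit_distance_for("2013-03-22") # => 0 (matches only because it contains digits)
p edit_distance_for("Linda")      # => 2 (no digits)
p edit_distance_for("Linda 2")    # => 0 (a single digit is enough)
```

So the distance passed to Searchkick is already 0 for date-like input.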

My index contains customer data together with their kids' data. Kids are nested under their parent customer. How can I force exact matching for dates?

For this query:

curl http://localhost:9200/customers_development/_search?pretty -d '{"query":{"dis_max":{"queries":[{"match":{"_all":{"query":"28388","boost":10,"operator":"and","analyzer":"searchkick_search"}}},{"match":{"_all":{"query":"28388","boost":10,"operator":"and","analyzer":"searchkick_search2"}}},{"match":{"_all":{"query":"28388","boost":1,"operator":"and","analyzer":"searchkick_search","fuzziness":0,"prefix_length":0,"max_expansions":3,"fuzzy_transpositions":true}}},{"match":{"_all":{"query":"28388","boost":1,"operator":"and","analyzer":"searchkick_search2","fuzziness":0,"prefix_length":0,"max_expansions":3,"fuzzy_transpositions":true}}}]}},"size":10,"from":0,"timeout":"11s"}'

This is the response, which shows how an indexed document looks:

{
  "took": 11,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 97.29381,
    "hits": [
      {
        "_index": "customers_development_20170913145033808",
        "_type": "customer",
        "_id": "28388",
        "_score": 97.29381,
        "_source": {
          "id": 28388,
          "created_at": "2017-07-10T19:49:43.856Z",
          "updated_at": "2017-09-13T03:01:51.727Z",
          "name": "Linda",
          "lastname": "Schott",
          "email": "[email protected]",
          "avatar": null,
          "phone": null,
          "mobile": null,
          "erster_kontakt": null,
          "memo": null,
          "brief_title": null,
          "newsletter": null,
          "avatar_url": "/no_customer.png",
          "kids": [
            {
              "id": 34229,
              "name": "Jakob",
              "lastname": "Schott",
              "birthdate": "2013-03-22",
              "age": "4,5",
              "avatar": {
                "url": "/avatars/kid/34229/Jellyfish.png",
                "thumb": {
                  "url": "/avatars/kid/34229/thumb_Jellyfish.png"
                }
              },
              "created_at": "2017-07-10T19:50:16.058Z",
              "updated_at": "2017-09-13T03:02:52.962Z",
              "customer_id": 28388,
              "member": null,
              "year_certified": null,
              "zahlart": null,
              "tn_merge_markiert": null,
              "family": null,
              "medal": "black",
              "score": 30,
              "current_level": "swimmys"
            },
            {
              "id": 34228,
              "name": "Lilith",
              "lastname": "Schott",
              "birthdate": "2013-03-22",
              "age": "4,5",
              "avatar": {
                "url": "/avatars/kid/34228/Penguins.png",
                "thumb": {
                  "url": "/avatars/kid/34228/thumb_Penguins.png"
                }
              },
              "created_at": "2017-07-10T19:50:16.058Z",
              "updated_at": "2017-09-13T03:02:52.962Z",
              "customer_id": 28388,
              "member": null,
              "year_certified": null,
              "zahlart": null,
              "tn_merge_markiert": null,
              "family": null,
              "medal": "green",
              "score": 17,
              "current_level": "beginner"
            },
            {
              "id": 27718,
              "name": "Johanna",
              "lastname": "Plischke",
              "birthdate": "2010-12-29",
              "age": "6,8",
              "avatar": {
                "url": "/avatars/kid/27718/Koala.png",
                "thumb": {
                  "url": "/avatars/kid/27718/thumb_Koala.png"
                }
              },
              "created_at": "2017-07-10T19:50:16.034Z",
              "updated_at": "2017-09-13T04:01:15.261Z",
              "customer_id": 28388,
              "member": null,
              "year_certified": null,
              "zahlart": null,
              "tn_merge_markiert": null,
              "family": null,
              "medal": "red",
              "score": 27,
              "current_level": ""
            }
          ]
        }
      }
    ]
  }
}

There is 1 solution below


Let's analyze one part of the query:

"match":{
    "_all":{
        "query":"28388",
        "boost":1,
        "operator":"and",
        "analyzer":"searchkick_search",
        "fuzziness":0,
        "prefix_length":0,
        "max_expansions":3,
        "fuzzy_transpositions":true
    }
}

_all

You said that kids is a nested field, but the query only searches _all, so the first thing to establish is whether the kids fields are included in _all.

As the documentation says of the include_in_all parameter for nested fields:

Sets the default include_in_all value for all the properties within the nested object. Nested documents do not have their own _all field. Instead, values are added to the _all field of the main “root” document.

So the first question is whether the nested type in your index sets include_in_all to false, which would prevent the nested fields from being searched through _all.

Nested Query

Alternatively, you can use a nested query to search the nested objects (this requires kids to be mapped with the nested type):

GET /_search
{
    "query": {
        "nested" : {
            "path" : "kids",
            "score_mode" : "avg",
            "query" : {
                "query_string": {
                  "fields": ["kids.birthdate"],
                  "query": "xxx"
                } 
            }
        }
    }
}
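Since the question is about Rails, the nested query above could be built as a plain Ruby hash. This is only a sketch: the helper name `kids_birthdate_query` is hypothetical, and passing the hash to Searchkick through its raw `body:` option is an assumption that should be checked against the Searchkick version in use.

```ruby
require "json"

# Hypothetical helper: builds the nested query shown above as a Ruby hash.
def kids_birthdate_query(birthdate)
  {
    query: {
      nested: {
        path: "kids",
        score_mode: "avg",
        query: {
          query_string: {
            fields: ["kids.birthdate"],
            query: birthdate
          }
        }
      }
    }
  }
end

body = kids_birthdate_query("2013-03-22")
puts JSON.pretty_generate(body)

# Assumption: with Searchkick the hash might be sent as a raw request body:
#   Customer.search(body: body)
```

The date is given as yyyy-mm-dd here because that is how birthdate appears in the indexed document.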

Fuzzy

When it comes to misspellings, Elasticsearch offers the fuzzy query:

GET /_search
{
    "query": {
        "fuzzy" : {
            "name" : {
                "value" : "xxx",
                "boost" : 1.0,
                "fuzziness" : 2,
                "prefix_length" : 0,
                "max_expansions" : 100
            }
        }
    }
}

Combine Query

Finally, we can combine them using a bool query:

POST _search
{
    "query": {
        "bool" : {
            "must" : [
                {
                    "nested" : {
                        "path" : "kids",
                        "query" : {
                            "query_string": {
                                "fields": ["kids.birthdate"],
                                "query": "xxx"
                            }
                        }
                    }
                },
                {
                    "fuzzy" : {
                        "name" : {
                            "value" : "xxx",
                            "boost" : 1.0,
                            "fuzziness" : 2,
                            "prefix_length" : 0,
                            "max_expansions" : 100
                        }
                    }
                }
            ]
        }
    }
}
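For a Rails application, the combined bool query above might be assembled from small Ruby helpers. This is a sketch under assumptions: the helper names are hypothetical, and the sample values ("2013-03-22", "linda") are taken from the indexed document in the question purely for illustration.

```ruby
require "json"

# Hypothetical helper: the nested clause, searching kids.birthdate without fuzziness.
def nested_birthdate_clause(birthdate)
  {
    nested: {
      path: "kids",
      query: { query_string: { fields: ["kids.birthdate"], query: birthdate } }
    }
  }
end

# Hypothetical helper: the fuzzy clause on the customer name.
# fuzzy is a term-level query, so the value is not analyzed;
# lowercase input is assumed here.
def fuzzy_name_clause(name, fuzziness: 2)
  {
    fuzzy: {
      name: {
        value: name,
        boost: 1.0,
        fuzziness: fuzziness,
        prefix_length: 0,
        max_expansions: 100
      }
    }
  }
end

# Both clauses go into bool/must, so a document has to satisfy both.
def combined_query(birthdate:, name:)
  { query: { bool: { must: [nested_birthdate_clause(birthdate), fuzzy_name_clause(name)] } } }
end

combined = combined_query(birthdate: "2013-03-22", name: "linda")
puts JSON.pretty_generate(combined)
```

Keeping each clause in its own method makes it straightforward to drop the fuzzy clause when the input is a date, so only the exact birthdate lookup is sent.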

I am not familiar with Ruby, so that is all I can help with. I hope it is helpful.