Elasticsearch doesn't apply the NOT filter

I've been knocking my head against a wall with Elasticsearch today, trying to fix a failing test case.

I am using Rails 3.2.14, Ruby 1.9.3, the Tire gem and Elasticsearch 0.90.2.

The objective is to have the query return matching results EXCLUDING the item where vid == "ABC123xyz".

The Ruby code in the Video model looks like this:

def related_videos(count)
  Video.search load: true do
    size(count)
    filter :term, :category_id => self.category_id
    filter :term, :live => true
    filter :term, :public => true
    filter :not, {:term => {:vid => self.vid}}
    query do
      boolean do
        should { text(:_all, self.title, boost: 2) }
        should { text(:_all, self.description) }
        should { terms(:tags, self.tag_list, minimum_match: 1) }
      end
    end
  end
end

The resulting search query generated by Tire looks like this:

{
  "query":{
    "bool":{
      "should":[
        {
          "text":{
            "_all":{
              "query":"Top Gun","boost":2
            }
          }
        },
        {
          "text":{
            "_all":{
              "query":"The macho students of an elite US Flying school for advanced fighter pilots compete to be best in the class, and one romances the teacher."
            }
          }
        },
        {
          "terms":{
            "tags":["top-gun","80s"],
            "minimum_match":1
          }
        }
      ]
    }
  },
  "filter":{
    "and":[
      {
        "term":{
          "category_id":1
        }
      },
      {
        "term":{
          "live":true
        }
      },
      {
        "term":{
          "public":true
        }
      },
      {
        "not":{
          "term":{
            "vid":"ABC123xyz"
          }
        }
      }
    ]
  },
  "size":10
}

The resulting JSON from Elasticsearch still contains the video that should have been excluded:

{
  "took": 7,
  "timed_out": false,
  "_shards":{
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total":1,
    "max_score":0.2667512,
    "hits":[
      {
        "_index":"test_videos",
        "_type":"video",
        "_id":"8",
        "_score":0.2667512,
        "_source":{
          "vid":"ABC123xyz",
          "title":"Top Gun",
          "description":"The macho students of an elite US Flying school for advanced fighter pilots compete to be best in the class, and one romances the teacher.",
          "tags":["top-gun","80s"],
          "category_id":1,
          "live":true,
          "public":true,
          "featured":false,
          "last_video_view_count":0,
          "boost_factor":0.583013698630137,
          "created_at":"2013-08-28T14:24:47Z"
        }
      }
    ]
  }
}

Could somebody help? The docs for Elasticsearch are sparse around this topic and I'm running out of ideas.

Thanks

There is 1 answer below

A top-level filter, used the way you are using it, is not part of the query itself. It is applied on its own, outside the query clauses, and exists mainly so that things like facet counts can be computed independently of it. There's a fuller description in the Elasticsearch documentation for filter.

You need a filtered query instead, which is slightly different: the filters become part of the query and are applied to the results of your query clauses:

Video.search load: true do
  query do
    filtered do
      # The scoring clauses go in their own nested query block
      query do
        boolean do
          should { text(:_all, self.title, boost: 2) }
          should { text(:_all, self.description) }
          should { terms(:tags, self.tag_list, minimum_match: 1) }
        end
      end
      # Filters declared here become part of the filtered query
      filter :term, :category_id => self.category_id
      filter :term, :live => true
      filter :term, :public => true
      filter :not, {:term => {:vid => self.vid}}
    end
  end
end
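
To check what Tire sends, it can help to compare its output with the request body you expect. The following is only a sketch of that expected shape, built from plain Ruby hashes rather than through Tire. The helper name `filtered_query_body` and the `video` hash are hypothetical stand-ins for the model, and the description string is a shortened placeholder. The point to look for is that `filter` sits inside `query.filtered`, next to the nested `query`, and that there is no top-level `filter` key.

```ruby
require 'json'

# Hypothetical helper, not part of Tire: builds the search body by hand
# so the nesting of query and filter is visible.
# `video` is a plain Hash standing in for the Video model.
def filtered_query_body(video, count)
  {
    :query => {
      :filtered => {
        # Scoring clauses live under filtered.query
        :query => {
          :bool => {
            :should => [
              { :text  => { :_all => { :query => video[:title], :boost => 2 } } },
              { :text  => { :_all => { :query => video[:description] } } },
              { :terms => { :tags => video[:tags], :minimum_match => 1 } }
            ]
          }
        },
        # Filters live under filtered.filter, inside the same query
        :filter => {
          :and => [
            { :term => { :category_id => video[:category_id] } },
            { :term => { :live => true } },
            { :term => { :public => true } },
            { :not  => { :term => { :vid => video[:vid] } } }
          ]
        }
      }
    },
    :size => count
  }
end

# Sample values taken from the question; the description is a placeholder.
video = {
  :vid         => 'ABC123xyz',
  :title       => 'Top Gun',
  :description => 'placeholder description',
  :tags        => ['top-gun', '80s'],
  :category_id => 1
}

puts JSON.pretty_generate(filtered_query_body(video, 10))
```

If the JSON Tire generates for the filtered query differs from this shape, for example if `filter` still appears at the top level, the DSL blocks are probably nested differently from what was intended.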