elasticsearch - complex querying


I am using the jdbc river and I can create the following index:

curl -XPUT 'localhost:9201/_river/email/_meta' -d '{
    "type" : "jdbc",
    "jdbc" : {
        "strategy":"simple", 
        "poll":"10",
        "driver" : "org.postgresql.Driver",
        "url" : "jdbc:postgresql://localhost:5432/api_development",
        "username" : "paulcowan",
        "password" : "",
        "sql" : "SELECT id, subject, body, personal, sent_at, read_by, account_id, sender_user_id, sender_contact_id, html, folder, draft FROM emails"
    },
    "index" : {
        "index" : "email",
        "type" : "jdbc"
    },
    "mappings" : {
        "email" : {
            "properties" : {
                "account_id" : { "type" : "integer" },
                "subject" : { "type" : "string" },
                "body" : { "type" : "string" },
                "html" : { "type" : "string" },
                "folder" : { "type" : "string", "index" : "not_analyzed" },
                "id" : { "type" : "integer" }
            }
        }
    }
}'

I can run basic queries using curl like this:

curl -XGET 'http://localhost:9201/email/jdbc/_search?pretty&q=fullcontact'

and I get back results.

What I want to do, though, is restrict the results to a particular account_id and a particular folder, so I run the following query:

curl -XGET 'http://localhost:9201/email/jdbc/_search' -d '{
  "query": {
    "filtered": {
      "filter": {
        "and": [
          {
            "term": {
              "folder": "INBOX"
            }
          },
          {
            "term": {
              "account_id": 1
            }
          }
        ]
      },
      "query": {
        "query_string": {
          "query": "fullcontact*"
        }
      }
    }
  }
}'

I get the following results:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

Can anyone tell me what is wrong with my query?

There is 1 best solution below

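It turns out that, in the jdbc river, you need to use the type_mapping section to specify that a field is not_analyzed; the normal mappings node is ignored. Because the mappings node was ignored, folder was indexed as an analyzed string, so its value was stored lowercased and the term filter on "INBOX", which is not analyzed, matched nothing.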
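One way to see the effect of analysis is the _analyze API. The sketch below only builds the request URL; the host and port are assumptions taken from the examples here, and the analyzer and text URL parameters are those of the Elasticsearch versions that still supported rivers, so check them against your version.

```shell
#!/bin/sh
# Build an _analyze URL for a given host and text.
# The host:port value is an assumption; use your own node address.
analyze_url() {
    printf 'http://%s/_analyze?analyzer=standard&text=%s' "$1" "$2"
}

url=$(analyze_url 'localhost:9200' 'INBOX')
printf '%s\n' "$url"

# To run it against a live node:
#   curl -XGET "$url"
# With the standard analyzer the token should come back lowercased,
# which is why a term filter on the exact value "INBOX" does not match
# an analyzed field.
```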

Below is how it turned out:

curl -XPUT 'localhost:9200/_river/your_index/_meta' -d '{
    "type" : "jdbc",
    "jdbc" : {
        "strategy":"simple", 
        "poll":"10",
        "driver" : "org.postgresql.Driver",
        "url" : "jdbc:postgresql://localhost:5432/api_development",
        "username" : "user",
        "password" : "your_password",
        "sql" : "SELECT field_one, field_two, field_three, the_rest FROM blah"
    },
    "index" : {
        "index" : "your_index",
        "type" : "jdbc",
        "type_mapping": "{\"your_index\" : {\"properties\" : {\"field_two\":{\"type\":\"string\",\"index\":\"not_analyzed\"}}}}"
    }
}'

Strangely, or annoyingly, the type_mapping section takes a JSON-encoded string and not a normal JSON node.
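Escaping the mapping by hand is error-prone, so here is a minimal sketch of producing the encoded string in the shell. It assumes the mapping is written on a single line (so only backslashes and double quotes need escaping), and json_escape is a hypothetical helper name, not part of the river.

```shell
#!/bin/sh
# The mapping as ordinary JSON, kept on one line so there are no
# newlines to escape.
mapping='{"your_index":{"properties":{"field_two":{"type":"string","index":"not_analyzed"}}}}'

# Hypothetical helper: escape backslashes first, then double quotes,
# so the text can be placed inside a JSON string value.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

type_mapping=$(json_escape "$mapping")

# Show only the "index" part of the river definition for illustration.
printf '"index" : { "index" : "your_index", "type" : "jdbc", "type_mapping" : "%s" }\n' "$type_mapping"
```

The printed fragment can then be pasted into the river definition in place of the hand-escaped type_mapping line.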

I can check the mappings by running:

# check mappings
curl -XGET 'http://localhost:9200/your_index/jdbc/_mapping?pretty=true'

Which should give something like:

{
  "jdbc" : {
    "properties" : {
      "field_one" : {
        "type" : "long"
      },
      "field_two" : {
        "type" : "string",
        "index" : "not_analyzed",
        "omit_norms" : true,
        "index_options" : "docs"
      },
      "field_three" : {
        "type" : "string"
      }
    }
  }
}
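Once the mapping shows not_analyzed for the field, the filtered query from the question should be able to match on the exact folder value. The sketch below rebuilds that query body; build_query is a hypothetical helper, and it assumes the river and index were recreated so that the documents were indexed under the new mapping.

```shell
#!/bin/sh
# Hypothetical helper: print the filtered query body for a folder,
# an account id and a query_string expression.
build_query() {
    printf '{"query":{"filtered":{"filter":{"and":[{"term":{"folder":"%s"}},{"term":{"account_id":%s}}]},"query":{"query_string":{"query":"%s"}}}}}' "$1" "$2" "$3"
}

body=$(build_query 'INBOX' 1 'fullcontact*')
printf '%s\n' "$body"

# To run it against the index from the question:
#   curl -XGET 'http://localhost:9201/email/jdbc/_search?pretty' -d "$body"
```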