Chewy (Elasticsearch) - Multiple emails with uax_url_email


I have a simple email analyzer:

analyzer: {
  email: {
    tokenizer: 'uax_url_email',
    filter: ['lowercase']
  }
}

And a field with multiple email values:

  field :emails,
        type: :text,
        analyzer: 'email',
        search_analyzer: 'email',
        value: -> (user) { [user.email, user.lead_requests.pluck(:email)].flatten.compact.uniq }

After indexing, I need to find a user by part of an email address. Without the @ it works:

UsersIndex.query(wildcard: { emails: "*example.com" }).count
=> 1

But with @:

UsersIndex.query(wildcard: { emails: "*@example.com" }).count
=> 0

And wildcard does not work for a full email address either:

UsersIndex.query(wildcard: { email: "[email protected]" }).count
=> 0

Only match finds it, and only with the full value:

UsersIndex.query(match: { emails: "[email protected]" }).count
=> 1

It seems uax_url_email does not work as expected here.

What should I do to make a "contains" search work?

Amit On

I think you are making a mistake with the field or index name, because it works for me. It would help if you could provide your data in JSON format, like below, to confirm.

Index mapping and setting

{
    "settings": {
        "analysis": {
            "analyzer": {
                "email": {
                    "type": "custom",
                    "tokenizer": "uax_url_email",
                    "filter": [
                        "lowercase"
                    ]
                }
            }
        }
    },
    "mappings" : {
        "properties" : {
            "mail" : {
                "type" : "text",
                "analyzer" : "email",
                "search_analyzer": "email"

            },
            "mail_keyword": {
                "type" : "keyword"
            }
        }
    }
}

Index sample data

{
    "mail" : "[email protected]"
}
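
To confirm which tokens the analyzer really produces for a document like this, the _analyze API may help. Below is a minimal Ruby sketch that only builds the request body. The helper name analyze_body and the address john@example.com are hypothetical, and sending the body through Chewy's client (shown in a comment) is an assumption about your setup, not something tested against your index.

```ruby
require 'json'

# Hypothetical helper: builds the body for Elasticsearch's _analyze API.
def analyze_body(analyzer, text)
  { analyzer: analyzer, text: text }
end

# 'john@example.com' is a made-up address used only for illustration.
body = analyze_body('email', 'john@example.com')
puts JSON.pretty_generate(body)

# Assumption: with Chewy this could be sent roughly like so (needs a live cluster):
#   Chewy.client.indices.analyze(index: UsersIndex.index_name, body: body)
# If uax_url_email is in effect, the response should list the address as one token.
```

If the response shows the address split into several tokens, the index is probably not using the email analyzer.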

Wildcard search query (note: exactly the same address that I indexed)

{
    "query" : {
        "wildcard" : {
            "mail" : "[email protected]"
        }
    }
}

And search result

 "hits": [
            {
                "_index": "wildcard_test",
                "_type": "_doc",
                "_id": "2",
                "_score": 1.0,
                "_source": {
                    "mail": "[email protected]"
                }
            }
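
For comparison with the counts in the question, here is a simplified Ruby model of how a wildcard query sees tokens. It is only a sketch: File.fnmatch stands in for Elasticsearch's wildcard matching, the address john@example.com is hypothetical, and the two token lists are what the uax_url_email and standard tokenizers would be expected to produce for it.

```ruby
# Simplified model: wildcard is a term-level query, so the pattern is compared
# with each indexed token as a whole, never with the original field value.
def wildcard_hit?(tokens, pattern)
  tokens.any? { |token| File.fnmatch(pattern, token) }
end

# Expected tokens when uax_url_email is applied: the address stays intact.
email_tokens = ['john@example.com']
# Expected tokens from a standard-style tokenizer: split around '@'.
split_tokens = ['john', 'example.com']

['*example.com', '*@example.com', 'john@example.com'].each do |pattern|
  puts format('%-18s uax_url_email=%-5s standard=%s',
              pattern,
              wildcard_hit?(email_tokens, pattern),
              wildcard_hit?(split_tokens, pattern))
end
```

In this model only the split tokens reproduce the pattern from the question (a hit for *example.com, none for the patterns containing @). That may mean the index was not actually created with the email analyzer, but this is a guess that the question does not confirm.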