I'm trying out Elasticsearch and it looks great!
I've run into an annoying problem, though: for a field that contains "hello world",
a search for "hello wo"
returns no results!
Why does this happen?
Here is my configuration (FOSElasticaBundle):
fos_elastica:
    clients:
        default: { host: localhost, port: 9200 }
    serializer:
        callback_class: FOS\ElasticaBundle\Serializer\Callback
        serializer: serializer
    indexes:
        website:
            client: default
            settings:
                index:
                    analysis:
                        analyzer:
                            custom_search_analyzer:
                                type: custom
                                tokenizer: standard
                                filter: [standard, worddelimiter, stopwords, snowball, lowercase, asciifolding]
                            custom_index_analyzer:
                                type: custom
                                tokenizer: nGram
                                filter: [standard, worddelimiter, stopwords, snowball, lowercase, asciifolding]
                        filter:
                            stopwords:
                                type: stop
                                stopwords: [_italian_]
                                ignore_case: true
                            worddelimiter:
                                type: word_delimiter
                        tokenizer:
                            nGram:
                                type: nGram
                                min_gram: 1
                                max_gram: 20
            types:
                structure:
                    mappings:
                        name: { boost: 9, search_analyzer: custom_search_analyzer, index_analyzer: custom_index_analyzer, type: string }
Any idea how to solve this?
EDIT: Here is my query:
{
    "query": {
        "bool": {
            "must": [],
            "must_not": [],
            "should": [
                {
                    "term": {
                        "structure.name": "hello wo"
                    }
                }
            ]
        }
    },
    "from": 0,
    "size": 10,
    "sort": [],
    "facets": {}
}
EDIT 2
OK, I don't understand this behavior...
Now I run this query:
{
    "query": {
        "bool": {
            "must": [
                {
                    "term": {
                        "structure.name": "hello"
                    }
                },
                {
                    "term": {
                        "structure.name": "wo"
                    }
                }
            ],
            "must_not": [],
            "should": []
        }
    },
    "from": 0,
    "size": 10,
    "sort": [],
    "facets": {}
}
This query returns the result I wanted, but I don't understand the difference between one term clause containing both words and two must clauses with one word each!
Could someone explain this behavior?
It probably helps to explain how this works.
When you index text, Elasticsearch splits it into terms if the field is analyzed (as it is in your mapping). In your case "hello world" is split into separate terms such as "hello" and "world" (plus, with your nGram index analyzer, shorter fragments such as "wo"). A term query does not analyze its input, so when you search for the term "hello wo" it is looked up as one single term, which matches none of the indexed terms.
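The effect can be sketched with a toy model in Python. This is only a rough illustration, not Elasticsearch's actual analysis chain: the helper names `toy_index_terms` and `term_matches` are made up for this example, and the "analyzer" simply lowercases the text, splits on whitespace, and emits the character n-grams of each word.

```python
def toy_index_terms(text, min_gram=1, max_gram=20):
    """Toy stand-in for an analyzed field: split into words, then n-grams.

    Hypothetical helper; real analyzers (tokenizer plus filters) do more.
    """
    terms = set()
    for word in text.lower().split():
        longest = min(max_gram, len(word))
        for size in range(min_gram, longest + 1):
            for start in range(len(word) - size + 1):
                terms.add(word[start:start + size])
    return terms


def term_matches(indexed_terms, query_text):
    """A term query is not analyzed: it is an exact lookup of one term."""
    return query_text in indexed_terms


indexed = toy_index_terms("hello world")

# One term containing a space: no indexed term looks like this.
print(term_matches(indexed, "hello wo"))

# Two separate term lookups, as in the bool/must query: both exist.
print(term_matches(indexed, "hello") and term_matches(indexed, "wo"))
```

In this model the single lookup fails because no indexed term contains a space, while the two separate lookups succeed, which mirrors the difference between the two queries in the question.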
To avoid splitting into terms, you can declare the name field as not analyzed in the mapping; the value is then not split into two words and is handled as one token, so a term query has to match the whole value exactly.
Another solution is to use a multi-term query.
Also, a query_string query returns results because it works differently: it analyzes the query text before searching.
So, depending on your needs, you should use different queries. To search by name, use query_string; term should be used when you want to filter on exact values such as categoryId, tags, and the like.
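As a minimal sketch of the query_string variant, the request body could be built like this in Python. The helper name `build_name_query` is hypothetical, and `default_operator: AND` is an assumption on my part, chosen so that every word of the input has to match, as in your second query; drop it if matching any word is enough.

```python
import json


def build_name_query(text, field="structure.name", size=10):
    """Build a query_string request body (hypothetical helper).

    The text is analyzed by Elasticsearch at search time, so
    "hello wo" is looked up as separate terms, not as one term.
    """
    return {
        "query": {
            "query_string": {
                "default_field": field,
                "query": text,
                # Assumption: require all words to match.
                "default_operator": "AND",
            }
        },
        "from": 0,
        "size": size,
    }


body = build_name_query("hello wo")
print(json.dumps(body, indent=4))
```

The printed JSON can be sent as the search body in place of the bool/term query from the question.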