Word-level n-gram implementation for Solr > 8.0


Given the input: "ABC regional private coastal area"

Tokenization I want (using ShingleFilterFactory): "ABC regional private coastal area", "ABC regional private coastal", "ABC regional private", "ABC regional", "ABC".
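To make the target concrete, here is a minimal sketch in plain Python of the word-level "edge n-gram" output described above. It is not Solr code, and the function name `word_edge_ngrams` is made up for illustration; it assumes the input is split on whitespace only.

```python
def word_edge_ngrams(text: str) -> list[str]:
    """Return every word-level prefix of `text`, longest first.

    Hypothetical helper for illustration only: it mimics the desired
    token stream, it is not a Solr filter.
    """
    words = text.split()  # assumption: whitespace tokenization
    # Take the first n words for n = len(words) down to 1.
    return [" ".join(words[:n]) for n in range(len(words), 0, -1)]


if __name__ == "__main__":
    for token in word_edge_ngrams("ABC regional private coastal area"):
        print(token)
```

Every token starts at the first word, so no token begins with "regional" or "coastal".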

Actual results: "ABC regional private coastal area", "ABC regional private coastal", "ABC regional", "ABC", "regional", etc.

Sometimes it also creates tokens like "regional _ coastal", "regional _ coastal area", "_ coastal".

Is there any filter or tokenizer that will help me achieve this result?

Already tried: EdgeNGram (character-level token split), NGram (character-level token split), ShingleFilterFactory (word-level token split).

Results: shingle comes close, but it also creates tokens I don't need. For example, "hello world sample" becomes "hello world", "world", "sample" after tokenization, which gives me unnecessary results for both "sample" and "world".
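As a rough illustration of why shingling produces the extra tokens, the sketch below emulates word shingles in plain Python. It is not Solr's implementation; the function name `shingles` is hypothetical, and the parameters only mirror the `minShingleSize`, `maxShingleSize`, and `outputUnigrams` options of ShingleFilterFactory, with whitespace tokenization assumed.

```python
def shingles(text: str, min_size: int = 2, max_size: int = 2,
             output_unigrams: bool = True) -> list[str]:
    """Emulate word shingles: contiguous word n-grams from every position.

    Hypothetical helper for illustration only, not Solr code.
    """
    words = text.split()  # assumption: whitespace tokenization
    tokens = []
    for start in range(len(words)):
        if output_unigrams:
            tokens.append(words[start])  # single word at this position
        for size in range(min_size, max_size + 1):
            if start + size <= len(words):
                tokens.append(" ".join(words[start:start + size]))
    return tokens


if __name__ == "__main__":
    print(shingles("hello world sample"))
```

Because shingles are built from every word position, tokens such as "world" and "sample" appear, whereas what I need are only the tokens that start at the first word.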

Thanks in advance.

Use these links to look at the query and results: Query Performed (https://i.stack.imgur.com/TUHHn.png), Shingle, EdgeNGram
