I am following the post "Creating an index Nest" and trying to update my index settings. Everything runs fine, but the html_strip
filter is not stripping HTML. My code is:
var node = new Uri(_url + ":" + _port);
var settings = new ConnectionSettings(node);
settings.SetDefaultIndex(index);
_client = new ElasticClient(settings);
// To apply filters during indexing, use folding to remove diacritics and html_strip to remove HTML
_client.UpdateSettings(
    f => f.Analysis(descriptor => descriptor
        .Analyzers(
            bases => bases
                .Add("folded_word", new CustomAnalyzer
                {
                    Filter = new List<string> { "icu_folding", "trim" },
                    Tokenizer = "standard"
                })
        )
        .CharFilters(
            cf => cf.Add("html_strip", new HtmlStripCharFilter())
        )
    )
);
You are getting an error:
Before you try to update the settings, close the index first, then update the settings and reopen the index afterwards. Have a look.
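The close, update, reopen sequence could look roughly like the sketch below. It assumes NEST 1.x (the version that matches `SetDefaultIndex` in the question). The class and method names are made up for illustration, and `client` and `index` stand in for `_client` and `index` from the question. Elasticsearch only accepts changes to non-dynamic settings such as analysis while the index is closed.

```csharp
using Nest;

// Hypothetical helper, not part of NEST: wraps the settings update in close/open calls.
public static class IndexSettingsSketch
{
    public static void UpdateAnalysis(IElasticClient client, string index)
    {
        // Analysis settings are not dynamic, so close the index before changing them.
        client.CloseIndex(c => c.Index(index));

        // Same UpdateSettings call as in the question; the analyzer and
        // char filter definitions are left out here for brevity.
        client.UpdateSettings(f => f.Analysis(descriptor => descriptor));

        // Reopen the index so it can be searched and written to again.
        client.OpenIndex(o => o.Index(index));
    }
}
```

While the index is closed it cannot serve requests, so it is worth checking the responses of each call before moving on to the next one.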
UPDATE
Add the html_strip char filter to your custom analyzer:
Now you can run a test to check that this analyzer returns the correct tokens:
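As a rough sketch of both steps, again assuming NEST 1.x: the analyzer below lists `html_strip` in its `CharFilter` property, because defining a char filter in the index settings has no effect until an analyzer references it. The class name, method names, and sample text are invented for illustration.

```csharp
using System.Collections.Generic;
using Nest;

// Hypothetical helper names; only the NEST types and calls are real.
public static class HtmlStripSketch
{
    // The analyzer from the question, now referencing the html_strip char filter.
    public static CustomAnalyzer FoldedWordAnalyzer()
    {
        return new CustomAnalyzer
        {
            CharFilter = new List<string> { "html_strip" },
            Tokenizer = "standard",
            Filter = new List<string> { "icu_folding", "trim" }
        };
    }

    // Runs the analyzer against a sample string containing markup.
    public static IAnalyzeResponse Check(IElasticClient client, string index)
    {
        return client.Analyze(a => a
            .Index(index)
            .Analyzer("folded_word")
            .Text("this <b>is</b> a test"));
    }
}
```

Pass `FoldedWordAnalyzer()` to `bases.Add("folded_word", ...)` in the `UpdateSettings` call, then inspect the `Tokens` collection on the response returned by `Check`. If the char filter is applied, none of the tokens should contain tag names.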
Output:
Hope it helps.