Alternatives to `asciifolding` filter for removing Greek accents from Unicode text

69 Views Asked by user2173353 At 01 July 2025 at 15:18

I see that the `asciifolding` filter in OpenSearch only handles Latin accents and leaves Greek characters untouched (note: some accents may not render well on this site because of the font used):

POST /_analyze
{
  "text": [ "Latin: ấ ê ŏ õ ô ì / Greek: ἆ ᾧ ῦ ἄ ἒ " ],
  "filter": [
    "asciifolding"
  ]
}

{
  "tokens": [
    {
      "token": "Latin: a e o o o i / Greek: ἆ ᾧ ῦ ἄ ἒ ",
      "start_offset": 0,
      "end_offset": 38,
      "type": "word",
      "position": 0
    }
  ]
}

Is there any other filter that handles Unicode characters and that I can use to strip accents/diacritics from Greek text, or will I have to roll my own?

I have found these two alternative ways to achieve my goal, but I was hoping that something built-in would exist for something so basic:

Any hints or ideas are welcome.
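
To make the desired behaviour concrete, here is a minimal sketch of the kind of folding I mean, done outside OpenSearch in plain Python. It is only an illustration under my own assumptions, not an OpenSearch filter: the function name `strip_diacritics` is made up, and the approach (decompose with NFD, drop combining marks, recompose with NFC) is one generic way to remove diacritics.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Hypothetical helper: remove combining marks from any script."""
    # NFD splits each precomposed character into a base letter
    # followed by its combining marks (accents, breathings, iota subscript).
    decomposed = unicodedata.normalize("NFD", text)
    # Combining marks have general category "Mn" (nonspacing mark); drop them.
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Recompose whatever is left so the result is in NFC form.
    return unicodedata.normalize("NFC", stripped)


print(strip_diacritics("Latin: ấ ê ŏ õ ô ì / Greek: ἆ ᾧ ῦ ἄ ἒ"))
# Latin: a e o o o i / Greek: α ω υ α ε
```

Unlike `asciifolding`, this does not transliterate to ASCII; the Greek base letters stay Greek and only the marks are removed, which is the result I am after.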
