I have an issue removing diacritics from a file name provided by the front end. I found this post here, which solved the problem.
BUT! ^^
It does not do the job everywhere in our backend application. We are on .NET 6, and in a console application created from the standard Microsoft template it works well...
The standalone console application runs on Windows, whereas our project runs in a Linux Docker container.
Does anyone have an idea why there is a difference?
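Issue: the accented character is removed entirely instead of being replaced by its base letter. When normalizing to FormD, the "é" char is not split into 2 chars (the base "e" and the combining acute accent); it stays a single char with code 233 (U+00E9, outside the ASCII range), so RemoveNonAscii then strips it.
Example:
My file name contains diacritics, "Rémi.pdf", and I call this: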
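    mystring.RemoveDiacritics().RemoveNonAscii();

    using System.Globalization;
    using System.Text;
    using System.Text.RegularExpressions;

    public static class StringExtensions
    {
        public static string RemoveDiacritics(this string text)
        {
            // FormD should decompose "é" into "e" followed by a combining accent.
            string decomposed = text.Normalize(NormalizationForm.FormD);
            StringBuilder stringBuilder = new StringBuilder(decomposed.Length);
            foreach (char c in decomposed)
            {
                // Skip the combining marks, keep the base characters.
                if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                {
                    stringBuilder.Append(c);
                }
            }
            return stringBuilder.ToString().Normalize(NormalizationForm.FormC);
        }

        public static string RemoveNonAscii(this string text)
        {
            // Drop every run of chars outside U+0000..U+007F.
            return Regex.Replace(text, "[^\\u0000-\\u007F]+", string.Empty);
        }
    }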
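To compare what Normalize actually returns on Windows and inside the container, it can help to dump the code units of the normalized string. This is only a minimal sketch: the class NormalizationDiagnostics and its two methods are hypothetical names, not part of the original code.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper for diagnosing the platform difference described above.
public static class NormalizationDiagnostics
{
    // Formats each UTF-16 code unit as U+XXXX so that composed and
    // decomposed forms of the same text can be told apart at a glance.
    public static string DescribeCodeUnits(string text) =>
        string.Join(" ", text.Select(c => $"U+{(int)c:X4}"));

    // True when FormD splits "é" (U+00E9) into "e" (U+0065)
    // followed by the combining acute accent (U+0301).
    public static bool DecompositionWorks() =>
        "\u00E9".Normalize(NormalizationForm.FormD) == "e\u0301";
}
```

Logging NormalizationDiagnostics.DescribeCodeUnits("Rémi".Normalize(NormalizationForm.FormD)) at startup shows whether the "é" was decomposed: where normalization works, the output contains U+0065 U+0301; with the behavior described above, it still contains U+00E9. DecompositionWorks() gives the same answer as a boolean, which could be used to fail fast when the container is misconfigured.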
Resolved: add these 2 lines to the Dockerfile
- RUN apk add --no-cache icu-libs
- RUN apk add --no-cache icu-data-full
Add <InvariantGlobalization>false</InvariantGlobalization> to the csproj