I need to compare strings in my program without considering special national characters, so that e.g. "C" and "Č" are treated as the same. I used the Collator class. It works as expected in the first and second cases, but not in the third and fourth.
package collator;

import java.text.Collator;
import java.util.Locale;

public class Coll {
    public static void main(String[] args) {
        Locale locale = new Locale("sk", "SK");
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.PRIMARY);
        System.out.println(collator.compare("T", "Ť"));
        System.out.println(collator.compare("L", "Ľ"));
        System.out.println(collator.compare("C", "Č"));
        System.out.println(collator.compare("S", "Š"));
    }
}
I expect 0 0 0 0, but the actual output is 0 0 -1 -1.
Check out the java.text.Normalizer class. I have never used it extensively, but it looks like it could be useful for your purpose. Read this blog post for more information: normalizing-text-in-java
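Here is a minimal sketch of how Normalizer could be applied to the four pairs from the question. It decomposes each string to NFD form, so that a letter such as "Č" becomes "C" followed by a combining caron, and then removes the combining marks with a regular expression. The class name AccentFolding and the helpers stripDiacritics and sameIgnoringDiacritics are made up for illustration; they are not part of any library.

```java
import java.text.Normalizer;

class AccentFolding {
    // Hypothetical helper: decompose to NFD, then drop all combining marks (\p{M}).
    static String stripDiacritics(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Hypothetical helper: compare two strings after removing diacritics from both.
    static boolean sameIgnoringDiacritics(String a, String b) {
        return stripDiacritics(a).equals(stripDiacritics(b));
    }

    public static void main(String[] args) {
        // Escaped code points keep the example independent of the source file encoding.
        String[][] pairs = {
            {"T", "\u0164"}, // T with caron
            {"L", "\u013D"}, // L with caron
            {"C", "\u010C"}, // C with caron
            {"S", "\u0160"}, // S with caron
        };
        for (String[] pair : pairs) {
            System.out.println(sameIgnoringDiacritics(pair[0], pair[1]));
        }
    }
}
```

This should print true for all four pairs. Note that it only handles letters that Unicode decomposes into a base letter plus combining marks; a letter such as "Ł" has no such decomposition and would be left unchanged. The comparison is also still case-sensitive, so you may want to combine it with equalsIgnoreCase if that matters for your data.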