I'm trying to normalize a string that contains accented characters. It works fine in my IntelliJ IDE, but when I build it with Maven and deploy the WAR to Tomcat, I get unexpected results like the ones below. Can you please help?
Java code to normalize:
String normalizedString = Normalizer.normalize(inputText, Normalizer.Form.NFD).replaceAll("[^\\p{ASCII}]", "");
Output from tomcat logs:
Input text = ůňa
Normalized String = AAa
Output when I run the same code on my local machine in an IDE:
Input text = ůňa
Normalized String = una
Do I need to specify some encoding setting somewhere?
My Maven build has this:
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>${maven-compiler-plugin.version}</version>
<configuration>
<source>${java.version}</source>
<target>${java.version}</target>
<encoding>UTF-8</encoding>
</configuration>
</plugin>
This is present in my server.xml in Tomcat:
<Connector port="8443"
protocol="org.apache.coyote.http11.Http11NioProtocol"
SSLEnabled="true"
maxThreads="150"
scheme="https"
secure="true"
clientAuth="false"
sslProtocol="TLS"
URIEncoding="UTF-8"
/>
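I was able to solve this. I was reading the data from a file without specifying an encoding, so the file was decoded with the JVM's default charset, which presumably differed between my IDE and the Tomcat server. Once I specified the encoding explicitly when reading the file, the issue was fixed.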
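For anyone hitting the same symptom, here is a minimal sketch of what was probably going on. It assumes the file was saved as UTF-8 and that the Tomcat JVM's default charset was a single-byte Latin one. ISO-8859-1 is used as a stand-in, since the actual default on the server is not known. The class and helper names (NormalizeDemo, stripAccents, readFile) are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.text.Normalizer;

public class NormalizeDemo {

    // Same normalization as in the question: decompose, then drop non-ASCII.
    public static String stripAccents(String inputText) {
        return Normalizer.normalize(inputText, Normalizer.Form.NFD)
                .replaceAll("[^\\p{ASCII}]", "");
    }

    // Hypothetical helper: read the whole file with an explicit charset
    // instead of relying on the JVM default.
    public static String readFile(Path path, Charset charset) throws IOException {
        return new String(Files.readAllBytes(path), charset);
    }

    public static void main(String[] args) throws IOException {
        // The input from the question, written with Unicode escapes so the
        // encoding of this source file does not matter.
        String original = "\u016F\u0148a";

        // Write it to a temporary file as UTF-8 (assumed file encoding).
        Path file = Files.createTempFile("normalize-demo", ".txt");
        Files.write(file, original.getBytes(StandardCharsets.UTF_8));

        // Decoding with the wrong charset (assumed stand-in for the server default).
        String wrong = readFile(file, StandardCharsets.ISO_8859_1);
        // Decoding with the charset the file was actually written in.
        String right = readFile(file, StandardCharsets.UTF_8);

        System.out.println("Normalized String = " + stripAccents(wrong)); // AAa
        System.out.println("Normalized String = " + stripAccents(right)); // una

        Files.delete(file);
    }
}
```

The same idea applies to Files.newBufferedReader(path, StandardCharsets.UTF_8) or new InputStreamReader(stream, StandardCharsets.UTF_8). Note that the compiler plugin's <encoding> only affects how source files are compiled and URIEncoding only affects request URIs, so neither changes how a data file is decoded at runtime. Starting the JVM with -Dfile.encoding=UTF-8 is another option, though passing the charset in code is less fragile.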