uGUI Text Field, How to Remove "Replacement Characters" (uFFFD aka �)?


Using the uGUI Text component, I'm getting "replacement characters" aka � and I can't find a way to remove them.

I'm getting a string from the Instagram API that contains Unicode characters, including non-Latin script characters (Japanese, for example), which I need.

However, the Unicode characters for emojis come through as replacement characters (�). I don't need the emojis, so they can be stripped out, but I can't find a method to do this.

I can't use TextMeshPro because I'm unable to generate a font asset with all the Unicode characters needed to display the various languages (this could be user error, but the process hangs when I try).

I notice these � characters don't appear in the Inspector or console so there must be a way to ignore or remove them.

I'm setting the string like this:

    body.text = System.Uri.UnescapeDataString(postData.text);

I've tried a number of things that haven't worked, including:

    body.text = body.text.Replace('\uFFFD','\'');//doesn't work
    body.text = Regex.Replace(body.text, @"^[\ufffd]", string.Empty);//doesn't work

I've also tried breaking the string up into a char array. When I print each character to the console, I get this error as soon as it hits a replacement character:

    foreach (char item in postData.text.ToCharArray())
        print(item); // Error: UTF-16 to UTF-8 conversion failed because the input string is invalid

Any help with this would be greatly appreciated! Thank you.

Unity 2018.4.4, C#

replacement characters can be seen here


There is 1 best solution below


Found the answer! This post provided a solution: How do I remove emoji characters from a string?

    body.text = Regex.Replace(body.text, @"\p{Cs}", "");
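
For anyone wondering why this works where the `\uFFFD` replacements did not: a likely explanation (an inference from the conversion error in the question, not something the linked post confirms) is that the string never contains U+FFFD at all. Emoji sit outside the Basic Multilingual Plane, so in a C# string each one is stored as a pair of UTF-16 surrogate code units, and the Text component presumably draws a � for each unit it has no glyph for. `\p{Cs}` matches the Unicode "surrogate" category, so it removes those units and leaves Japanese and other BMP text alone. Here is a minimal standalone sketch; the `EmojiStripper` class and its helpers are made-up names for illustration, and it runs as a plain console program rather than inside Unity:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of Unity or the original post.
public static class EmojiStripper
{
    // \p{Cs} matches any single UTF-16 surrogate code unit.
    private static readonly Regex SurrogatePattern = new Regex(@"\p{Cs}");

    // Removes every surrogate code unit, i.e. every character outside the BMP.
    public static string StripSurrogates(string input)
    {
        return SurrogatePattern.Replace(input, string.Empty);
    }

    // Returns true if the string holds at least one surrogate code unit.
    public static bool ContainsSurrogate(string input)
    {
        foreach (char c in input)
        {
            if (char.IsSurrogate(c)) return true;
        }
        return false;
    }

    public static void Main()
    {
        // "\uD83D\uDE00" is U+1F600 (an emoji) written as a surrogate pair.
        string text = "こんにちは \uD83D\uDE00 hello";

        // No literal U+FFFD is present, so replacing it changes nothing.
        Console.WriteLine(text.IndexOf('\uFFFD'));          // -1
        Console.WriteLine(ContainsSurrogate(text));          // True

        string cleaned = StripSurrogates(text);
        Console.WriteLine(ContainsSurrogate(cleaned));       // False
        Console.WriteLine(cleaned.Contains("こんにちは"));   // True
    }
}
```

In the Unity script this would amount to something like `body.text = EmojiStripper.StripSurrogates(System.Uri.UnescapeDataString(postData.text));`. Two caveats: the pattern strips every character outside the BMP, so rare supplementary-plane CJK ideographs go too, and it does not touch emoji-related characters that live inside the BMP (for example U+263A or the variation selector U+FE0F). Also note that the second attempt in the question, `^[\ufffd]`, is anchored with `^`, so it could only ever match at the very start of the string.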