Replacing one unicode character in a string


I have a problem with a UTF-8 encoded XML text. In Notepad++ it looks like this. Besides normal spaces, it contains some U+2002 (EN SPACE, "ENSP") characters.

[Image: XML text with a non-visible Unicode character]

How can I get rid of them?

str.replace("u'2002'", " ") does nothing.

str.encode and str.decode are not available, because of the version of Python.

Regex and unicodedata.normalize lose the Scandinavian characters and make the text generally unusable.
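For reference, here is a minimal sketch of what a targeted replacement might look like. It assumes Python 3 (which would explain why str.decode is missing) and that the text is already a str. The names EN_SPACE, replace_en_spaces and sample are made up for illustration, and the XML fragment is hypothetical. The likely reason the attempt above does nothing is that "u'2002'" is a literal seven-character string (u, quote, 2002, quote) rather than the character itself; the escape sequence "\u2002" denotes the actual code point.

```python
# Assumption: Python 3, where str is Unicode text and "\uXXXX" escapes are
# interpreted inside ordinary string literals.
EN_SPACE = "\u2002"  # U+2002 EN SPACE, written as an escape sequence


def replace_en_spaces(text):
    """Return text with every U+2002 replaced by an ordinary space (U+0020)."""
    return text.replace(EN_SPACE, " ")


# Hypothetical XML fragment with Scandinavian letters and two EN SPACE characters.
sample = "<name>Sm\u00f8rrebr\u00f8d\u2002og\u2002\u00e5l</name>"
cleaned = replace_en_spaces(sample)
print(cleaned)
```

Because only the single code point U+2002 is touched, characters such as ø and å should pass through unchanged, unlike with unicodedata.normalize or an ASCII-only regex. If the text comes from a file, opening it with encoding="utf-8" before calling the function is probably needed so that the character is decoded correctly in the first place.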
