This application runs on a Mac only, and I'm stuck with Python 2.
I have an input string '한글' which, when run through an online Unicode converter, shows as the decomposed sequence \u1112\u1161\u11ab\u1100\u1173\u11af
For my application to work, I need to convert this somehow to '한글', shown as \ud55c\uae00.
I believe I need to normalise to NFC.
I've tried
unicodedata.normalize('NFC', myString)
but I get the following:
TypeError: normalize() argument 2 must be unicode, not str
I've tried
unicodeMyString = myString.decode('utf-8', 'ignore')
unicodedata.normalize('NFC', unicodeMyString)
but I get the following:
ERROR: Code 1: Traceback (most recent call last):
File "code.py", line 222, in <module>
sys.stdout.write(unicodedata.normalize('NFC', unicodeMyString))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
Is there any way to convert in Python 2? Thanks!
SOLVED:
The problem isn't normalising, it's printing the result to the terminal: Python 2 tries to encode the output as ASCII. Does PYTHONIOENCODING=UTF-8 python myscript.py work? – snakecharmerb
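If setting the environment variable isn't an option, another way around the same error is to encode the normalised text to UTF-8 yourself before writing it, so Python 2 never falls back to the ASCII codec. Here is a minimal sketch, assuming the input arrives as a UTF-8 byte string; to_nfc_utf8 and write_nfc are hypothetical helper names made up for illustration, not part of any library:

```python
import unicodedata


def to_nfc_utf8(raw_bytes):
    """Decode UTF-8 bytes, compose to NFC, and re-encode as UTF-8 bytes."""
    text = raw_bytes.decode('utf-8')               # byte string -> unicode
    composed = unicodedata.normalize('NFC', text)  # jamo -> precomposed syllables
    return composed.encode('utf-8')                # unicode -> byte string


def write_nfc(raw_bytes, stream):
    """Write the NFC form as bytes, so no implicit ASCII encoding happens."""
    stream.write(to_nfc_utf8(raw_bytes) + b'\n')


# The decomposed sequence from the question, as UTF-8 bytes (assumed input form)
decomposed = u'\u1112\u1161\u11ab\u1100\u1173\u11af'.encode('utf-8')
```

In Python 2, write_nfc(myString, sys.stdout) should then emit the bytes for \ud55c\uae00 unchanged, because sys.stdout accepts byte strings there. The strict decode (without 'ignore') is a deliberate choice in this sketch: it raises on malformed input instead of silently dropping bytes.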