UnicodeEncodeError printing Hangul characters in the terminal


This application runs on a Mac only, and I'm stuck with Python 2.

I have an input string '한글' which when decoded through an online unicode converter shows as \u1112\u1161\u11ab\u1100\u1173\u11af

For my application to work, I need to convert this somehow to '한글', shown as \ud55c\uae00.

I believe I need to normalise to NFC.
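As a sanity check on that belief, here is a minimal sketch, assuming the input is already a unicode object rather than a byte string. The variable names are made up for illustration; the code point values are the ones quoted above.

```python
# -*- coding: utf-8 -*-
import unicodedata

# Decomposed jamo sequence from the converter output above (unicode literal).
decomposed = u'\u1112\u1161\u11ab\u1100\u1173\u11af'

# NFC composes each leading consonant + vowel + trailing consonant
# into a single precomposed Hangul syllable.
composed = unicodedata.normalize('NFC', decomposed)

# Compare code points instead of printing, so nothing is encoded for output.
is_expected = (composed == u'\ud55c\uae00')
```

The comparison avoids writing anything to stdout, so it only exercises the normalisation step.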

I've tried

unicodedata.normalize('NFC', myString) 

but I get the following:

TypeError: normalize() argument 2 must be unicode, not str

I've tried

unicodeMyString = myString.decode('utf-8', 'ignore')
unicodedata.normalize('NFC', unicodeMyString)

but I get the following:

ERROR: Code 1: Traceback (most recent call last):
  File "code.py", line 222, in <module>
    sys.stdout.write(unicodedata.normalize('NFC', unicodeMyString))  
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)

Is there any way to convert in Python 2? Thanks!

SOLVED:

The problem isn't normalising; it's printing the result to the terminal. Python 2 tries to encode the unicode output as ASCII. Does PYTHONIOENCODING=UTF-8 python myscript.py work? – snakecharmerb
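If setting the environment variable is not convenient, another option is to encode the normalised text yourself before writing it out. This is only a sketch: it assumes the input bytes are valid UTF-8, and to_nfc_utf8 is a hypothetical helper name, not something from the original script or from any library.

```python
# -*- coding: utf-8 -*-
import unicodedata


def to_nfc_utf8(raw_bytes):
    """Decode UTF-8 bytes, normalise to NFC, and re-encode as UTF-8 bytes.

    Hypothetical helper for illustration; assumes raw_bytes is valid UTF-8.
    """
    text = raw_bytes.decode('utf-8')               # bytes -> unicode
    composed = unicodedata.normalize('NFC', text)  # jamo -> precomposed syllables
    return composed.encode('utf-8')                # unicode -> bytes, no ASCII step


# The decomposed input from the question, as UTF-8 bytes (a Python 2 str).
my_string = u'\u1112\u1161\u11ab\u1100\u1173\u11af'.encode('utf-8')
encoded = to_nfc_utf8(my_string)

# On Python 2, writing the already-encoded bytes skips the implicit
# ASCII encode that raised UnicodeEncodeError:
#     import sys
#     sys.stdout.write(encoded + '\n')
```

Because the function returns bytes, Python 2 has nothing left to encode when the result is written, so the ASCII fallback never runs.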
