Python3.7 & Windows : incorrect unicode characters in docstrings in interactive mode

Save the following program as test.py:

def f():
    """àâùç"""
    return
print("àâùç")

and execute it in a Windows cmd-window in interactive mode:

python -i test.py

The printed text is correct, but when I call help(f) I get scrambled eggs:

P:\>python -i test.py
àâùç
>>> help(f)
Help on function f in module __main__:

f()
    ÓÔ¨þ

Changing the codepage to 65001 brings up classical mystery cards instead:

P:\>python -i test.py
àâùç
>>> help(f)
Help on function f in module __main__:

f()
    ����

Is there an (easy) workaround?

Best answer:

help() has two bugs here, both stemming from how its pager is implemented: it writes the text to a temp file and shells out to more. From pydoc.py:

def tempfilepager(text, cmd):
    """Page through text by invoking a program on a temporary file."""
    import tempfile
    filename = tempfile.mktemp()
    with open(filename, 'w', errors='backslashreplace') as file:
        file.write(text)
    try:
        os.system(cmd + ' "' + filename + '"')
    finally:
        os.unlink(filename)

The file is opened with the default file encoding (cp1252 on U.S. and Western European Windows), which cannot represent characters outside the Windows-1252 character set (so don't write Chinese help documentation, for example). pydoc then shells out to a command (in this case, more) to handle paging. more uses the encoding of the terminal (the OEM code page: cp850 by default in Western Europe and cp437 in the US), so help output looks corrupted for most characters outside the ASCII set.
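The mismatch can be reproduced without a console at all. The following is a minimal sketch, not pydoc's own code: simulate_pager is a hypothetical helper that encodes the text the way the temp file would be written (assuming cp1252 as the default file encoding) and decodes it the way more would read it under a given console code page.

```python
def simulate_pager(text, file_encoding, console_encoding):
    """Hypothetical helper: mimic writing the temp file and reading it back.

    file_encoding stands in for Python's default file encoding (assumed cp1252),
    console_encoding for the active console code page used by more.
    """
    # pydoc opens the temp file with errors='backslashreplace'
    raw = text.encode(file_encoding, errors="backslashreplace")
    # more interprets those same bytes using the console code page
    return raw.decode(console_encoding, errors="replace")


docstring = "àâùç"

# Code page 850: the cp1252 bytes map to different characters
print(simulate_pager(docstring, "cp1252", "cp850"))

# Code page 65001 (UTF-8): the bytes are not valid UTF-8 at all
print(simulate_pager(docstring, "cp1252", "utf-8"))

# Code page 1252: writer and reader agree, so the text survives
print(simulate_pager(docstring, "cp1252", "cp1252"))
```

The first call yields the same ÓÔ¨þ seen in the question, and the second yields only replacement characters, which matches the 65001 transcript.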

Changing the terminal code page with chcp 1252 will print the characters correctly:

C:\>chcp 850
Active code page: 850

C:\>py -i test.py
àâùç
>>> help(f)
Help on function f in module __main__:

f()
    ÓÔ¨þ

>>> ^Z


C:\>chcp 1252
Active code page: 1252

C:\>py -i test.py
àâùç
>>> help(f)
Help on function f in module __main__:

f()
    àâùç

>>>
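
If changing the code page is not an option, another possible workaround is to skip the pager entirely, so that neither the temp file nor more is involved. This is a sketch under that assumption: plain_help is a hypothetical helper name, while pydoc.render_doc and pydoc.plaintext are existing pydoc names. The text is printed with print(), which the question shows already displays these characters correctly.

```python
import pydoc


def f():
    """àâùç"""
    return


def plain_help(obj):
    """Hypothetical helper: return the help text for obj as a plain str.

    No pager runs, so the text never passes through a temp file or more.
    """
    # pydoc.plaintext renders without the backspace-based bold of the default renderer
    return pydoc.render_doc(obj, title="Help on %s:", renderer=pydoc.plaintext)


print(plain_help(f))
```

In an interactive session started with python -i test.py, calling print(plain_help(f)) in place of help(f) should show the docstring intact. Assigning pydoc.pager = pydoc.plainpager may have a similar effect on help() itself, though that relies on pydoc internals and is an assumption rather than something the answer above states.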