I'm using BeautifulSoup to parse some HTML, with Spyder as my editor (both brilliant tools, by the way!). The code runs fine in Spyder, but when I try to execute the .py file from the terminal, I get an error:
file = open('index.html','r')
soup = BeautifulSoup(file)
html = soup.prettify()
file1 = open('index.html', 'wb')
file1.write(html)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa9' in position 5632: ordinal not in range(128)
I'm running openSUSE on a Linux server, with Spyder installed using zypper. Does anyone have any suggestions as to what the problem might be? Many thanks.
That is because, before outputting the result (i.e. writing it to the file), you must encode it first, for example with html.encode('utf-8').
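A minimal sketch of that fix, under some assumptions: the html string below is a stand-in for what soup.prettify() returns, UTF-8 is assumed to be the encoding you want, and a temporary path is used instead of your real index.html.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Stand-in for soup.prettify(): a Unicode string containing the
# copyright sign (u'\xa9') that triggered the original error.
html = u"<p>Copyright \xa9 Example</p>"

# Hypothetical output path, used here instead of the real index.html.
path = os.path.join(tempfile.mkdtemp(), "index.html")

# Encode explicitly to bytes before writing to a binary-mode file,
# so Python never falls back to the ascii default.
with open(path, "wb") as out:
    out.write(html.encode("utf-8"))

# Read the file back, decoding with the same encoding.
with io.open(path, "r", encoding="utf-8") as f:
    round_trip = f.read()
```

Alternatively, opening the output with io.open(path, 'w', encoding='utf-8') lets you write the Unicode string directly and leaves the encoding step to the file object.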
Every file object has an attribute, file.encoding. According to the Python 2 docs, it may be None, in which case the file uses the system default encoding to convert Unicode strings.
soup.prettify returns a Unicode object and, given this error, I'm pretty sure you're using Python 2.7, because its sys.getdefaultencoding() is ascii. Hope this helps!
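To see the failure in isolation, here is a small sketch that makes the implicit conversion explicit. It assumes nothing about your file: it just tries to encode the same character from the traceback with the ascii codec, then with UTF-8.

```python
# -*- coding: utf-8 -*-

# The character from the traceback: the copyright sign.
text = u"\xa9"

# Encoding with ascii is what happens implicitly when the default
# encoding is ascii; it fails for any code point above 127.
try:
    text.encode("ascii")
    failed = False
    message = ""
except UnicodeEncodeError as exc:
    failed = True
    message = str(exc)

# Encoding with UTF-8 succeeds and yields a two-byte sequence.
utf8_bytes = text.encode("utf-8")
```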