Saving text with Polish characters (utf-8) to a textfile from JSON in Python


I am trying to save a Messenger conversation to a text file, including the timestamps and senders. In the JSON file downloaded from Messenger, emojis and Polish characters appear as escaped UTF-8 bytes rather than as the characters themselves (e.g. "ą" as \xc4\x85). After executing this program:

import json
from datetime import datetime

messages = []
jsonfiles = [f"message_{n}.json" for n in range(1, 12)]

def filldict(textfile, jsonfile):
    """Append the text messages from jsonfile to textfile and to `messages`."""
    with open(textfile, "a", encoding="utf-8") as w:
        with open(jsonfile, "r", encoding="utf-8") as j:
            data = json.load(j)
        # Walk the messages in the reverse of their order in the file.
        for entry in reversed(data["messages"]):
            if "content" not in entry:
                continue  # skip entries that have no text content
            stamp = int(entry["timestamp_ms"])
            date = datetime.fromtimestamp(stamp / 1000)
            w.write(f"{date} {entry['sender_name']}: {entry['content']}\n")
            messages.append({
                "timestamp": stamp,
                "date": date,
                "sender": entry["sender_name"],
                "content": entry["content"],
            })

# Process the files in reverse order: message_11.json first, message_1.json last.
for jsonfile in reversed(jsonfiles):
    filldict("messages11.txt", jsonfile)

print("process finished")

the output text file contains those UTF-8 byte literals instead of the characters they represent. How can I fix this so that the Polish characters (and, if possible, the emojis) are written to the text file correctly? I thought that passing encoding="utf-8" to open() would be enough. Thank you for any clues.
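
For anyone reading along, here is a minimal sketch of one workaround that is often suggested for this symptom. It rests on an assumption not confirmed above: that the export stores each UTF-8 byte as its own \u00XX escape, so json.load returns strings such as "Ä\x85" in place of "ą". If that holds, re-encoding each string as Latin-1 recovers the original bytes, which can then be decoded as UTF-8. The helper name fix_mojibake and the sample JSON fragment are made up for illustration.

```python
import json


def fix_mojibake(text):
    """Reinterpret a string whose characters are really UTF-8 byte values.

    Hypothetical helper: assumes each character in `text` stands for one byte.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The string was not mis-decoded in this way; leave it unchanged.
        return text


# Made-up fragment imitating the assumed export format:
# "ł" is stored as \u00c5\u0082, "ą" as \u00c4\u0085, and an emoji as four escapes.
raw = (
    '{"sender_name": "Micha\\u00c5\\u0082", '
    '"content": "\\u00c4\\u0085 \\u00f0\\u009f\\u0098\\u0080"}'
)
entry = json.loads(raw)

sender = fix_mojibake(entry["sender_name"])   # "Michał"
content = fix_mojibake(entry["content"])      # "ą" followed by a grinning-face emoji
```

In the program above, this would mean passing the sender name and the content through the helper before writing them. The encoding="utf-8" argument on the output file is still needed so that the repaired characters are written correctly.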
