mysql server replaces supplementary Unicode characters with a question mark (?)

402 Views Asked by At

I am using a Mysql 5.6.27 server and python as the application. The database and tables are set to default of utf32_general_ci and the column type to utf8mb4. using Anaconda Python 2.7 on Mac OS

In my program, I am opening the mysql connection with charset='utf8mb4'. Additionally I execute the below three statements after opening the connection.

self._cursor.execute('SET NAMES utf8mb4')
self._cursor.execute("SET CHARACTER SET utf8mb4")
self._cursor.execute("SET character_set_connection=utf8mb4")

My problem is that when I insert rows to the table, it replaces the supplementary unicode characters with a ?. I have looked around for answers, and there doesn't seem to be any recommendations beyond what I have already implemented (as detailed in this question).

As an e.g. the unicode character U+2620 'SKULL AND CROSSBONES' was inserted as a proper unicode but the character 'FALLEN LEAF' (U+1F342) got replaced by a ?.

Not sure if the actual python code that leads to mysql insert is relevant, but I will put it in just in case.

Please help.

--Python Code for function that calls mysql insert--

def insert_multiple_to_table(self, alliance_data_list):
  list_tuple = []
  for data in data_list:
    dict_obj = dict.fromkeys(COLUMNS)
    for name, value in data.iteritems():
        if value == 'None':
          value = 'null'
        dict_obj[name] = value
    data_tuple = tuple(dict_obj[item] for item in COLUMNS)  #create an ordered list
    list_tuple.append(data_tuple)
  format_strings = " (" + ','.join(COLUMNS) + ") "
  insert_query = "insert into " + self._tblname + format_strings + " VALUES (" + ','.join(['%s'] * len(COLUMNS)) + ")"
  result = self._db.executemany(insert_query, list_tuple)
  return result
1

There are 1 best solutions below

4
On

For fallen leaf, you need utf8mb4, not just utf8.

Python needs # -*- coding: utf-8 -*-

Check that the column in the table is CHARACTER SET utf8mb4. Check with SHOW CREATE TABLE.

The connection should be something like

db = MySQLdb.connect(host=DB_HOST, user=DB_USER, passwd=DB_PASS,
        db=DB_NAME, charset="utf8mb4", use_unicode=True)

(If that complains, switch to utf8; I am unclear whether that clause is aimed at MySQL or the outside world.)

See also Python notes.