Python PyODBC and SQL Server encoding issue

53 Views Asked by RidingBoi At 13 March 2024 at 13:48

I have json file with values, that have characters like 'č', 'ė', etc. I want to insert those values to SQL Server database, but values I see in database are 'Ä' instead of 'č', 'Ä—' instead of 'ė', 'ÄŒ' instead of 'Č', etc. How should I deal with this problem?

I tried things, that were on StackOverflow and pyodbc wiki, like:

self.connection.setdecoding(pyodbc.SQL_CHAR, encoding='utf-8')
self.connection.setdecoding(pyodbc.SQL_WCHAR, encoding='utf-8')
self.connection.setencoding(encoding='utf-8')

(also tried latin1 encoding, because I saw somewhere, that it could help).

I tried adding charset to connection string:

connection_string = (
    f'DRIVER={{ODBC Driver 18 for SQL Server}};'
    f'SERVER={server};'
    f'DATABASE={database};'
    'Trusted_Connection=yes;'
    'Encrypt=no;'
    'CHARSET=UTF-8;'
)

But nothing solved my problem.

This is how I read my file:

def read_values(file_path):
    with open(file_path, 'r', encoding='utf-8') as file:
        return json.load(file)

This is how I insert my data:

def insert(self, table_name: str, column_values: dict):
    columns = ', '.join(column_values.keys())
    values = ', '.join([f"'{value}'" for value in column_values.values()])

    query = f"INSERT INTO {table_name} ({columns}) VALUES ({values})"
    print(query)

    try:
        cursor = self.connection.cursor()
        cursor.execute(query)
        cursor.execute("SELECT @@IDENTITY AS ID;")
        last_inserted_id = cursor.fetchone()[0]
        self.connection.commit()
        print("Data inserted successfully.")
        return last_inserted_id
    except Exception as e:
        print(f"Error inserting data: {e}")
    finally:
        cursor.close()

print(query) prints the right query, with right characters.

If I write the right characters directly into the database with my keyboard, everything works fine.

Original Q&A

There are 1 best solutions below

Gord Thompson On 13 March 2024 at 16:59

Issue 1: Don't use `.setencoding()` and `.setdecoding()` for SQL Server.

As mentioned in the pyodbc wiki entry for SQL Server, they are not necessary.

Issue 2: Don't use string formatting (a.k.a. dynamic SQL) to inject column values into an SQL statement.

The real problem you are having is that you are constructing an INSERT statement that looks something like this:

print(query)
# INSERT INTO so78154305 (firstname) VALUES ('Renée')

SQL Server uses 'string' for varchar and N'string' for nvarchar, so a Unicode literal should be N'Renée'.

But that would still be bad form. Use a parameterized query instead. For example

columns = ', '.join([f"[{key}]" for key in column_values.keys()])
placeholders = ', '.join("?" * len(column_values.values()))

query = f"INSERT INTO [{table_name}] ({columns}) VALUES ({placeholders})"
print(query)
# INSERT INTO [so78154305] ([firstname], [active]) VALUES (?, ?)

cursor.execute(query, tuple(column_values.values()))

Python PyODBC and SQL Server encoding issue

There are 1 best solutions below

Issue 1: Don't use `.setencoding()` and `.setdecoding()` for SQL Server.

Issue 2: Don't use string formatting (a.k.a. dynamic SQL) to inject column values into an SQL statement.

Related Questions in PYTHON

Related Questions in SQL-SERVER

Related Questions in CHARACTER-ENCODING

Related Questions in PYODBC

Trending Questions

Popular # Hahtags

Popular Questions

Python PyODBC and SQL Server encoding issue

There are 1 best solutions below

Issue 1: Don't use .setencoding() and .setdecoding() for SQL Server.

Issue 2: Don't use string formatting (a.k.a. dynamic SQL) to inject column values into an SQL statement.

Related Questions in PYTHON

Related Questions in SQL-SERVER

Related Questions in CHARACTER-ENCODING

Related Questions in PYODBC

Trending Questions

Popular # Hahtags

Popular Questions

Issue 1: Don't use `.setencoding()` and `.setdecoding()` for SQL Server.