Error when inserting special characters such as Ñ and accents á, é, í, ó, ú. using BULK OPENROWSET from python

44 Views Asked by At

I am having problems inserting information into a table in my SQLSERVER 2008 database from my python project using OPENROWSET BULK and even though the CSV file generates special characters such as Ñ, á, é, etc... When inserting it into the database these characters are corrupted.

This is where I create my CSV file:

dfClientes.to_csv(cvs_path, index=False, sep=';', encoding='utf-8-sig')

I run a shutil copy to move that file to a specific path

shutil.copy(cvs_path, ruta_destinoCSV)

and I run the OPENROWSET BULK:

              bulk_insert_query = f'''INSERT INTO sgCliente (IdCliente,NomCliente,ApeCliente,NumDocumento,IdTipoDoc,NomTipoDoc,CodTipoDoc,Tip_docu,Sexo,FactorRh,GrupSangre,EstadoCivil,FecNacimiento,eMail,TelMovil,PrimerNombre,SegundoNombre,PrimerApellido,IndCentroExcelencia,IndDatosPersonal,IndHabilitado,IdTercero,CodUES)
                    SELECT * FROM  OPENROWSET(
                    BULK '{ ruta_destinoCSV }',
                    FORMATFILE='{ ruta_destinoXML }',
                    CODEPAGE = 'UTF-8-SIG',    
                    FIRSTROW = 2) as t
                '''

This is the content of my XML fiel that I use as a formatfile:

<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
   <RECORD>
      <FIELD ID="1" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="100" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
      <FIELD ID="2" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="101" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
      <FIELD ID="3" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="101" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
      <FIELD ID="4" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="20" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
      <FIELD ID="5" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="1" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
      <FIELD ID="6" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="30" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
      <FIELD ID="7" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="2" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
      <FIELD ID="8" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="3" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
      <FIELD ID="9" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="254" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
      <FIELD ID="10" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="254" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
      <FIELD ID="11" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="254" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
      <FIELD ID="12" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="254" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
      <FIELD ID="13" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="8"/>
      <FIELD ID="14" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="50" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
      <FIELD ID="15" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="15" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
      <FIELD ID="16" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="50" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
      <FIELD ID="17" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="50" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
      <FIELD ID="18" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="50" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
      <FIELD ID="19" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="50" />
      <FIELD ID="20" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="50" />
      <FIELD ID="21" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="50" />
      <FIELD ID="22" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="50" />
      <FIELD ID="23" xsi:type="CharTerm" TERMINATOR='\r\n' MAX_LENGTH="5" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
   </RECORD>
   <ROW>
      <COLUMN SOURCE="1" NAME="IdCliente" xsi:type="SQLINT"/>
      <COLUMN SOURCE="2" NAME="NomCliente" xsi:type="SQLNVARCHAR"/>
      <COLUMN SOURCE="3" NAME="ApeCliente" xsi:type="SQLNVARCHAR"/>
      <COLUMN SOURCE="4" NAME="NumDocumento" xsi:type="SQLVARYCHAR"/>
      <COLUMN SOURCE="5" NAME="IdTipoDoc" xsi:type="SQLTINYINT"/>
      <COLUMN SOURCE="6" NAME="NomTipoDoc" xsi:type="SQLVARYCHAR"/>
      <COLUMN SOURCE="7" NAME="CodTipoDoc" xsi:type="SQLVARYCHAR"/>
      <COLUMN SOURCE="8" NAME="Tip_docu" xsi:type="SQLVARYCHAR"/>
      <COLUMN SOURCE="9" NAME="Sexo" xsi:type="SQLVARYCHAR"/>
      <COLUMN SOURCE="10" NAME="FactorRh" xsi:type="SQLVARYCHAR"/>
      <COLUMN SOURCE="11" NAME="GrupSangre" xsi:type="SQLVARYCHAR"/>
      <COLUMN SOURCE="12" NAME="EstadoCivil" xsi:type="SQLVARYCHAR"/>
      <COLUMN SOURCE="13" NAME="FecNacimiento" xsi:type="SQLDATETIME"/>
      <COLUMN SOURCE="14" NAME="eMail" xsi:type="SQLVARYCHAR"/>
      <COLUMN SOURCE="15" NAME="TelMovil" xsi:type="SQLVARYCHAR"/>
      <COLUMN SOURCE="16" NAME="PrimerNombre" xsi:type="SQLNVARCHAR"/>
      <COLUMN SOURCE="17" NAME="SegundoNombre" xsi:type="SQLNVARCHAR"/>
      <COLUMN SOURCE="18" NAME="PrimerApellido" xsi:type="SQLNVARCHAR"/>
      <COLUMN SOURCE="19" NAME="IndCentroExcelencia" xsi:type="SQLBIT"/>
      <COLUMN SOURCE="20" NAME="IndDatosPersonal" xsi:type="SQLBIT"/>
      <COLUMN SOURCE="21" NAME="IndHabilitado" xsi:type="SQLBIT"/>
      <COLUMN SOURCE="22" NAME="IdTercero" xsi:type="SQLINT"/>
      <COLUMN SOURCE="23" NAME="CodUES" xsi:type="SQLVARYCHAR"/>
   </ROW>
</BCPFORMAT>

These are a few lines from my CSV file

This is what it looks like in my database

1

There are 1 best solutions below

1
ANDREA CAROLINA CAMACHO JULIO On

The solution is very simple, thank you to @THOM A.

Sql server 2008 has no support for UTF-8 so i have to encoding in ANSI, so the solution is change this:

dfClientes.to_csv(cvs_path, index=False, sep=';', encoding='UTF-8-SIG') 

for this:

dfClientes.to_csv(cvs_path, index=False, sep=';', encoding='ANSI')