Croatian characters incorrectly retrieved in R via DBI


I'm fetching data from a SQL Server database. On my machine everything comes through fine, but certain Croatian characters are retrieved incorrectly on my colleague's machine. Here's a sample of how we connect and query in R:

# Connection details are placeholders; replace them with real values.
db_connection <- DBI::dbConnect(odbc::odbc(),
                                 Driver   = "ODBC Driver 17 for SQL Server",
                                 Server   = "<servername>",
                                 Database = "<dbname>",
                                 UID      = "<user>",
                                 PWD      = "<password>",
                                 TrustServerCertificate = "yes",
                                 Port     = 1433,
                                 encoding = "Cp1250",
                                 clientcharset = "Cp1250"
                                 )

# Fetch the whole table into a data frame.
retrieved_data <- DBI::dbGetQuery(db_connection, "SELECT * FROM mytable")

In text columns, the characters č and ć are turned into a plain c on my colleague's computer, while other Croatian characters such as š and ž are retrieved correctly. On my machine, everything is retrieved as expected.
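One check that can be run on both machines, offered only as a diagnostic sketch: š and ž exist in both Windows-1250 and Windows-1252, whereas č and ć exist only in Windows-1250. The symptom would therefore be consistent with the text passing through a Windows-1252 conversion somewhere on the affected machine, though that is a guess and not a confirmed cause. The snippet below uses base R only; the variable names (`chars`, `in_cp1250`, `in_cp1252`, `coverage`) are illustrative.

```r
# The four characters from the question, written as \u escapes so the
# script does not depend on the encoding of the source file itself.
chars <- c("\u010d", "\u0107", "\u0161", "\u017e")  # č, ć, š, ž

# iconv() returns NA for an element it cannot represent in the target
# code page, so !is.na() tells us whether the character is covered.
in_cp1250 <- !is.na(iconv(chars, from = "UTF-8", to = "CP1250"))
in_cp1252 <- !is.na(iconv(chars, from = "UTF-8", to = "CP1252"))

# Summarise the coverage per Unicode code point.
coverage <- data.frame(
  codepoint = sprintf("U+%04X", vapply(chars, utf8ToInt, integer(1))),
  in_cp1250 = in_cp1250,
  in_cp1252 = in_cp1252,
  row.names = NULL
)
print(coverage)
```

If the characters lost on the affected machine are exactly those not covered by Windows-1252, comparing `Encoding()` and `charToRaw()` of one affected value from `retrieved_data` on both machines may help locate where the conversion happens.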

We're both on Windows 11, using R 4.1.3 and DBI 1.1.3. Sys.getlocale() returns the same result for both of us: "LC_COLLATE=Croatian_Croatia.1250;LC_CTYPE=Croatian_Croatia.1250;LC_MONETARY=Croatian_Croatia.1250;LC_NUMERIC=C;LC_TIME=Croatian_Croatia.1250".
We've also tried changing the encoding and clientcharset parameters, but nothing seems to help. When we view the same data in another program, such as DbVisualizer, we both see the correct characters.
I'd really appreciate any suggestions for making retrieval of this data via R behave consistently across machines.
