I'm trying to collect all the data from a SQL Server database hosted in Cloud SQL with a Cloud Function.
I use pytds and pandas to collect the data from SQL Server, and pandas_gbq.to_gbq to insert the data into Google BigQuery.
I have an issue with some tables that have string columns.
The error is raised when I call df = pd.read_sql(query, pool.connect())
Here is my code:
import pandas as pd
import pytds
import sqlalchemy
from google.cloud.sql.connector import Connector

# initialize Connector object
connector = Connector()

def getconn() -> pytds.Connection:
    conn = connector.connect(
        instance_connection_name,
        "pytds",
        user=db_user,
        password=db_pass,
        db=db_name,
        bytes_to_unicode=True
    )
    return conn

pool = sqlalchemy.create_engine(
    "mssql+pytds://",
    creator=getconn,
    pool_size=20,
    max_overflow=0,
    pool_timeout=600,
    pool_recycle=1800,
    connect_args={"encoding": "latin1"}
    # ...
)
query = 'select * from article'
df = pd.read_sql(query, pool.connect())  # Issue with some tables (like article in this example)
I have already tried specifying the database encoding in the create_engine call, but it is not sufficient.
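(In case it helps anyone debugging a similar decode error: one possible workaround, sketched below with made-up data, is to have the driver return string columns as raw bytes, e.g. by setting bytes_to_unicode=False, and then decode them explicitly in pandas, so a bad byte sequence can be handled per column instead of failing inside the driver. The DataFrame contents here are purely illustrative.)

```python
import pandas as pd

# Hypothetical example: df holds raw bytes in its string columns,
# as returned when the driver does not decode them itself.
df = pd.DataFrame({"name": [b"caf\xe9", b"na\xefve"]})

# Decode each all-bytes column explicitly; errors="replace" keeps the
# load running and marks undecodable bytes instead of raising.
for col in df.columns:
    if df[col].dtype == object and df[col].map(lambda v: isinstance(v, bytes)).all():
        df[col] = df[col].str.decode("latin1", errors="replace")

print(df["name"].tolist())  # ['café', 'naïve']
```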
Thanks
I was able to get it working with the following package versions:
I made a few slight changes to the code because the query has to be in sqlalchemy.text() format:
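For illustration, a minimal sketch of that change, using an in-memory SQLite engine as a stand-in (the mssql+pytds pool from the question isn't reproducible here); the table name and data are assumed:

```python
import pandas as pd
import sqlalchemy

# Stand-in engine; in the original code this would be the mssql+pytds pool.
engine = sqlalchemy.create_engine("sqlite://")

with engine.connect() as conn:
    conn.execute(sqlalchemy.text("create table article (id integer, label text)"))
    conn.execute(sqlalchemy.text("insert into article values (1, 'pencil')"))

    # With SQLAlchemy 2.x, a raw SQL string passed alongside a Connection
    # must be wrapped in sqlalchemy.text().
    df = pd.read_sql(sqlalchemy.text("select * from article"), conn)

print(df["label"].tolist())  # ['pencil']
```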