I have a table with over 10 million rows in Dremio, and I have connected to it from Python using pyodbc. I want to run a simple query like the one shown below:
SELECT REPORTDATE, TRANSDATE
FROM TABLE
WHERE TRANSDATE = '2020-01-05'
The issue is that it takes forever to run this query via Python. What would be the solution for this?
I would recommend using SQLAlchemy or pandas to make the call.
Personally, I use pandas (below example uses cx_Oracle since we use Oracle servers):
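A minimal sketch of that kind of pandas call, under stated assumptions: an in-memory sqlite3 database stands in for the real connection (with pyodbc or cx_Oracle you would pass that connection object instead), and the table name `transactions` and its sample rows are hypothetical. The date is passed as a bound parameter rather than pasted into the SQL string.

```python
import sqlite3

import pandas as pd

# Stand-in connection (assumption): replace with your real pyodbc / cx_Oracle connection.
conn = sqlite3.connect(":memory:")

# Hypothetical table and sample rows, only so the sketch runs end to end.
conn.execute("CREATE TABLE transactions (REPORTDATE TEXT, TRANSDATE TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        ("2020-01-06", "2020-01-05"),
        ("2020-01-07", "2020-01-06"),
        ("2020-01-08", "2020-01-05"),
    ],
)

# "?" is the qmark placeholder style, which sqlite3 and pyodbc both use.
query = "SELECT REPORTDATE, TRANSDATE FROM transactions WHERE TRANSDATE = ?"

# read_sql runs the query and returns the whole result set as one DataFrame.
df = pd.read_sql(query, conn, params=("2020-01-05",))
print(df)
```

Selecting only the columns you need and filtering in the WHERE clause, as the query already does, keeps the amount of data sent to Python small.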
If it still runs too slowly, you can pull the data in chunks using the chunksize parameter of read_sql: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_sql.html
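As a rough illustration of chunked reading, here is a sketch that again uses an in-memory sqlite3 database as a stand-in for the real connection; the `transactions` table, its rows, and the chunk size of 4 are assumptions chosen for the demo. When chunksize is set, read_sql returns an iterator of DataFrames instead of a single DataFrame.

```python
import sqlite3

import pandas as pd

# Stand-in connection (assumption): replace with your real pyodbc / cx_Oracle connection.
conn = sqlite3.connect(":memory:")

# Hypothetical table with 10 matching rows, only so the sketch runs end to end.
conn.execute("CREATE TABLE transactions (REPORTDATE TEXT, TRANSDATE TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [("2020-01-06", "2020-01-05")] * 10,
)

query = "SELECT REPORTDATE, TRANSDATE FROM transactions WHERE TRANSDATE = ?"

# Each iteration yields a DataFrame with at most `chunksize` rows.
chunk_sizes = []
for chunk in pd.read_sql(query, conn, params=("2020-01-05",), chunksize=4):
    # Process or write out each chunk here instead of holding everything in memory.
    chunk_sizes.append(len(chunk))

total_rows = sum(chunk_sizes)
print(chunk_sizes, total_rows)
```

For a real table a chunk size in the tens of thousands is a more typical starting point. Chunking mainly bounds memory use on the Python side; whether it shortens the overall wait depends on the driver and the server.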