I have a Python script and want to query a database with duckdb.
Here is my main.py file:
import duckdb
import pandas as pd
import utils.func
con = duckdb.connect(database="db.duckdb", read_only=False)
data = {
"id": ["1","2","3","4","5","6",],
"group": ["A","A","A","B","B","B",],
"names": ["aze","www","ttt","xxxxxx","llllll","ggggggg",],
}
data_df = pd.DataFrame(data)
con.execute("CREATE TABLE IF NOT EXISTS data AS SELECT * FROM data_df")
my_query="select * from data"
res = utils.func.query(con, my_query)
print(res)
As you can see, I have a function to execute my query. This function takes 2 arguments (the connection to my db and the query).
This function is written in another file named func.py, and I import it as a module.
Here is func.py:
def query(DB_connection, query):
    print('appel')
    result = DB_connection.execute(query).df()
    return result
In func.py, I did not import duckdb, nor pandas. There is only the function.
With the connection to the db, I use the method execute() from duckdb to execute my query, and then I use the method df() from pandas to turn the result into a dataframe.
When I run main.py, it works, but I am surprised: I thought it should not work, because func.py is not supposed to know how to use the method execute() or the method df().
Does it mean I only need to import modules in the main file?
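If I got your question right: your code works, but not because main.py already imported the libraries. An import only binds a name (such as duckdb or pd) in the module that contains the import statement; it does not make that name visible in other modules.
func.py never uses the names duckdb or pandas. It only calls execute() on the connection object it receives, and df() on whatever execute() returns. Both are DuckDB methods (df() just happens to return a pandas DataFrame). Python looks methods up on the object at run time, so the module that calls them does not need to import the library that defines them.
Python does cache every imported module in sys.modules, so a library is loaded only once per process, but each module still needs its own import statement to refer to that library by name.
So no, importing in the main file is not enough in general. As soon as func.py refers to duckdb or pd directly (for example duckdb.connect(...), pd.DataFrame(...) or a type hint), it must import them itself, otherwise you get a NameError. Import in each file whatever that file uses by name; that also keeps func.py reusable from other scripts and keeps IDEs from flagging undefined names.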
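Here is a minimal sketch of when a module needs its own import and when it does not. To keep it runnable without extra packages, it uses the standard-library sqlite3 in place of duckdb and fetchall() in place of df(); this is an assumption made for illustration, not your actual setup. The helper module is built in memory from a string and stands in for func.py, and connect_here is a hypothetical function added only to trigger the failure case.

```python
import sqlite3
import types

# Source of a stand-in for func.py. Note that it imports nothing.
FUNC_SOURCE = '''
def query(db_connection, sql):
    # Only methods of the object passed in are used here,
    # so no module name has to be known in this file.
    return db_connection.execute(sql).fetchall()

def connect_here():
    # This line uses the NAME "sqlite3", which this module never imported.
    return sqlite3.connect(":memory:")
'''

# Build a separate module namespace, as "import func" would do for a real file.
func = types.ModuleType("func")
exec(FUNC_SOURCE, func.__dict__)

# The "main" side: this is the only place where sqlite3 is imported.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data (id TEXT, grp TEXT)")
con.execute("INSERT INTO data VALUES ('1', 'A'), ('2', 'B')")

# Works: query() only calls methods on the connection it was given.
rows = func.query(con, "SELECT * FROM data ORDER BY id")
print(rows)

# Fails: the import done here in "main" is not visible inside func.
try:
    func.connect_here()
    name_error = None
except NameError as exc:
    name_error = str(exc)
print(name_error)
```

The first call succeeds and the second raises NameError, even though sqlite3 was imported by the calling code. Passing an object into a function is enough to call its methods; using a library by name requires an import in that same file.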