Why don't I need to import modules in scripts which are not the main files?

I have a Python script and want to query a database with DuckDB. Here is my main.py file:

import duckdb
import pandas as pd
import utils.func

con = duckdb.connect(database="db.duckdb", read_only=False)

data = {
    "id": ["1","2","3","4","5","6",],
    "group": ["A","A","A","B","B","B",],
    "names": ["aze","www","ttt","xxxxxx","llllll","ggggggg",],
}

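data_df = pd.DataFrame(data)
# DuckDB resolves data_df in the SQL below to the local pandas DataFrame.
con.execute("CREATE TABLE IF NOT EXISTS data AS SELECT * FROM data_df")

my_query = "select * from data"

# Pass both the open connection and the SQL text to the helper in utils.func.
res = utils.func.query(con, my_query)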
print(res)

As you can see, I have a function to execute my query. This function takes 2 arguments: the connection to my database and the query.

This function is written in another file named func.py and I import it as a module.

Here is func.py:

def query(DB_connection, query):
    print('appel')
    result = DB_connection.execute(query).df()
    return result

In func.py, I did not import duckdb or pandas. There is only the function. With the connection to the database, I use the execute() method from duckdb to run my query, and then I use the df() method from pandas to turn the result into a dataframe.

When I run main.py, it works. That surprises me: I expected it to fail, because func.py is not supposed to know how to use the execute() method or the df() method.

Does it mean I only need to import modules in the main file?

There are 2 best solutions below

If I understood your question correctly, the answer is no. Your code works because func.py never refers to the names duckdb or pandas; it only calls methods on the connection object that main.py passes in.

Imports in Python are per module: an import statement binds the name only in the namespace of the module that runs it. The interpreter does cache loaded modules in sys.modules, so importing the same library again elsewhere is cheap, but that cache does not make the name duckdb or pd visible inside func.py.

That said, I would not rely on main.py's imports. As soon as func.py uses one of those names directly (for example pd.DataFrame or duckdb.connect), it will raise a NameError unless func.py imports the library itself, and IDEs and linters will flag the missing import.

So every module should import the names it uses itself, regardless of what the main file imports.
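
To check how import names are scoped, here is a minimal sketch. It is not your project: the module names demo_main and demo_helper are made up for illustration, and the standard-library json module stands in for duckdb so the snippet runs without any third-party package. The two modules are built in memory with types.ModuleType instead of living in separate files.

```python
import sys
import types

# Hypothetical "main" module: it imports json for its own use.
main_mod = types.ModuleType("demo_main")
exec("import json", main_mod.__dict__)

# Hypothetical helper module: it uses the name json without importing it.
helper_mod = types.ModuleType("demo_helper")
exec("def dump(obj):\n    return json.dumps(obj)\n", helper_mod.__dict__)

try:
    helper_mod.dump({"a": 1})
    outcome = "worked"
except NameError as exc:
    # The helper's global namespace has no binding for json.
    outcome = f"NameError: {exc}"

print("json" in sys.modules)        # the loaded module is cached interpreter-wide
print(hasattr(main_mod, "json"))    # the name is bound in the importing module
print(hasattr(helper_mod, "json"))  # but not in the helper module
print(outcome)
```

The helper fails only because it refers to the name json itself. A function that merely calls methods on an object it receives, like query() in the question, never needs that name.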

0
On

The func module, or more precisely the query() function, doesn't need to know how to use specific methods; it just needs to know how to call methods. It receives an object DB_connection as an argument and calls that object's execute() method, passing along query, which also arrived as an argument. What happens then is the responsibility of the DB_connection object. There's no need to know what type that object actually is, as long as it has an execute() method that behaves as expected and returns something with a df() method.

If a method doesn't exist, you'll get an AttributeError for that method. If a method does exist but is called with the wrong arguments, it will most likely raise an exception because of that.

A module needs to know nothing more than this about the objects passed as arguments to the functions or methods it defines.
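
To illustrate the point, here is a small sketch that runs the same function against objects that have nothing to do with duckdb. FakeConnection, FakeResult and NoExecute are hypothetical stand-ins invented for this example, and the print call from the original function is left out to keep the output quiet.

```python
def query(DB_connection, query):
    # Same body as func.py in the question, minus the print call.
    result = DB_connection.execute(query).df()
    return result


class FakeResult:
    """Hypothetical result object: all it offers is a df() method."""

    def df(self):
        return [("1", "A", "aze")]


class FakeConnection:
    """Hypothetical connection: all it offers is an execute() method."""

    def __init__(self):
        self.seen = []

    def execute(self, sql):
        self.seen.append(sql)  # remember what was asked
        return FakeResult()


class NoExecute:
    """Hypothetical object that lacks execute() entirely."""


# Works: the object has execute(), and its return value has df().
rows = query(FakeConnection(), "select * from data")

# Fails at call time: the attribute lookup for execute raises AttributeError.
try:
    query(NoExecute(), "select * from data")
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__

print(rows)
print(error_name)
```

Nothing in this snippet imports duckdb or pandas, yet query() runs unchanged, because the only thing it requires is that the methods exist on whatever object it is given.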

Here is the disassembled bytecode in Python 3.8 for that function:

>>> dis.dis(query)
  2           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 ('appel')
              4 CALL_FUNCTION            1
              6 POP_TOP

  3           8 LOAD_FAST                0 (DB_connection)
             10 LOAD_METHOD              1 (execute)
             12 LOAD_FAST                1 (query)
             14 CALL_METHOD              1
             16 LOAD_METHOD              2 (df)
             18 CALL_METHOD              0
             20 STORE_FAST               2 (result)

  4          22 LOAD_FAST                2 (result)
             24 RETURN_VALUE

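Line 3 of the function (the DB_connection.execute(query).df() call) is compiled to nothing more than method lookups by name: LOAD_METHOD execute and LOAD_METHOD df. Nothing in the bytecode refers to duckdb or pandas.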
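
If you want to reproduce this yourself, a sketch along these lines should work. The opcode names printed by dis differ between Python versions (newer releases no longer use LOAD_METHOD or CALL_METHOD), so the snippet also inspects the code object's name tables, which are easier to compare across versions.

```python
import dis


def query(DB_connection, query):
    print('appel')
    result = DB_connection.execute(query).df()
    return result


# Global and attribute names the function looks up at run time.
names = query.__code__.co_names
# Local variable names, starting with the parameters.
local_names = query.__code__.co_varnames

print(names)
print(local_names)

# Full listing; the exact opcodes depend on the interpreter version.
dis.dis(query)
```

The only names the function refers to are print, execute and df. There is no mention of duckdb or pandas anywhere in the compiled function, which is why it needs no import for them.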