How can I prevent cached modules/variables when using runpy in pytest tests?


(Preface: This is a toy example to illustrate an issue that involves much larger scripts that use a ton of modules/libraries that I don't have control over)

Given these files:

# bar.py
barvar = []

def barfun():
    barvar.append(1)


# foo.py
import bar

foovar = []
def foofun():
    foovar.append(1)

if __name__ == '__main__':

    foofun()
    bar.barfun()
    foovar.append(2)
    bar.barvar.append(2)

    print(f'{foovar    =}')
    print(f'{bar.barvar=}')
    
# test_foo.py
import sys
import os
import pytest
import runpy

sys.path.insert(0,os.getcwd()) # so that "import bar" in foo.py works

@pytest.mark.parametrize('execution_number', range(5))
def test1(execution_number):
    print(f'\n{execution_number=}\n')
    sys.argv=[os.path.join(os.getcwd(),'foo.py')]
    runpy.run_path('foo.py',run_name="__main__")

If I now run pytest test_foo.py -s I will get:

========================================================================
platform win32 -- Python 3.10.8, pytest-7.2.0, pluggy-1.0.0
rootdir: C:\Temp
plugins: anyio-3.6.2
collected 5 items

test_foo.py
execution_number=0

foovar    =[1, 2]
bar.barvar=[1, 2]
.
execution_number=1

foovar    =[1, 2]
bar.barvar=[1, 2, 1, 2]
.
execution_number=2

foovar    =[1, 2]
bar.barvar=[1, 2, 1, 2, 1, 2]
.
execution_number=3

foovar    =[1, 2]
bar.barvar=[1, 2, 1, 2, 1, 2, 1, 2]
.
execution_number=4

foovar    =[1, 2]
bar.barvar=[1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
.

========================================================================

So barvar is remembering its previous content. This is obviously detrimental to testing.

Can it be prevented while still using runpy?

Understandably, the Python docs warn about runpy's side effects:

Note that this is not a sandbox module - all code is executed in the current process, and any side effects (such as cached imports of other modules) will remain in place after the functions have returned.

If this is tricky or too complicated to do reliably, are there alternatives? I am looking for the convenience of testing scripts that take arguments and produce stuff (usually files). My typical pytest test script sets up arguments via sys.argv, then runs the target script (a very large program with lots of imports) via runpy, and finally validates the generated files (e.g., compares them against a baseline for regression testing). There are many invocations within a single test run; hence the need for a clean slate.

subprocess.run(['python.exe', 'script.py', *arglist]) is another option I can think of.

Thanks.

2 Answers

Answer 1:

A simple, pragmatic solution: evict the cached bar module, if any, in test setup:

import sys

import pytest

@pytest.fixture(autouse=True)
def evict_bar():
    # Drop the cached module (if present) so the next run re-imports it
    sys.modules.pop("bar", None)

Answer 2:

If you cannot refactor your code and don't want to remove the module from sys.modules (both solutions would work), you can patch the barvar variable so that it starts as an empty list for each test execution:

import os
import sys
import runpy
from unittest import mock

import pytest

@pytest.mark.parametrize('execution_number', range(5))
def test1(execution_number):
    # Replace bar.barvar with a fresh empty list for the duration of this test
    with mock.patch('bar.barvar', new_callable=list):
        print(f'\n{execution_number=}\n')
        sys.argv = [os.path.join(os.getcwd(), 'foo.py')]
        runpy.run_path('foo.py', run_name="__main__")

Having said that ...

I strongly recommend you consider removing the global variables. If your code does not behave as expected under the test you designed, you should fix the code, not the test; that is what tests are for.

Looking at your test, you will be running your script more than once, and you expect barvar to be an empty list at the start of every run.

So your test is OK; you need to fix your code. ;)
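As a sketch of that recommendation (assuming a refactor is possible), bar.py could let callers own the state instead of mutating a module-level global:

```python
# bar.py, refactored: the caller owns the list, so no state
# survives between runs or between tests.
def barfun(barvar=None):
    """Append 1 to the given list (or a fresh one) and return it."""
    if barvar is None:
        barvar = []
    barvar.append(1)
    return barvar
```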