I would like to set up one central Prefect project to which independent flows can be added over time. The project structure looks something like this:
prefect/
├── src/
│ ├── flows/
│ │ ├── test_pack1/
│ │ │ ├── common/
│ │ │ │ ├── __init__.py
│ │ │ │ └── test_module.py
│ │ │ ├── .env
│ │ │ ├── __init__.py
│ │ │ ├── requirements.txt
│ │ │ └── test_pack1_flow.py
│ │ ├── test_pack2/
│ │ │ ├── __init__.py
│ │ │ ├── .env
│ │ │ ├── requirements.txt
│ │ │ └── test_pack2_flow.py
│ │ ├── __init__.py
│ │ └── Dockerfile
│ ├── utilities/
│ │ ├── __init__.py
│ │ ├── storage.py
│ │ ├── builder.py
│ │ ├── executor.py
│ │ └── run_config.py
│ ├── .env
│ ├── __init__.py
│ └── main.py
├── .gitignore
├── poetry.lock
└── pyproject.toml
I would like each flow in the flows/ folder to be independent of the central project and built as a separate Docker container.
At startup, builder.py searches for all flows in the flows/ folder, sets a specific configuration and registers them on the server.
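To make the discovery step concrete, here is a minimal sketch of how it could look. It is not my actual builder.py: the function name discover_flow_files and the *_flow.py naming pattern are assumptions based on the tree above.

```python
from pathlib import Path
from typing import List


def discover_flow_files(flows_dir: Path, pattern: str = "*_flow.py") -> List[Path]:
    """Return the flow files found one level below flows_dir, sorted by path.

    Only direct subfolders (test_pack1/, test_pack2/, ...) are searched, so
    helper modules in nested folders such as common/ are not picked up.
    """
    return sorted(
        path
        for path in flows_dir.glob(f"*/{pattern}")
        if path.is_file()
    )
```

Each returned path would then be handed to the code that extracts and registers the flow.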
But I ran into a problem with importing third-party packages. Say the requirements.txt in test_pack1/ contains SQLAlchemy==1.4.34, test_pack1/common/test_module.py contains import sqlalchemy, and test_pack1/test_pack1_flow.py has a @task that uses a function from test_module.py. When the FlowBuilder class looks for the flow variable in test_pack1_flow.py, it calls flow = extract_flow_from_file(str(flow_module)). At this step a ModuleNotFoundError occurs, since that dependency is not installed in the central Prefect application (it is not in pyproject.toml). But once the Docker container is created, after flow.register(), it will of course be there. How can I handle this step? Or maybe I'm doing something wrong?
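The failure can be reproduced without Prefect, because loading a flow file executes its top-level imports in the current environment. The sketch below uses only the standard library; the helper name missing_dependency is hypothetical, and it only illustrates where the error comes from rather than how Prefect itself loads the file.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def missing_dependency(module_path: Path) -> Optional[str]:
    """Execute a Python file and return the name of the first missing import.

    Returns None when the file imports cleanly in the current environment.
    """
    spec = importlib.util.spec_from_file_location(module_path.stem, module_path)
    module = importlib.util.module_from_spec(spec)
    try:
        # Runs the file's top-level code, including every import statement.
        spec.loader.exec_module(module)
    except ModuleNotFoundError as exc:
        # exc.name holds the module that could not be found, e.g. "sqlalchemy".
        return exc.name
    return None
```

A check like this could be run on a flow file before registration to report which package the central environment lacks, instead of failing with a bare traceback.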
I use Docker Storage, Docker Run and Local Executor.
This is a matter of packaging flow code dependencies, and it's all definitely doable. Since this was cross-posted on Prefect Discourse, I responded in much more detail there.
Here is a short summary:
setup.py