How do I pin versioned dependencies in Python when using both conda and pip?


I'm trying to follow the best practice of installing fully pinned dependencies (for repeatable builds and better Docker caching, see this pythonspeed.com article).

My project needs to use both conda and pip (conda for complex ML packages, pip for stuff not available on conda). The conda-lock and pip-compile tools are able to generate all transitive dependencies at pinned versions. However, these tools are independent: when I run pip-compile, it's not aware of the dependencies that conda-lock wants to install, and vice versa.

This results in different package versions, causing wasted space in the Docker image and potentially causing breakage/incompatibility, as the pip install step installs different versions of some transitive dependencies.
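To make the mismatch concrete, here is a minimal Python sketch that compares two sets of pins and reports packages locked at different versions. It assumes the conda side is available as `name=version=build` strings (the form `conda env export` emits) and the pip side as `name==version` lines (the form pip-compile writes). The function names are hypothetical, and the sketch only catches packages that share a name on both sides; it does not handle packages whose conda name differs from the PyPI name.

```python
import re

# Matches "name==version", optionally with extras like name[extra]==version.
PIP_PIN = re.compile(r"^([A-Za-z0-9._-]+)(?:\[[^\]]*\])?==([^\s;\\]+)")


def normalize(name):
    """Lowercase and collapse runs of '-', '_', '.' into '-' (PEP 503 style)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_conda_pins(lines):
    """Parse 'name=version' or 'name=version=build' strings into {name: version}."""
    pins = {}
    for line in lines:
        parts = line.strip().split("=")
        # A pip-style 'name==version' yields an empty parts[1] and is skipped.
        if len(parts) >= 2 and parts[0] and parts[1]:
            pins[normalize(parts[0])] = parts[1]
    return pins


def parse_pip_pins(lines):
    """Parse 'name==version' lines into {name: version}; other lines are ignored."""
    pins = {}
    for line in lines:
        match = PIP_PIN.match(line.strip())
        if match:
            pins[normalize(match.group(1))] = match.group(2)
    return pins


def find_conflicts(conda_pins, pip_pins):
    """Return {name: (conda_version, pip_version)} for packages pinned differently."""
    shared = sorted(conda_pins.keys() & pip_pins.keys())
    return {
        name: (conda_pins[name], pip_pins[name])
        for name in shared
        if conda_pins[name] != pip_pins[name]
    }


# Made-up pins, for illustration only.
conda_side = parse_conda_pins(["numpy=1.21.2=py39h20f2e39_0", "pandas=1.3.3=py39h8c16a72_0"])
pip_side = parse_pip_pins(["numpy==1.22.0", "pandas==1.3.3", "pyplot_themes==0.2.2"])
print(find_conflicts(conda_side, pip_side))
```

Any package reported here would be installed once by conda and then replaced by pip at a different version.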

Does anyone have a better solution for creating pinned Python dependency lists when using both conda and pip?

(Edit: here's a github ticket on conda-lock to support pip dependencies: https://github.com/conda-incubator/conda-lock/issues/4)

There are 1 best solutions below


Instead of using a tool that solves the dependencies, you could just install all the dependencies and then use conda env export to generate a pinned, versioned environment file.

Main downside: this is heavier weight, as it actually has to install all the dependencies. On the upside, you end up with a single environment "spec" file as input and a single environment "lock" file as output.

Specify direct dependencies in environment-spec.yml

Specify both conda and pip dependencies together. Example:

name: base
channels:
  - conda-forge
  - defaults
  # etc.
dependencies:
  - matplotlib
  - pandas
  - pip  # needed to have a pip section below
  - scikit-learn
  - pip:
    - pyplot_themes  # only available on PyPI

Install dependencies and export pinned versions (including transitive dependencies)

This could be done directly on your local machine, but here's how to isolate this process in Docker:

# syntax=docker/dockerfile:1

# Note: using miniconda instead of micromamba because micromamba lacks the
# `conda env export` command.
FROM continuumio/miniconda3:4.9.2

COPY environment-spec.yml /environment-spec.yml
# mounts are for conda caching and pip caching
RUN --mount=type=cache,target=/opt/conda/pkgs --mount=type=cache,target=/root/.cache \
    conda env create -n regen_env --file /environment-spec.yml

# Export pinned dependencies (direct and transitive, conda and pip).
RUN conda env export -n regen_env > /environment-lock.yml
# Running the container prints the lock file to stdout.
CMD ["cat", "/environment-lock.yml"]

Then you can create a pinned environment file like so (assuming the above dockerfile was named regen_environment.Dockerfile):

docker build -t regen_env -f regen_environment.Dockerfile .
docker run --rm regen_env > environment-lock.yaml

This outputs the pinned environment file to environment-lock.yaml, which you can then install with conda env create -f environment-lock.yaml.
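Before committing the lock file, it can help to tidy and sanity-check it. The following is a minimal Python sketch, not part of the original workflow. It assumes the file has the layout `conda env export` normally produces: a top-level `dependencies:` list with conda entries like `name=version=build`, a nested `pip:` list with `name==version` entries, and a trailing machine-specific `prefix:` line. The function names are hypothetical, and the parsing is line-based rather than a real YAML parser.

```python
def strip_prefix(lock_text):
    """Drop the top-level 'prefix:' line, which records a machine-specific path."""
    kept = [line for line in lock_text.splitlines() if not line.startswith("prefix:")]
    return "\n".join(kept) + "\n"


def unpinned_entries(lock_text):
    """Return dependency entries under 'dependencies:' that carry no version pin."""
    unpinned = []
    in_deps = False
    for line in lock_text.splitlines():
        if not line.strip():
            continue
        if not line.startswith((" ", "-")):
            # A new top-level key: track whether we are inside 'dependencies:'.
            in_deps = line.startswith("dependencies:")
            continue
        stripped = line.strip()
        if not in_deps or not stripped.startswith("- "):
            continue
        entry = stripped[2:].split("#", 1)[0].strip()
        if entry.endswith(":"):
            continue  # the nested 'pip:' header, not a package
        if "=" not in entry:
            unpinned.append(entry)
    return unpinned


# Made-up lock file contents, for illustration only.
sample = """name: regen_env
channels:
  - conda-forge
dependencies:
  - pandas=1.3.3=py39h8c16a72_0
  - pip
  - pip:
    - pyplot-themes==0.2.2
prefix: /opt/conda/envs/regen_env
"""

print(strip_prefix(sample))
print(unpinned_entries(sample))
```

An empty list from `unpinned_entries` suggests every conda and pip entry in the file is pinned; anything it returns is worth checking by hand.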

(Here's a gist with some more references and details: https://gist.github.com/jli/b2d2d62ad44b7fcb5101502c08dca1ae)