The official MLflow docs suggest taking a look at this repo for an example of how to share an MLflow Project in a Docker environment. In this repo, they suggest we run:
mlflow run examples/docker -P alpha=0.5
However, looking at the logs, I see that what's run is:
docker run --rm -v /path/to/mlruns:/mlflow/tmp/mlruns -v /path/to/mlruns/0/da804b67a0af4fd68fc9d0f4a1d724b0/artifacts:/path/to/mlflow_exploration/mlruns/0/da804b67a0af4fd68fc9d0f4a1d724b0/artifacts -e MLFLOW_RUN_ID=da804b67a0af4fd68fc9d0f4a1d724b0 -e MLFLOW_TRACKING_URI=file:///mlflow/tmp/mlruns -e MLFLOW_EXPERIMENT_ID=0 mlflow-docker-wine:latest python3 /path/to/train.py --alpha 0.5
This already fixes the experiment ID and run ID, which makes it impossible to define an experiment and run name...
I've tried exporting the environment variables in the Dockerfile:
RUN mkdir ./sharing_with_docker && export MLFLOW_EXPERIMENT_NAME="SHARING_WITH_DOCKER_EXPERIMENT" \
    MLFLOW_RUN_NAME="sharing_with_docker_run"
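(I realise that export inside a RUN layer only affects that build step. As far as I know, the only way to bake the variables into the image itself would be ENV instructions, roughly as in the sketch below, but then the values are fixed at build time rather than chosen per run:)

# Sketch: ENV persists into the container runtime, unlike export in a RUN layer,
# but the values are hard-coded when the image is built.
ENV MLFLOW_EXPERIMENT_NAME="SHARING_WITH_DOCKER_EXPERIMENT"
ENV MLFLOW_RUN_NAME="sharing_with_docker_run"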
And also in the MLproject file:
environment: [
  ["MLFLOW_EXPERIMENT_NAME", "SHARING_WITH_DOCKER_EXPERIMENT"],
  ["MLFLOW_RUN_NAME", "sharing_with_docker_run"]
]
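For context, that environment block sits under docker_env in my MLproject file, which looks roughly like this (the image name is the one from the logs above; the entry point command is abbreviated, not my exact file):

name: docker-example

docker_env:
  image: mlflow-docker-wine
  environment: [
    ["MLFLOW_EXPERIMENT_NAME", "SHARING_WITH_DOCKER_EXPERIMENT"],
    ["MLFLOW_RUN_NAME", "sharing_with_docker_run"]
  ]

entry_points:
  main:
    parameters:
      alpha: float
    command: "python3 train.py --alpha {alpha}"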
Neither of these options worked... I had to run:
docker run -e MLFLOW_EXPERIMENT_NAME="SHARING_WITH_DOCKER_EXPERIMENT" -e MLFLOW_RUN_NAME="sharing_with_docker" mlflow-docker-wine python3 my_docker_image/train.py --alpha 0.5
This works, but the problem is that I lose control over the volumes... So, as soon as the Python script finishes and the container exits, we lose all the data. I could use docker exec to get a shell inside the container, but I'm trying to avoid that.
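In principle I could re-add the mounts from the logged command by hand, roughly like this (using the same placeholder paths as above), but then I'm reimplementing what mlflow run already does for me:

docker run --rm \
  -v /path/to/mlruns:/mlflow/tmp/mlruns \
  -e MLFLOW_TRACKING_URI=file:///mlflow/tmp/mlruns \
  -e MLFLOW_EXPERIMENT_NAME="SHARING_WITH_DOCKER_EXPERIMENT" \
  -e MLFLOW_RUN_NAME="sharing_with_docker" \
  mlflow-docker-wine python3 my_docker_image/train.py --alpha 0.5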
Is there a way to define these variables, either in the MLproject file or in the mlflow run command?