How can I use mamba/conda packages in RStudio Server?


I am trying to write a Dockerfile that installs a conda environment from a YAML file and runs RStudio Server from within my project directories. I am aware of rstudio-server-conda, but I would like to use a single Dockerfile to create the image.

Dockerfile

FROM rocker/rstudio-stable:devel

# Set working directory
WORKDIR ${HOME}

# Copy directory files to image 
COPY . ${HOME}

# Copy repo into ${HOME}, make user own $HOME
USER root

# Install base utilities
RUN apt-get update && \
    apt-get install -y wget && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Install miniconda
ENV CONDA_DIR /opt/conda
RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
    -O ~/miniconda.sh && \
     /bin/bash ~/miniconda.sh -b -p /opt/conda && \
     

# Put conda in path so we can use conda activate
ENV PATH=$CONDA_DIR/bin:$PATH

# Install umamba
RUN conda install -y micromamba -c conda-forge

# Create a conda environment from the environment yml
COPY --chown=$MAMBA_USER:$MAMBA_USER environment.yml /tmp/environment.yml
RUN micromamba create --yes --file /tmp/environment.yml && \
    micromamba  clean --all --yes

# Activate the conda environment
ARG MAMBA_DOCKERFILE_ACTIVATE=1 

RUN chown -R ${NB_USER} . ${HOME}
USER ${NB_USER}

# Settings required for conda+rstudio
ENV RSTUDIO_WHICH_R=${CONDAENV}/bin/R
ENV RETICULATE_PYTHON=${CONDAENV}/bin/python

RUN echo rsession-which-r=${RSTUDIO_WHICH_R} > /etc/rstudio/rserver.conf && \
    echo rsession-ld-library-path=${CONDAENV}/lib >> /etc/rstudio/rserver.conf && \
    echo "R_LIBS_USER=${CONDAENV}/lib/R/library" > /home/rstudio/.Renviron

## Run an install.R script, if it exists.
#RUN if [ -f /R/install.R ]; then R --quiet -f /R/install.R; fi

environment.yml

channels:
  - conda-forge
dependencies:
  - r-devtools=2.4.3=r41hc72bb7e_0
  - r-tidyverse=1.3.1=r41hc72bb7e_0

The Dockerfile installs RStudio Server and micromamba; however, when I attempt to load the packages listed in the environment YAML, they are not found.

# Build the image, then run the container
docker build --tag umamba-rstudio -f Dockerfile .
docker run --rm \
    -e ENV_NAME=environment \
    --mount type=bind,source="$(pwd)",destination=/home/rstudio \
    -p 127.0.0.1:8787:8787 \
    -e DISABLE_AUTH=true \
    umamba-rstudio
Answer

I will cover two approaches:

  1. Making minimal changes to your existing files
  2. Larger changes that result in an improved Dockerfile and image

Here is the Dockerfile with minimal changes; the three changes are marked with # DIFF: ... comments:

FROM rocker/rstudio-stable:devel

# Set working directory
WORKDIR ${HOME}

# Copy directory files to image 
COPY . ${HOME}

# Copy repo into ${HOME}, make user own $HOME
USER root

# Install base utilities
RUN apt-get update && \
    apt-get install -y wget && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Install miniconda
ENV CONDA_DIR /opt/conda
RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
    -O ~/miniconda.sh && \
     /bin/bash ~/miniconda.sh -b -p /opt/conda # DIFF: removed " && \" from end of line
     

# Put conda in path so we can use conda activate
ENV PATH=$CONDA_DIR/bin:$PATH

# Install umamba
RUN conda install -y micromamba -c conda-forge

# Create a conda environment from the environment yml
# DIFF: added the ENV MAMBA_ROOT_PREFIX line below
ENV MAMBA_ROOT_PREFIX="/opt/conda"
COPY --chown=$MAMBA_USER:$MAMBA_USER environment.yml /tmp/environment.yml
# DIFF: added --prefix flag and argument ("create" also became "install")
RUN micromamba install --prefix "$MAMBA_ROOT_PREFIX" --yes --file /tmp/environment.yml && \
    micromamba clean --all --yes

# Activate the conda environment
ARG MAMBA_DOCKERFILE_ACTIVATE=1 

RUN chown -R ${NB_USER} . ${HOME}
USER ${NB_USER}

# Settings required for conda+rstudio
ENV RSTUDIO_WHICH_R=${CONDAENV}/bin/R
ENV RETICULATE_PYTHON=${CONDAENV}/bin/python

RUN echo rsession-which-r=${RSTUDIO_WHICH_R} > /etc/rstudio/rserver.conf && \
    echo rsession-ld-library-path=${CONDAENV}/lib >> /etc/rstudio/rserver.conf && \
    echo "R_LIBS_USER=${CONDAENV}/lib/R/library" > /home/rstudio/.Renviron

## Run an install.R script, if it exists.
#RUN if [ -f /R/install.R ]; then R --quiet -f /R/install.R; fi
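
One caveat: CONDAENV is not assigned anywhere in the files shown. Unless the base image or a build argument supplies it (an assumption; I have not verified what rocker/rstudio-stable:devel exports), it expands to an empty string, so the paths written to rserver.conf lose their prefix. The sketch below reproduces that expansion outside Docker. render_rserver_conf is a hypothetical helper name, and /opt/conda is assumed only because it matches MAMBA_ROOT_PREFIX above.

```shell
#!/bin/sh
# Hypothetical helper: prints the two rserver.conf lines that the RUN step
# writes, given the value CONDAENV would have at build time.
render_rserver_conf() {
    conda_env=$1
    echo "rsession-which-r=${conda_env}/bin/R"
    echo "rsession-ld-library-path=${conda_env}/lib"
}

# An unset variable expands to the empty string; modelled here by passing "".
unset_output=$(render_rserver_conf "")

# Assumption: the environment lives in /opt/conda (the MAMBA_ROOT_PREFIX value).
set_output=$(render_rserver_conf /opt/conda)

printf 'CONDAENV unset:\n%s\n\nCONDAENV=/opt/conda:\n%s\n' "$unset_output" "$set_output"
```

If the first form is what ends up in /etc/rstudio/rserver.conf, adding a line such as ENV CONDAENV=/opt/conda before CONDAENV is first used may be enough, though that is a guess about the intended layout rather than something taken from the original files.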

However, I'd recommend more substantial changes that make the Dockerfile easier to maintain, decrease build time, and produce smaller images. The Dockerfile below is largely based on the instructions from mamba-org/micromamba-docker for adding micromamba to an existing Docker image (disclosure: I maintain mamba-org/micromamba-docker).

# bring in the micromamba image so we can copy files from it
FROM mambaorg/micromamba:0.24.0 as micromamba

# This is the image we are going to add micromamba to:
FROM rocker/rstudio-stable:devel

ARG MAMBA_USER=root
ARG MAMBA_USER_ID=0
ARG MAMBA_USER_GID=0
ENV MAMBA_USER=$MAMBA_USER
ENV MAMBA_ROOT_PREFIX="/opt/conda"
ENV MAMBA_EXE="/bin/micromamba"

COPY --from=micromamba "$MAMBA_EXE" "$MAMBA_EXE"
COPY --from=micromamba /usr/local/bin/_activate_current_env.sh /usr/local/bin/_activate_current_env.sh
COPY --from=micromamba /usr/local/bin/_dockerfile_shell.sh /usr/local/bin/_dockerfile_shell.sh
COPY --from=micromamba /usr/local/bin/_entrypoint.sh /usr/local/bin/_entrypoint.sh
COPY --from=micromamba /usr/local/bin/_dockerfile_initialize_user_accounts.sh /usr/local/bin/_dockerfile_initialize_user_accounts.sh
COPY --from=micromamba /usr/local/bin/_dockerfile_setup_root_prefix.sh /usr/local/bin/_dockerfile_setup_root_prefix.sh

RUN /usr/local/bin/_dockerfile_initialize_user_accounts.sh && \
    /usr/local/bin/_dockerfile_setup_root_prefix.sh && \
    echo rsession-which-r=${RSTUDIO_WHICH_R} > /etc/rstudio/rserver.conf && \
    echo rsession-ld-library-path=${CONDAENV}/lib >> /etc/rstudio/rserver.conf && \
    echo "R_LIBS_USER=${CONDAENV}/lib/R/library" > /home/rstudio/.Renviron

SHELL ["/usr/local/bin/_dockerfile_shell.sh"]

ENTRYPOINT ["/usr/local/bin/_entrypoint.sh"]

# populate the "base" conda environment:
USER $MAMBA_USER
COPY --chown=$MAMBA_USER:$MAMBA_USER environment.yml /tmp/environment.yml
RUN micromamba install --yes --file /tmp/environment.yml && \
    micromamba  clean --all --yes

WORKDIR ${HOME}

# Copy directory files to image 
COPY --chown=$MAMBA_USER_ID:$MAMBA_USER_GID . ${HOME}

# Settings required for conda+rstudio
ENV RSTUDIO_WHICH_R=${CONDAENV}/bin/R
ENV RETICULATE_PYTHON=${CONDAENV}/bin/python

CMD ["/init"]
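
After building either image, a quick sanity check is to confirm that the conda prefix actually contains an R binary and a package library before digging into RStudio's configuration. The following is a minimal sketch, not part of micromamba or rocker: check_conda_r is a hypothetical helper, /opt/conda is assumed from MAMBA_ROOT_PREFIX, and lib/R/library is taken from the R_LIBS_USER path used above.

```shell
#!/bin/sh
# Hypothetical helper: report whether a conda prefix contains an R binary
# and an R package library. Returns non-zero if either is missing.
check_conda_r() {
    prefix=$1
    if [ ! -x "$prefix/bin/R" ]; then
        echo "no R binary at $prefix/bin/R"
        return 1
    fi
    if [ ! -d "$prefix/lib/R/library" ]; then
        echo "no package library at $prefix/lib/R/library"
        return 1
    fi
    echo "R found at $prefix/bin/R"
}
```

To use it, open a shell in the running container (for example with docker exec -it <container> bash), paste the function, and call check_conda_r /opt/conda. If it reports a missing binary or library, the environment was not installed where the RStudio settings expect it.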