I'm trying to build an image based on R-base, following the multi-stage method. How can I copy the installed packages from the 1st stage into the 2nd stage, and nothing else?

The current file basically gives me a 'packageless' R-base image, so the packages installed in the 1st stage are 'lost' somewhere.

I think it has something to do with making and choosing the correct directories. This is a confusing part for me, since I'm fairly new to dockerizing applications.

Thanks for all your help!

Below is my current file:

# Base image
FROM rocker/r-base:latest AS stage1

## install binary, build and dependent packages
RUN apt-get update && apt-get install -y -qq --no-install-recommends --purge \
r-cran-pdftools \
r-cran-dplyr \
r-cran-stringr \
libxml2-dev \
libssl-dev && \
echo "r <- getOption('repos');r['CRAN'] <- 'http://cran.us.r-project.org'; options(repos = r);" > ~/.Rprofile && \
Rscript -e "install.packages(c('AzureStor'))"

##2nd stage, pulling 'fresh' base image
FROM rocker/r-base:latest

#COPY packages from 1st stage
COPY --from=stage1 /usr/local/lib/R/site-library /usr/local/lib/R/site-library

## create directories
RUN mkdir -p /script

#Copy scripts
COPY /script /script

## Set workdir
WORKDIR /script

Best answer:

In the comments you note that you want to get rid of any excess 'weight'. That weight typically comes from installed development tools and packages. The rocker/r-base image already carries quite a bit of it, since it has r-base-dev and its dependencies installed. However, we can avoid adding more by keeping only the run-time dependencies in the final image and dropping the build-time dependencies. Build-time dependencies that an R package does not need at run-time are typically development files such as header files for system libraries. For example, you don't need the libxml2-dev package at run-time; the libxml2 package is enough. I see several possible approaches to this.

First, you could use binary packages for those packages that need compilation against system libraries. I have not checked the dependencies of AzureStor, but it might well be that all the required R packages exist as compiled Debian packages. These depend only on the run-time dependencies, which keeps the image size small and the build time short. Your Dockerfile would look something like this:

FROM rocker/r-base:latest

## install binary, build and dependent packages
RUN apt-get update && apt-get install -y -qq --no-install-recommends --purge \
    r-cran-pdftools \
    r-cran-dplyr \
    r-cran-stringr \
    r-cran-... \
    r-cran-... && \
    Rscript -e "install.packages(c('AzureStor'))" && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* && \
    rm -rf /tmp/*

## create directories
RUN mkdir -p /script 

#Copy scripts
COPY /script /script

## Set workdir
WORKDIR /script

Second, you could install both build- and run-time dependencies before installing the R packages from source and remove the build-time dependencies afterwards, all within one command:

FROM rocker/r-base:latest

## install binary, build and dependent packages
RUN apt-get update && apt-get install -y -qq --no-install-recommends --purge \
    r-cran-pdftools \
    r-cran-dplyr \
    r-cran-stringr \
    libxml2-dev libxml2 \
    libssl-dev libssl1.1 && \
    Rscript -e "install.packages(c('AzureStor'))" && \
    apt-get purge --yes libxml2-dev libssl-dev && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* && \
    rm -rf /tmp/*


## create directories
RUN mkdir -p /script 

#Copy scripts
COPY /script /script

## Set workdir
WORKDIR /script
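
Doing this "all within one command" matters because each RUN instruction creates its own image layer, and files deleted in a later layer still occupy space in the earlier one. As a sketch of the pattern to avoid (same packages as above, only split across two instructions):

```dockerfile
FROM rocker/r-base:latest

# Layer 1: the development files are stored in this layer ...
RUN apt-get update && apt-get install -y -qq --no-install-recommends \
    libxml2-dev libssl-dev && \
    Rscript -e "install.packages(c('AzureStor'))"

# Layer 2: ... so purging them here hides the files, but the image
# still ships layer 1 and does not get smaller.
RUN apt-get purge --yes libxml2-dev libssl-dev && \
    rm -rf /var/lib/apt/lists/*
```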

Finally, you could use a multistage build with three stages:

  1. Add the run-time dependencies.
  2. Add the build-time dependencies and install packages into /usr/local/lib/R/site-library.
  3. Use 1. as base and add the packages from 2.

So something like this:

# Base image
FROM rocker/r-base:latest AS stage1

## install binary, build and dependent packages
RUN apt-get update && apt-get install -y -qq --no-install-recommends --purge \
r-cran-pdftools \
r-cran-dplyr \
r-cran-stringr \
libxml2 \
libssl1.1 && \
apt-get clean && \
rm -rf /var/lib/apt/lists/* && \
rm -rf /tmp/*

FROM stage1 AS stage2
RUN apt-get update && apt-get install -y -qq --no-install-recommends --purge \
libxml2-dev \
libssl-dev && \
Rscript -e "install.packages(c('AzureStor'))"


FROM stage1

COPY --from=stage2 /usr/local/lib/R/site-library /usr/local/lib/R/site-library

## create directories
RUN mkdir -p /script

#Copy scripts
COPY /script /script

## Set workdir
WORKDIR /script
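
The final stage starts from stage1 rather than from a fresh rocker/r-base because, as far as I know, on Debian-based images the r-cran-* packages from apt are installed into /usr/lib/R/site-library, whereas install.packages() writes to /usr/local/lib/R/site-library by default. Copying only the latter directory into a fresh base image is what produces a 'packageless' result. To see what lives where, a throwaway stage such as the following could be placed just before the final FROM stage1 (the stage name inspect is made up for illustration):

```dockerfile
# Hypothetical diagnostic stage, not part of the final image.
# Build it on its own with:
#   docker build --target inspect --progress=plain .
FROM stage2 AS inspect

# Print R's library search path, then list the contents of both site libraries.
RUN Rscript -e ".libPaths()" && \
    ls /usr/lib/R/site-library /usr/local/lib/R/site-library
```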

I have personally used the first and second approaches. I have not tested the third approach, but I expect it to work as well.
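
Whichever approach is used, one way to check the result is to load the packages during the build, so that the build fails if one of them is missing from the final image or cannot find a system library it needs. A minimal sketch, to be appended to the final stage (the package list is taken from the Dockerfiles above):

```dockerfile
# Fails the build if any of these packages cannot be loaded:
# library() raises an error and Rscript then exits with a non-zero status.
RUN Rscript -e "library(AzureStor); library(pdftools); library(dplyr); library(stringr)"
```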