How do I install PyTorch in a Docker container without blowing up memory?


I am trying to create my own LLM setup by indexing files with OpenSearch and Llama 2, following this project.

But I have a problem. I get the impression that it's asking for TensorFlow, PyTorch and Flax at the same time, and I'm afraid that installing them will blow up the size of the Docker image I'm trying to create.

Here is what I get when I launch the OpenSearch/Gradio app:

(venv) (base) remplacement@remplacements-MacBook-Pro document-qa-webui % docker run -p 7860:7860 -e HUGGINGFACEHUB_API_TOKEN='mytoken' -e OPENSEARCH_URL='https://admin:admin@localhost:9200'  document-qa-webui
/usr/local/lib/python3.9/site-packages/langchain/__init__.py:40: UserWarning: Importing PromptTemplate from langchain root module is no longer supported.
  warnings.warn(
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
/usr/local/lib/python3.9/site-packages/huggingface_hub/utils/_deprecation.py:127: FutureWarning: '__init__' (from 'huggingface_hub.inference_api') is deprecated and will be removed from version '0.19.0'. `InferenceApi` client is deprecated in favor of the more feature-complete `InferenceClient`. Check out this guide to learn how to convert your script to use it: https://huggingface.co/docs/huggingface_hub/guides/inference#legacy-inferenceapi-client.
  warnings.warn(warning_message, FutureWarning)
You're using a different task than the one specified in the repository. Be sure to know what you're doing :)
/usr/local/lib/python3.9/site-packages/huggingface_hub/file_download.py:979: UserWarning: Not enough free disk space to download the file. The expected file size is: 0.00 MB. The target location /root/.cache/huggingface/hub only has 0.00 MB free disk space.
  warnings.warn(
/usr/local/lib/python3.9/site-packages/huggingface_hub/file_download.py:979: UserWarning: Not enough free disk space to download the file. The expected file size is: 0.00 MB. The target location /root/.cache/huggingface/hub/models--OpenAssistant--oasst-sft-4-pythia-12b-epoch-3.5/blobs only has 0.00 MB free disk space.
  warnings.warn(
Downloading (…)okenizer_config.json: 100%|██████████| 444/444 [00:00<00:00, 19.0kB/s]
/usr/local/lib/python3.9/site-packages/huggingface_hub/file_download.py:979: UserWarning: Not enough free disk space to download the file. The expected file size is: 2.11 MB. The target location /root/.cache/huggingface/hub only has 0.00 MB free disk space.
  warnings.warn(
/usr/local/lib/python3.9/site-packages/huggingface_hub/file_download.py:979: UserWarning: Not enough free disk space to download the file. The expected file size is: 2.11 MB. The target location /root/.cache/huggingface/hub/models--OpenAssistant--oasst-sft-4-pythia-12b-epoch-3.5/blobs only has 0.00 MB free disk space.
  warnings.warn(
Downloading (…)/main/tokenizer.json: 100%|██████████| 2.11M/2.11M [00:01<00:00, 1.13MB/s]
Downloading (…)cial_tokens_map.json: 100%|██████████| 303/303 [00:00<00:00, 963kB/s]
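
The "Not enough free disk space" warnings in this log suggest that the container's filesystem is full, or at least reported as full. A minimal sketch for checking this from inside the container, using only the Python standard library; the helper name `free_megabytes` is hypothetical, and the cache path is simply the one copied from the warnings:

```python
import shutil
from pathlib import Path


def free_megabytes(path):
    """Return the free space, in MB, on the filesystem that holds `path`."""
    probe = Path(path).expanduser().resolve()
    # The cache directory may not exist yet, so walk up to the
    # nearest existing parent before asking the OS for disk usage.
    while not probe.exists():
        probe = probe.parent
    return shutil.disk_usage(probe).free / 1_000_000


if __name__ == "__main__":
    # Path taken from the warnings in the log (an assumption for other setups).
    cache_dir = "/root/.cache/huggingface/hub"
    print(f"{free_megabytes(cache_dir):.2f} MB free for {cache_dir}")
```

If this also reports close to zero, the problem is probably the disk space available to Docker rather than anything specific to the model download.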

The log also reports that none of PyTorch, TensorFlow >= 2.0, or Flax was found ("None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used."). They are indeed missing from the Dockerfile:

# Use an official Python base image
#FROM python:3.9-alpine
FROM python:3.9-slim

#RUN apk update && apk add python3-dev gcc libc-dev libffi-dev g++ make rust cargo

RUN pip3 install --upgrade pip
RUN pip3 install langchain
RUN pip3 install pypdf
RUN pip3 install unstructured
RUN pip3 install gradio
#RUN pip3 install unstructured[local-inference]
RUN pip3 install opensearch-py
RUN pip3 install transformers
RUN pip3 install pdf2image
RUN pip3 install tabulate

# Copy files
COPY gradio-opensearch.py ./
COPY backend.py ./
COPY file-loaded.txt ./

EXPOSE 7860

ARG HUGGINGFACEHUB_API_TOKEN='None'
ARG OPENSEARCH_URL='None'

ENV HUGGINGFACEHUB_API_TOKEN=$HUGGINGFACEHUB_API_TOKEN
ENV OPENSEARCH_URL=$OPENSEARCH_URL

CMD ["python", "gradio-opensearch.py"]

I tried adding the line RUN pip3 install tensorflow, but the build fails:

#0 4.240   Downloading pyasn1-0.5.0-py2.py3-none-any.whl (83 kB)
#0 4.258      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 83.9/83.9 kB 4.7 MB/s eta 0:00:00
#0 4.318 Collecting oauthlib>=3.0.0 (from requests-oauthlib>=0.7.0->google-auth-oauthlib<1.1,>=0.5->tensorboard<2.15,>=2.14->tensorflow-cpu-aws==2.14.0->tensorflow)
#0 4.340   Downloading oauthlib-3.2.2-py3-none-any.whl (151 kB)
#0 4.360      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 151.7/151.7 kB 8.8 MB/s eta 0:00:00
#0 4.425 Downloading tensorflow-2.14.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (2.1 kB)
#0 4.458 Downloading tensorflow_cpu_aws-2.14.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (262.8 MB)
#0 29.63 ERROR: Could not install packages due to an OSError: [Errno 28] No space left on device
#0 29.63 
#0 29.64    ━━━━━━━━━━━━━━━                          101.3/262.8 MB 3.3 MB/s eta 0:00:49
------
Dockerfile:17
--------------------
  15 |     RUN pip3 install pdf2image
  16 |     RUN pip3 install tabulate
  17 | >>> RUN pip3 install tensorflow
  18 |     
  19 |     
--------------------
ERROR: failed to solve: process "/bin/sh -c pip3 install tensorflow" did not complete successfully: exit code: 1

So I tried PyTorch instead, but from what I read in the warning it seemed that I need all of the TensorFlow, PyTorch and Flax libraries:

(venv) (base) remplacement@remplacements-MacBook-Pro document-qa-webui % gradio gradio-opensearch.py

/Users/remplacement/Documents/Work/document-qa-webui/venv/lib/python3.8/site-packages/urllib3/__init__.py:34: NotOpenSSLWarning: urllib3 v2.0 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: https://github.com/urllib3/urllib3/issues/3020
  warnings.warn(

Warning: Cannot statically find a gradio demo called demo. Reload work may fail.
    ...
  warnings.warn(
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
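
The warning is worded "None of ... have been found", which reads as any one of the three backends being enough rather than all three being required. A minimal sketch for checking which of them can be imported in a given environment; the helper `available_backends` is hypothetical, and it only approximates the check (it tests whether the module can be located, not its version), so it is not the detection logic that transformers itself uses:

```python
import importlib.util


def available_backends(candidates=("torch", "tensorflow", "flax")):
    """Return the names in `candidates` that can be located for import."""
    # find_spec returns None when a top-level module is not installed,
    # so this does not actually import the (large) packages.
    return [
        name
        for name in candidates
        if importlib.util.find_spec(name) is not None
    ]


if __name__ == "__main__":
    found = available_backends()
    if found:
        print("Backends found:", ", ".join(found))
    else:
        print("No backend found: torch, tensorflow and flax are all missing")
```

Running it inside the container (for example as the `CMD` of a throwaway build) shows whether installing a single backend would be enough to silence the warning.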

But the pip3 install torch step of the build has already been running for more than 1700 seconds:

(venv) (base) remplacement@remplacements-MacBook-Pro document-qa-webui % docker build -t document-qa-webui .
[+] Building 1736.3s (6/21)                                                                                                                                                 
 => => transferring dockerfile: 878B                                                                                                                                   0.0s
 => [internal] load metadata for docker.io/library/python:3.9-slim                                                                                                     4.1s
 => [auth] library/python:pull token for registry-1.docker.io                                                                                                          0.0s
 => [internal] load build context                                                                                                                                      0.0s
 => => transferring context: 104B                                                                                                                                      0.0s
 => CACHED [ 1/16] FROM docker.io/library/python:3.9-slim@sha256:d99e43ea163609b2af59d8ce07771dbb12c4b0d77b2c3c836261128ab0ac7394                                      0.0s
 => => resolve docker.io/library/python:3.9-slim@sha256:d99e43ea163609b2af59d8ce07771dbb12c4b0d77b2c3c836261128ab0ac7394                                               0.0s
 => [ 2/16] RUN pip3 install torch                                                                                                                                  1732.1s