Running tika-python in a Docker container offline


I have a web app that uses tika-python. It works fine: each time I start it, it downloads two files, "tika-server.jar" and "tika-server.jar.md5", to local storage and then parses files. But sometimes it is unable to download those files, and then the service doesn't work at all.

I have downloaded both files to ./temp and want to use those copies instead of downloading them again on every start, which takes a lot of time and sometimes fails.

I have tried Docker Compose, but that is also not working. My Dockerfile so far:

FROM python:3.8-slim
WORKDIR /app
COPY ./templates /app/templates
COPY ./temp /app/temp
COPY ./app.py /app/app.py
COPY ./requirements.txt /app/requirements.txt
RUN pip install --no-deps --no-cache-dir -r requirements.txt && \
    apt-get update && \
    DEBIAN_FRONTEND=noninteractive \
    apt-get -y install default-jre-headless && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
#ENV
ENV TIKA_SERVER_JAR = ./temp/tika-server.jar
ENV TIKA_PATH = ./temp
#PORT
EXPOSE 5000

# configure the container to run in an executed manner
ENTRYPOINT [ "python", "app.py" ]

My Python app.py script:

import os
from tika import parser
os.environ['TIKA_SERVER_JAR'] = './temp/tika-server.jar'
os.environ['TIKA_PATH'] = './temp'
text = parser.from_file(file, service='text')['content']

Everything works when I let the app download the files, but when I try to use the local files, nothing works. I have tried different combinations of environment variables. I am new to Docker and Linux commands.

Any help will be appreciated.

Environment variables inside the container: {'GPG_KEY': 'E3FF2839C048B25C084DEBE9B2*************68', 'HOME': '/root', 'HOSTNAME': '2d43d*****', 'LANG': 'C.UTF-8', 'PATH': '/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin', 'PYTHON_GET_PIP_SHA256': '5aefe6ade911d997af080b315ebcb7f882212d070465df544e1175ac2be519b4', 'PYTHON_GET_PIP_URL': 'https://github.com/pypa/get-pip/raw/5eaac1050023df1f5c98b173b248c260023f2278/public/get-pip.py', 'PYTHON_PIP_VERSION': '22.0.4', 'PYTHON_SETUPTOOLS_VERSION': '57.5.0', 'PYTHON_VERSION': '3.8.13', 'TIKA_PATH': './my_project', 'TIKA_SERVER_JAR': './my_project/tika-server.jar'}
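
One direction that may be worth trying, offered as a sketch rather than a confirmed fix: as far as I understand tika-python, it reads TIKA_SERVER_JAR and TIKA_PATH when the tika module is first imported, and it expects TIKA_SERVER_JAR to be a URL (a file:// URL for a local jar) rather than a bare relative path. The helper names configure_local_tika and extract_text below are hypothetical, and the sketch assumes that tika-server.jar and tika-server.jar.md5 both sit in the directory passed in.

```python
import os
from pathlib import Path


def configure_local_tika(jar_dir):
    """Hypothetical helper: point tika-python at an already-downloaded jar.

    Assumes jar_dir contains tika-server.jar (and tika-server.jar.md5).
    Returns the file:// URL that was written to TIKA_SERVER_JAR.
    """
    jar_dir = Path(jar_dir).resolve()            # absolute path, independent of the cwd
    jar_url = (jar_dir / "tika-server.jar").as_uri()  # e.g. file:///app/temp/tika-server.jar
    os.environ["TIKA_SERVER_JAR"] = jar_url
    os.environ["TIKA_PATH"] = str(jar_dir)
    return jar_url


def extract_text(path, jar_dir="./temp"):
    """Hypothetical helper: parse one file using the local Tika jar."""
    configure_local_tika(jar_dir)
    # Import only after the environment is set; the assumption here is that
    # tika reads these variables at import time, so a top-level import that
    # runs before the os.environ assignments would not see them.
    from tika import parser
    return parser.from_file(path, service="text")["content"]
```

Calling extract_text("some_document.pdf") inside the container would then be the equivalent of the parser.from_file line in app.py. If the variables are set in the Dockerfile instead, note that `ENV NAME = value` with spaces around `=` is parsed by Docker as the legacy `ENV NAME value` form, so the stored value begins with `= `; the form `ENV TIKA_PATH=/app/temp` avoids that.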
