Run localGPT via pipenv instead of conda


Goal

I would like to use pipenv instead of conda to run localGPT on an Ubuntu 22.04.3 machine.

Reason: pipenv is already installed on the server where I would like to deploy localGPT, but conda is not, and I lack the permissions to install it.

Approach

I translated the existing, up-to-date requirements.txt file:

# Natural Language Processing
langchain==0.0.267
chromadb==0.4.6
pdfminer.six==20221105
InstructorEmbedding
sentence-transformers
faiss-cpu
huggingface_hub
transformers
protobuf==3.20.2; sys_platform != 'darwin'
protobuf==3.20.2; sys_platform == 'darwin' and platform_machine != 'arm64'
protobuf==3.20.3; sys_platform == 'darwin' and platform_machine == 'arm64'
auto-gptq==0.2.2
docx2txt
unstructured
unstructured[pdf]

# Utilities
urllib3==1.26.6
accelerate
bitsandbytes ; sys_platform != 'win32'
bitsandbytes-windows ; sys_platform == 'win32'
click
flask
requests

# Streamlit related
streamlit
Streamlit-extras

# Excel File Manipulation
openpyxl

into a Pipfile (located in the project root), which looks like this:

[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[packages]
langchain = "==0.0.267"
chromadb = "==0.4.6"
pdfminer.six = "==20221105"
InstructorEmbedding = "*"
sentence-transformers = "*"
faiss-cpu = "*"
huggingface_hub = "*"
transformers = "*"
protobuf = "==3.20.3"
auto-gptq = "==0.2.2"
docx2txt = "*"
unstructured = {extras = ["pdf"], version = "*"}
urllib3 = "==1.26.6"
accelerate = "*"
bitsandbytes = "*"
click = "*"
flask = "*"
requests = "*"
streamlit = "*"
Streamlit-extras = "*"
openpyxl = "*"
jmespath = "==1.0.1"
llama-cpp-python = "==0.2.11"

[requires]
python_version = "3.10"

Note that I added jmespath and llama-cpp-python because, when I set this up the conda way, I had to install these two packages additionally via pip.
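
To make the translation reproducible, here is a minimal sketch of how each requirements.txt line could be mapped to a Pipfile [packages] entry. The helper name to_pipfile_entry is hypothetical, and the parsing is deliberately simple: it only covers the line shapes that appear in the requirements.txt above. Two things it makes visible, offered as things to check rather than a confirmed diagnosis: a package name containing a dot (pdfminer.six) is emitted as a quoted key, because an unquoted dotted key in TOML denotes a nested table; and environment markers are carried over, so on Linux the first protobuf line (3.20.2) is the one that applies, whereas the Pipfile above pins 3.20.3. Since a Pipfile can list each package name only once, the three protobuf lines cannot all be kept as separate entries.

```python
import re

# Matches a package name, optional "[extra1,extra2]", and an optional version specifier.
_REQ = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*(?:\[([^\]]*)\])?\s*(.*)$")


def to_pipfile_entry(line):
    """Convert one requirements.txt line to a Pipfile [packages] line.

    Returns None for blank lines, comments, and lines that cannot be parsed.
    Hypothetical helper: handles only simple "name[extras]spec ; marker" lines.
    """
    line = line.split("#", 1)[0].strip()  # drop comments
    if not line:
        return None
    spec, _, marker = line.partition(";")
    match = _REQ.match(spec)
    if match is None:
        return None
    name, extras, version = match.groups()
    version = version.strip() or "*"
    marker = marker.strip().replace('"', '\\"')
    # Quote names that contain a dot so TOML does not read them as nested keys.
    key = f'"{name}"' if "." in name else name
    if not extras and not marker:
        return f'{key} = "{version}"'
    fields = [f'version = "{version}"']
    if extras:
        items = ", ".join(f'"{e.strip()}"' for e in extras.split(","))
        fields.append(f"extras = [{items}]")
    if marker:
        fields.append(f'markers = "{marker}"')
    return key + " = {" + ", ".join(fields) + "}"


if __name__ == "__main__":
    sample = [
        "pdfminer.six==20221105",
        "protobuf==3.20.2; sys_platform != 'darwin'",
        "unstructured[pdf]",
    ]
    for requirement in sample:
        print(to_pipfile_entry(requirement))
```

Alternatively, pipenv install -r requirements.txt imports an existing requirements file directly, which avoids the manual translation.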

Problem

So, theoretically, running

pipenv install
pipenv shell
python ingest.py

should run the ingestion, but instead I get an error:

python ingest.py

2023-10-17 11:16:53,102 - INFO - ingest.py:121 - Loading documents from /home/*********/Documents/my-chatbot/SOURCE_DOCUMENTS
2023-10-17 11:16:53,131 - INFO - ingest.py:34 - Loading document batch
concurrent.futures.process._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/concurrent/futures/process.py", line 246, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/home/*******/Documents/mylocal-chatbot/ingest.py", line 40, in load_document_batch
    data_list = [future.result() for future in futures]
  File "/home/*******/Documents/mylocal-chatbot/ingest.py", line 40, in <listcomp>
    data_list = [future.result() for future in futures]
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/*******/Documents/mylocal-chatbot/ingest.py", line 30, in load_single_document
    return loader.load()[0]
  File "/home/*******/.local/share/virtualenvs/mylocal-chatbot-j8q8_E0e/lib/python3.10/site-packages/langchain/document_loaders/unstructured.py", line 86, in load
    elements = self._get_elements()
  File "/home/*******/.local/share/virtualenvs/mylocal-chatbot-j8q8_E0e/lib/python3.10/site-packages/langchain/document_loaders/unstructured.py", line 169, in _get_elements
    from unstructured.partition.auto import partition
  File "/home/*******/.local/share/virtualenvs/mylocal-chatbot-j8q8_E0e/lib/python3.10/site-packages/unstructured/partition/auto.py", line 80, in <module>
    from unstructured.partition.pdf import partition_pdf
  File "/home/*******/.local/share/virtualenvs/mylocal-chatbot-j8q8_E0e/lib/python3.10/site-packages/unstructured/partition/pdf.py", line 12, in <module>
    from pdfminer.converter import PDFPageAggregator, PDFResourceManager
ImportError: cannot import name 'PDFResourceManager' from 'pdfminer.converter' (/home/*******/.local/share/virtualenvs/mylocal-chatbot-j8q8_E0e/lib/python3.10/site-packages/pdfminer/converter.py)
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/*******/Documents/mylocal-chatbot/ingest.py", line 159, in <module>
    main()
  File "/home/*******/.local/share/virtualenvs/mylocal-chatbot-j8q8_E0e/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/*******/.local/share/virtualenvs/mylocal-chatbot-j8q8_E0e/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/*******/.local/share/virtualenvs/mylocal-chatbot-j8q8_E0e/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/*******/.local/share/virtualenvs/mylocal-chatbot-j8q8_E0e/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/*******/Documents/mylocal-chatbot/ingest.py", line 122, in main
    documents = load_documents(SOURCE_DIRECTORY)
  File "/home/*******/Documents/mylocal-chatbot/ingest.py", line 71, in load_documents
    contents, _ = future.result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
ImportError: cannot import name 'PDFResourceManager' from 'pdfminer.converter' (/home/*******/.local/share/virtualenvs/mylocal-chatbot-j8q8_E0e/lib/python3.10/site-packages/pdfminer/converter.py)

Manually (re)installing the related packages via pipenv install pdfminer, pipenv install pdfminer.six or pipenv install unstructured did not help.
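
For anyone trying to reproduce this, a small diagnostic sketch may help narrow it down. The distributions pdfminer and pdfminer.six both install a top-level package named pdfminer, so if both ended up in the virtualenv, one may have overwritten files of the other. The script below only reports what is installed and whether pdfminer.converter exposes the name from the traceback; the helper names installed_version and module_exports are hypothetical, and the script does not claim to identify the root cause.

```python
import importlib
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def module_exports(module_name, attr):
    """Return True/False if the module imports, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attr)


if __name__ == "__main__":
    # Which of the possibly conflicting distributions are present?
    for dist in ("pdfminer", "pdfminer.six", "unstructured"):
        print(dist, "->", installed_version(dist))
    # Does the installed pdfminer.converter provide the name that fails to import?
    print(
        "pdfminer.converter.PDFResourceManager available:",
        module_exports("pdfminer.converter", "PDFResourceManager"),
    )
```

Run it inside the virtualenv (for example with pipenv run python followed by the script name) so that it inspects the same site-packages directory that appears in the traceback.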

Any ideas how to get rid of this import error?
