Connection error when using langchain_community.vectorstores.faiss.FAISS

I found an approach that uses RAG (retrieval-augmented generation) with a Large Language Model (LLM) to answer questions about a provided transcription. Here is the GitHub link: https://github.com/ingridstevens/whisper-audio-transcriber

I tried to adapt it to my use case, but I ran into trouble when building the FAISS vector store (Meta's FAISS library, as wrapped by the LangChain module): the call fails with a connection error.

Here is my code (I'm working on Google Colab):

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.embeddings import OllamaEmbeddings
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain.chains import LLMChain
from langchain.llms import Ollama

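# NOTE: `model` is created earlier in the notebook (not shown here);
# the transcribe() arguments suggest a faster-whisper WhisperModel (assumption).
audio = "./gdrive/MyDrive/test_long.m4a"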
segments, info = model.transcribe(
    audio,
    beam_size=8,
    vad_filter=True,
    vad_parameters=dict(min_silence_duration_ms=100),
)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

text = []
for segment in segments:
  text.append(segment.text)
  print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))

# Concatenate all segment texts into a single transcription string
transcription = "".join(text)

splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50,
)

texts = splitter.split_text(transcription)

# Define the embeddings
embeddings = OllamaEmbeddings()
# Create the vector store using the texts and embeddings and put it in a vector database
docsearch = FAISS.from_texts(
    texts,
    embeddings,
    metadatas=[{"file": audio, "source": str(i)} for i in range(len(texts))],
)

And here is the error that occurs:

---------------------------------------------------------------------------
ConnectionRefusedError                    Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/urllib3/connection.py in _new_conn(self)
    202         try:
--> 203             sock = connection.create_connection(
    204                 (self._dns_host, self.port),

24 frames
ConnectionRefusedError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

NewConnectionError                        Traceback (most recent call last)
NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7b9d21e23970>: Failed to establish a new connection: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

MaxRetryError                             Traceback (most recent call last)
MaxRetryError: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/embeddings (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7b9d21e23970>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

ConnectionError                           Traceback (most recent call last)
ConnectionError: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/embeddings (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7b9d21e23970>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/langchain_community/embeddings/ollama.py in _process_emb_response(self, input)
    161             )
    162         except requests.exceptions.RequestException as e:
--> 163             raise ValueError(f"Error raised by inference endpoint: {e}")
    164 
    165         if res.status_code != 200:

ValueError: Error raised by inference endpoint: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/embeddings (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7b9d21e23970>: Failed to establish a new connection: [Errno 111] Connection refused'))
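
Note that the failing request in the traceback goes to host='localhost', port=11434, URL /api/embeddings, and the last frame is in langchain_community/embeddings/ollama.py. So the refused connection appears to come from the Ollama embeddings endpoint that OllamaEmbeddings calls, not from FAISS itself. A minimal sketch for checking whether anything is listening on that port from inside the Colab runtime, using only the Python standard library (the helper name is_port_open is hypothetical, not part of LangChain or Ollama):

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening on the target port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 11434 is the port shown in the traceback above.
    print("Ollama port reachable:", is_port_open("localhost", 11434))
```

If this prints False, no server is accepting connections on localhost:11434 in the Colab VM, which would be consistent with the [Errno 111] in the traceback. In that case the likely options are to start an Ollama server inside the same runtime or to point OllamaEmbeddings at a reachable server through its base_url argument; which one applies depends on the setup.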

I have also tried without the "metadatas" parameter, but nothing changed:

docsearch = FAISS.from_texts(texts, embeddings)

Does anyone know where the issue comes from and how to fix it?
