I am trying to build a LangChain application. I have set up OpenAI and embedded some data loaded from URLs into a FAISS object. However, I cannot pickle the object: I get an error saying that it contains a '_thread.RLock'. From what I have found, the problem comes from FAISS.from_documents(), which apparently has an indexing issue, but I have not been able to resolve it.

!pip install python-magic langchain unstructured streamlit openai tiktoken faiss-gpu

import os
import streamlit as st
import pickle
import time
from langchain import OpenAI
from langchain.chains import RetrievalQAWithSourcesChain
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import UnstructuredURLLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

os.environ['OPENAI_API_KEY'] = "<your-openai-api-key>"  # key redacted; never post a real key

llm = OpenAI(temperature = 0.9, max_tokens=500)

loader = UnstructuredURLLoader(
    urls = [
        # URLs omitted in the original post
    ]
)
data = loader.load()


text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 1000,  # size of each chunk created
    chunk_overlap = 200,  # overlap between chunks, to preserve context
)
docs = text_splitter.split_documents(data)


# Create the embeddings of the chunks using OpenAIEmbeddings
embeddings = OpenAIEmbeddings()

# Pass the documents and embeddings in order to create the FAISS vector index
vectorindex_openai = FAISS.from_documents(docs, embeddings)

# Store the created vector index locally
file_path = "vector_index.pkl"
with open(file_path, "wb") as f:
    pickle.dump(vectorindex_openai, f)

The error is:

TypeError                                 Traceback (most recent call last)
<ipython-input-74-15688820a1ef> in <cell line: 3>()
      2 file_path="vector_index.pkl"
      3 with open(file_path, "wb") as f:
----> 4     pickle.dump(vectorindex_openai, f)

TypeError: cannot pickle '_thread.RLock' object

I was trying to create a vector_index.pkl file.


I had the same issue. I solved it with the code below, which saves the index with FAISS's own save method instead of pickle.

vectorindex_openai = FAISS.from_documents(docs, embeddings)
vectorindex_openai.save_local("faiss_store")

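After running this code you get a folder named "faiss_store" containing two files, "index.faiss" and "index.pkl". To use the stored data later, load it as shown below (recent LangChain releases may additionally require allow_dangerous_deserialization=True, because index.pkl is itself a pickle file):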

FAISS.load_local("faiss_store", OpenAIEmbeddings())
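For context, the error itself can be reproduced without LangChain. The sketch below uses only the standard library, and `FakeStore` is a hypothetical stand-in, not a LangChain class. It assumes that something reachable from the vector store holds a threading lock (possibly a client kept by the embeddings object, though that is a guess), which is what pickle refuses to serialize. Pickling only the plain data works, which is roughly why saving the index and its metadata to disk separately avoids the problem.

```python
import pickle
import threading


class FakeStore:
    """Hypothetical stand-in for an object that holds a lock (not a LangChain class)."""

    def __init__(self, vectors):
        self.vectors = vectors          # plain data: picklable
        self._lock = threading.RLock()  # thread lock: not picklable


store = FakeStore([[0.1, 0.2], [0.3, 0.4]])

# Pickling the whole object fails because pickle walks every attribute,
# including the lock.
try:
    pickle.dumps(store)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Pickling only the plain data succeeds and round-trips unchanged.
restored = pickle.loads(pickle.dumps(store.vectors))
print(restored == store.vectors)
```

The same reasoning applies to any object graph: if one attribute anywhere inside it is a lock, socket, or similar runtime handle, pickle.dump on the outer object fails, so a library-provided save method is usually the safer route.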