Langchain and pinecone upserting documents using streamlit

143 Views Asked by user23052550 At 27 July 2025 at 18:47

st.header("MEDBOT")
st.write("---")
uploaded_files = st.file_uploader("Upload documents",accept_multiple_files=True, type=["txt","pdf"])
st.write("---")
if uploaded_files is None:
  st.info("""Upload files to analyse""")
elif uploaded_files:
  st.write(str(len(uploaded_files)) + " document(s) loaded..")
  textify_output = read_and_textify(uploaded_files)
  
  #sources = textify_output[1]
  #st.write(textify_output[0],textify_output[1])

def split_docs(uploaded_files,chunk_size=1000,chunk_overlap=0):
  text_splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
  docs = text_splitter.create_documents(str(uploaded_files))
  return docs

data = split_docs(uploaded_files)
embeddings = OpenAIEmbeddings(model_name="text-embedding-ada-002",openai_api_key="" )

if 'responses' not in st.session_state:
    st.session_state['responses'] = ["How can I assist you?"]

if 'requests' not in st.session_state:
    st.session_state['requests'] = []

if 'buffer_memory' not in st.session_state:
            st.session_state.buffer_memory=ConversationBufferWindowMemory(k=3,return_messages=True)

llm = ChatOpenAI(model_name="gpt-3.5-turbo-1106", openai_api_key="")
vectorstore = Pinecone.from_documents(documents=data, embedding=embeddings, index_name="chatbot")
retrieve=vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 6})

system_msg_template = SystemMessagePromptTemplate.from_template(template="""You are a virtual clinic coordinator for Advanced Surgeons.  Greet people politely who say hi or hello. When you know the answer, give the source document and page number you have taken the information from. Also give a link to the source document. DO NOT PERFORM AN INTERNET SEARCH. DO NOT ACCESS THE TRAINING DATA. If you do not know the answer to their question or have no information, do not guess but say this exactly: " I don't know the answer to the question. Click on the Call Us link to be connected to a coordinator " followed by a clickable link for tel:9142827802 with caption "Call us""")
human_msg_template = HumanMessagePromptTemplate.from_template(template="{input}")
prompt_template = ChatPromptTemplate.from_messages([system_msg_template, MessagesPlaceholder(variable_name="history"), human_msg_template])
conversation = ConversationalRetrievalChain.from_llm(llm,retriever=retrieve, chain_type="stuff",verbose=True)

responsecontainer = st.container()
textcontainer = st.container()
   
with textcontainer:
    query = st.text_input("Query: ", key="input")
    if query:
        with st.spinner("typing..."):
            conversation_string = get_conversation_string()
            st.code(conversation_string)
            refined_query = query_refiner(conversation_string, query)
            st.subheader("Refined Query:")
            st.write(refined_query)
            context = find_match(refined_query)
            print(context)  
            response = conversation.predict(input=f"Context:\n {context} \n\n Query:\n{query}")
        st.session_state.requests.append(query)
        st.session_state.responses.append(response) 
with responsecontainer:
    if st.session_state['responses']:

        for i in range(len(st.session_state['responses'])):
            message(st.session_state['responses'][i],key=str(i))
            if i < len(st.session_state['requests']):
                message(st.session_state["requests"][i], is_user=True,key=str(i)+ '_user')

i am trying to create a chatbot with an upload panel to directly upload data, which stores inside a pinecone database, and the chatbot answers questions only with the knowledge from the data. I am trying a lot of different things, and have some specific requirements, but there are multiple errors occurring.

This error occurs when i open the app using streamlit run main.py, and shows up where the chatbot needs to be on the webpage.

document_variable_name context was not found in llm_chain input_variables: ['history', 'input'] (type=value_error),

After i actually add the document, it starts to tell me that it needs to be in strings. Or it says that the uploaded_doc does not have attribute 'Page_content'.

Original Q&A

Langchain and pinecone upserting documents using streamlit

There are 0 best solutions below

Related Questions in OPENAI-API

Related Questions in STREAMLIT

Related Questions in LANGCHAIN

Related Questions in PINECONE

Trending Questions

Popular # Hahtags

Popular Questions