Problems With Python and Ollama

How are you doing?

I'm using Python 3.11.7 on a Mac M2, and I have this list of dependencies installed in a venv.

I'm having problems with Ollama. I've tested both locally and in Docker, and the error message isn't helping me figure out what the problem is.

The idea is to load an HTML page and then be able to query it, using that content as context.

This is the code:

from langchain_community.llms import Ollama
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import Chroma
from langchain_community import embeddings
from langchain_community.chat_models import ChatOllama
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain.output_parsers import PydanticOutputParser
from langchain.text_splitter import CharacterTextSplitter

model_local = Ollama(base_url="http://192.168.0.200:11434", model="mistral")

# 1. Split data into chunks
urls = [
    "https://es.wikipedia.org/wiki/The_A-Team",
]
docs = [WebBaseLoader(url).load() for url in urls]
docs_list = [item for sublist in docs for item in sublist]
text_splitter = CharacterTextSplitter.from_tiktoken_encoder(chunk_size=7500, chunk_overlap=100)
doc_splits = text_splitter.split_documents(docs_list)

# 2. Convert documents to Embeddings and store them
vectorstore = Chroma.from_documents(
    documents=doc_splits,
    collection_name="rag-chroma",
    embedding=embeddings.ollama.OllamaEmbeddings(model='nomic-embed-text'),
)
retriever = vectorstore.as_retriever()

# 3. Before RAG
print("Before RAG\n")
before_rag_template = "What is {topic}"
before_rag_prompt = ChatPromptTemplate.from_template(before_rag_template)
before_rag_chain = before_rag_prompt | model_local | StrOutputParser()
print(before_rag_chain.invoke({"topic": "Ollama"}))

# 4. After RAG
print("\n########\nAfter RAG\n")
after_rag_template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
after_rag_prompt = ChatPromptTemplate.from_template(after_rag_template)
after_rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | after_rag_prompt
    | model_local
    | StrOutputParser()
)
print(after_rag_chain.invoke("Quien integra Brigada A?"))

On the right I list the available models, and on the left I check whether Ollama responds.

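Roughly, the same check done from Python (same base_url and model as in the script above) would look like this; it should print a short completion if the Ollama server is reachable and the mistral model is loaded:

from langchain_community.llms import Ollama

# Quick connectivity check against the remote Ollama server used above.
llm_check = Ollama(base_url="http://192.168.0.200:11434", model="mistral")
print(llm_check.invoke("Say hello"))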

Error

cd /Users/santiago/Proyects/OllamaURL ; /usr/bin/env /Users/santiago/Proyects/OllamaURL/env/bin/python /Users/santiago/.vscode/extensions/ms-python.debugpy-2024.2.0-darwin-arm64/bundled/libs/debugpy/adapter/../../debugpy/launcher 54637 -- /Users/santiago/Proyects/OllamaURL/rag.py
Traceback (most recent call last):
  File "/Users/santiago/Proyects/OllamaURL/rag.py", line 25, in <module>
    vectorstore = Chroma.from_documents(
                  ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/santiago/Proyects/OllamaURL/env/lib/python3.11/site-packages/langchain_community/vectorstores/chroma.py", line 778, in from_documents
    return cls.from_texts(
           ^^^^^^^^^^^^^^^
  File "/Users/santiago/Proyects/OllamaURL/env/lib/python3.11/site-packages/langchain_community/vectorstores/chroma.py", line 736, in from_texts
    chroma_collection.add_texts(
  File "/Users/santiago/Proyects/OllamaURL/env/lib/python3.11/site-packages/langchain_community/vectorstores/chroma.py", line 275, in add_texts
    embeddings = self._embedding_function.embed_documents(texts)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/santiago/Proyects/OllamaURL/env/lib/python3.11/site-packages/langchain_community/embeddings/ollama.py", line 204, in embed_documents
    embeddings = self._embed(instruction_pairs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/santiago/Proyects/OllamaURL/env/lib/python3.11/site-packages/langchain_community/embeddings/ollama.py", line 192, in _embed
    return [self._process_emb_response(prompt) for prompt in iter_]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/santiago/Proyects/OllamaURL/env/lib/python3.11/site-packages/langchain_community/embeddings/ollama.py", line 192, in <listcomp>
    return [self._process_emb_response(prompt) for prompt in iter_]
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/santiago/Proyects/OllamaURL/env/lib/python3.11/site-packages/langchain_community/embeddings/ollama.py", line 166, in _process_emb_response
    raise ValueError(
ValueError: Error raised by inference API HTTP code: 404, {"error":"model 'nomic-embed-text' not found, try pulling it first"}

I'd appreciate any guidance on how to solve this error.

There are 2 best solutions below

Your Ollama LLM is created with base_url http://192.168.0.200:11434, but the default base_url for OllamaEmbeddings is http://localhost:11434, so the embedding call goes to a different Ollama instance, which is the one reporting that nomic-embed-text is not found. Set the base_url on the embeddings to http://192.168.0.200:11434 as well:

vectorstore = Chroma.from_documents(
    documents=doc_splits,
    collection_name="rag-chroma",
    embedding=embeddings.ollama.OllamaEmbeddings(
        base_url='http://192.168.0.200:11434',
        model='nomic-embed-text'
    ),
)
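To confirm the embedding endpoint is reachable before rebuilding the vector store, a quick sanity check along these lines (assuming nomic-embed-text has been pulled on that server) should print the embedding dimension:

from langchain_community.embeddings import OllamaEmbeddings

# Should print 768 (the nomic-embed-text embedding size) if the remote
# server is reachable and the model is available there.
emb = OllamaEmbeddings(base_url='http://192.168.0.200:11434', model='nomic-embed-text')
print(len(emb.embed_query("ping")))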

References

  1. OllamaEmbeddings (LangChain GitHub)

I tweaked your code and re-ran it successfully. Here's the code:

from langchain.text_splitter import CharacterTextSplitter
from langchain.schema.document import Document
text_splitter = CharacterTextSplitter(chunk_size=100, chunk_overlap=20)
text = "I am going to tell you a story about Tintin."
docs = [Document(page_content=x) for x in text_splitter.split_text(text)]


from langchain_community.vectorstores import Chroma
from langchain_community.llms import Ollama
from langchain_community import embeddings
persist_directory = "/tmp/chromadb"
vectorstore = Chroma.from_documents(
    documents=docs,
    collection_name="test",
    embedding=embeddings.ollama.OllamaEmbeddings(model='nomic-embed-text'),
    persist_directory=persist_directory,  # persist the collection to disk
)
retriever = vectorstore.as_retriever()


from langchain_community.llms import Ollama
llm = Ollama(model="mistral")


from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
print(rag_chain.invoke("tell me a story"))

Now let's look at the error message you provided:

ValueError: Error raised by inference API HTTP code: 404, {"error":"model 'nomic-embed-text' not found, try pulling it first"}

Are you sure that embedding model has actually been pulled by Ollama? I pulled it by running:

ollama pull nomic-embed-text
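Note that the model has to be available on whichever host your code actually calls: since your script points the LLM at http://192.168.0.200:11434, run the pull on that machine (or inside that Docker container), not just on your Mac. As a quick sketch, assuming the requests package is installed, you can list the models that server actually has:

import requests

# After the pull, 'nomic-embed-text' should show up in this list.
resp = requests.get("http://192.168.0.200:11434/api/tags", timeout=10)
resp.raise_for_status()
print([m["name"] for m in resp.json().get("models", [])])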

Ollama embeddings reference: https://python.langchain.com/docs/integrations/text_embedding/ollama