I am trying to run GPT4All's embedding model on my M1 MacBook with the following code:
import json
import numpy as np
from gpt4all import GPT4All, Embed4All
# Load the cleaned JSON data
with open('coursesclean.json') as file:
    data = json.load(file)
# Create an index and embeddings array
index = {}
embeddings = []
# Iterate over each course
for course_code, course_info in data.items():
    course_name = course_info['course_name']
    course_desc = course_info['course_desc']
    text = f"{course_name} {course_desc}"
    embedder = Embed4All()
    embeddings_response = embedder.embed(text)
    course_embeddings = np.array(embeddings_response)
    # Store the embeddings in the array
    embeddings.append(course_embeddings)
    # Index the embeddings
    index[course_code] = len(embeddings) - 1
# Convert the embeddings array to a NumPy array
embeddings_array = np.stack(embeddings)
# Save the index and embeddings array to NumPy files
np.save('embeddings.npy', embeddings_array)
with open('index.json', 'w') as file:
    json.dump(index, file, indent=2)
However, I seem to be running into a Metal library conflict, and the script ends up in what looks like an infinite loop of "Found model file" messages:
Found model file at /Users/MY_USERNAME/.cache/gpt4all/ggml-all-MiniLM-L6-v2-f16.bin
objc[42338]: Class GGMLMetalClass is implemented in both /opt/homebrew/lib/python3.11/site-packages/gpt4all/llmodel_DO_NOT_MODIFY/build/libreplit-mainline-metal.dylib (0x11eb0c208) and /opt/homebrew/lib/python3.11/site-packages/gpt4all/llmodel_DO_NOT_MODIFY/build/libllamamodel-mainline-metal.dylib (0x11ef38208). One of the two will be used. Which one is undefined.
Found model file at /Users/MY_USERNAME//.cache/gpt4all/ggml-all-MiniLM-L6-v2-f16.bin
Found model file at /Users/MY_USERNAME//.cache/gpt4all/ggml-all-MiniLM-L6-v2-f16.bin
...
I've tried reinstalling GPT4All, clearing the model from the cache, and running it through LangChain, but I get the same issue.
The issue is likely unrelated to your code. Check https://github.com/nomic-ai/gpt4all/issues/1233
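Separately from the Metal warning, one thing that may be worth trying: the question's loop calls Embed4All() once per course, and as far as I can tell each construction goes through the model lookup again, which could account for the repeated "Found model file" lines. Below is a minimal sketch, not a confirmed fix, that takes the embedding function as a parameter so the embedder is created only once. The helper name build_index is hypothetical, and the gpt4all usage in the trailing comments is an assumption based on the calls already shown in the question.

```python
from typing import Callable, Dict, List, Tuple


def build_index(
    data: Dict[str, dict],
    embed: Callable[[str], List[float]],
) -> Tuple[Dict[str, int], List[List[float]]]:
    """Embed every course using one shared `embed` callable.

    Returns (index, vectors), where index[course_code] is the position
    of that course's vector in `vectors`.
    """
    index: Dict[str, int] = {}
    vectors: List[List[float]] = []
    for course_code, course_info in data.items():
        text = f"{course_info['course_name']} {course_info['course_desc']}"
        vectors.append(list(embed(text)))
        index[course_code] = len(vectors) - 1
    return index, vectors


# Assumed usage with gpt4all (not executed here; needs the package and model):
#   from gpt4all import Embed4All
#   embedder = Embed4All()  # constructed once, outside the loop
#   index, vectors = build_index(data, embedder.embed)
#   embeddings_array = np.stack([np.array(v) for v in vectors])
```

If the "Found model file" line then appears only once but the run still hangs, the linked issue is the more likely place to look.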