I am unable to use a reconstructed index as a numpy array. The reconstruction itself succeeds, but when I pass the result to a new index for training, the process crashes with a system error.
import faiss
import numpy as np

d = 768
ncentroids = 15
niter = 2

# Build a flat L2 index and fill it with random vectors
faiss_index = faiss.IndexFlatL2(d)
x0 = np.random.random((5000, d))
faiss_index.add(x0)

# Reconstruct all stored vectors back into a numpy array
x = faiss_index.reconstruct_n()

# Train k-means on the reconstructed vectors -- this is where it crashes
kmeans_index = faiss.Kmeans(d, ncentroids, niter=niter, verbose=True)
kmeans_index.train(x)
This code produces the following output and then crashes:
Sampling a subset of 3840 / 5000 for training
Clustering 3840 points in 768D to 15 clusters, redo 1 times, 2 iterations
Preprocessing in 0.01 s
Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)
I suspect the issue is connected with a difference in float sizes (float32 vs. float64) on macOS compared to other operating systems. Is there a workaround here, for example reserving memory for the numpy array in advance?
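What I had in mind is something like the sketch below: inspecting the reconstructed array and forcing it into a contiguous float32 buffer before training. This is only a guess on my side, based on the float size suspicion above; I don't know whether faiss actually requires it.

import faiss
import numpy as np

d = 768
faiss_index = faiss.IndexFlatL2(d)
faiss_index.add(np.random.random((5000, d)).astype('float32'))

# Reconstruct everything and inspect what comes back
x = faiss_index.reconstruct_n(0, faiss_index.ntotal)
print(x.dtype, x.shape, x.flags['C_CONTIGUOUS'])

# Force a contiguous float32 copy before handing the array to Kmeans,
# in case a float64 or non-contiguous buffer is what triggers the crash
x = np.ascontiguousarray(x, dtype='float32')

kmeans_index = faiss.Kmeans(d, 15, niter=2, verbose=True)
kmeans_index.train(x)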
Platform specs:
faiss-cpu == 1.8.0
Python 3.11.6 (v3.11.6:8b6ee5ba3b, Oct 2 2023, 11:18:21) [Clang 13.0.0 (clang-1300.0.29.30)] on darwin
Model Name: MacBook Pro
Model Identifier: Mac14,9
Model Number: MPHE3LL/A
Chip: Apple M2 Pro
Total Number of Cores: 10 (6 performance and 4 efficiency)
Memory: 16 GB
System Firmware Version: 10151.81.1
OS Loader Version: 10151.81.1
I found the mistake. In my original code I import and initialize another Python package, which creates a conflict with faiss.
In particular, adding the following code reproduces the error described in my question:
But it is still unclear why this error ever happens.