Segmentation fault while using FAISS `index.search()` in FastAPI


I am trying to search a FAISS index (saved on disk and read into memory) for the nearest neighbours of a given vector. The search happens inside a function that is called from a FastAPI endpoint.

The endpoint and the function it is calling are the following:

# Endpoint
from typing import List
import faulthandler
from faiss import read_index
from fastapi import FastAPI, Request
from pydantic import BaseModel

app = FastAPI()

@app.get('/neighbours', response_model=Neighbours)
async def get_neighbours(case: str, ids: str, request: Request):
    """Retrieve the neighbours for each of the given ids."""
    id_list = ids.split(';')
    return Neighbours(result=get_neighbours_from_id(case, [int(id) for id in id_list]))
# Functions and objects
class Neighbour(BaseModel):
    id: int
    neighbours: List[int]


class Neighbours(BaseModel):
    result: List[Neighbour]


def get_neighbours_from_id(case, ids):
    
    case_path = validate_case_path(case)

    index = read_index(case_path.as_posix())
    # The above works well, the index is loaded properly.

    vec = index.reconstruct_n(0, 1)
    # This also works, I can see the vector being reconstructed.

    faulthandler.enable()
    faulthandler.dump_traceback()
    # Enabling faulthandler for the traceback

    d, i = index.search(vec, 1)
    # This is where the code crashes.

The code crashes when I am trying to do a search on the loaded index given a reconstructed vector, with the following traceback:

Current thread 0x000000010781adc0 (most recent call first):
  File "/../api/app/vectors.py", line 37 in get_neighbours_from_id
  File ".../api/service.py", line 87 in get_neighbours
  File ".../pypoetry/virtualenvs/env1-py3.8/lib/python3.8/site-packages/fastapi/routing.py", line 160 in run_endpoint_function
  File ".../pypoetry/virtualenvs/env1-py3.8/lib/python3.8/site-packages/fastapi/routing.py", line 227 in app
  File ".../pypoetry/virtualenvs/env1-py3.8/lib/python3.8/site-packages/starlette/routing.py", line 61 in app
  File ".../pypoetry/virtualenvs/env1-py3.8/lib/python3.8/site-packages/starlette/routing.py", line 259 in handle
  File ".../pypoetry/virtualenvs/env1-py3.8/lib/python3.8/site-packages/starlette/routing.py", line 656 in __call__
  File ".../pypoetry/virtualenvs/env1-py3.8/lib/python3.8/site-packages/fastapi/middleware/asyncexitstack.py", line 18 in __call__
  File ".../pypoetry/virtualenvs/env1-py3.8/lib/python3.8/site-packages/starlette/exceptions.py", line 71 in __call__
  File ".../pypoetry/virtualenvs/env1-py3.8/lib/python3.8/site-packages/starlette/middleware/cors.py", line 84 in __call__
  File ".../pypoetry/virtualenvs/env1-py3.8/lib/python3.8/site-packages/starlette/middleware/base.py", line 34 in coro
  File ".../anaconda3/lib/python3.8/asyncio/events.py", line 81 in _run
  File ".../anaconda3/lib/python3.8/asyncio/base_events.py", line 1859 in _run_once
  File ".../anaconda3/lib/python3.8/asyncio/base_events.py", line 570 in run_forever
  File ".../anaconda3/lib/python3.8/asyncio/base_events.py", line 603 in run_until_complete
  File ".../anaconda3/lib/python3.8/asyncio/runners.py", line 43 in run
  File ".../run_service.py", line 19 in <module>
Fatal Python error: Segmentation fault

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)

The traceback makes me suspect that it is something related to asyncio or multithreading within FastAPI?
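One way to probe that hypothesis is to keep the blocking FAISS call off the event loop entirely by running it in a worker thread. Below is a minimal sketch using only the standard library, where `blocking_search` is a hypothetical stand-in for the `read_index` / `index.search` work (not the actual FAISS code):

```python
import asyncio

def blocking_search(query):
    # Hypothetical stand-in for the CPU-bound FAISS work
    # (read_index + index.search); returns (distances, ids).
    return [0.0], [0]

async def get_neighbours_offloaded(query):
    # Run the blocking call in the default thread pool instead of
    # directly on the event loop thread.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_search, query)

distances, ids = asyncio.run(get_neighbours_offloaded("some-id"))
print(distances, ids)
```

An equivalent effect can be had in FastAPI by declaring the endpoint with plain `def` instead of `async def`, since FastAPI then dispatches it to a threadpool itself.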

It is important to note that the exact same code (index loading, vector reconstruction, and search) works without any problems in the following minimal example, run in the same environment:

import faiss

index = faiss.read_index("my_index.index")

vec = index.reconstruct_n(0, 1)

d, i = index.search(vec, 1)

print(i)
# Works properly.

I have looked at this issue, but it seems to be fixed in newer versions of FAISS, which is what I am using.

Thank you very much for the help!

1 Answer

Segmentation faults come from native code (here, the FAISS C++ library) accessing memory it shouldn't, and in my case the crash turned out to be platform-specific. I'm using a MacBook Pro, so it was a Mac issue for me. Either of the following two options made it work:

  • Switch from Mac to AWS EC2 / Linux
  • If you don't want to switch to EC2 / Linux, build an Ubuntu Docker image and run the service inside it