I'm using the Ollama Python library (https://github.com/ollama/ollama-python) to build a sample application that describes images captured from a webcam.
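import time

import cv2
import ollama

# Grab a single frame from the default webcam, then free the device
capture = cv2.VideoCapture(0)
success, frame = capture.read()
capture.release()
if not success:
    raise RuntimeError("Could not read a frame from the webcam")

# Encode the frame as JPEG bytes to pass to ollama
encoded, buffer = cv2.imencode('.jpg', frame)
if not encoded:
    raise RuntimeError("Could not encode the frame as JPEG")
image_bytes = buffer.tobytes()

# Time a single non-streaming request to the llava model
start = time.time()
response = ollama.generate(
    model="llava",
    prompt="Describe the image",
    images=[image_bytes],
    stream=False,
)
end = time.time()

print(response['response'])
print(end - start)  # elapsed wall-clock time in seconds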
Each frame takes up to 3 minutes to process on a laptop with the following specs:
INVESP00246, 13th Gen Intel(R) Core(TM) i7-1365U 1.80 GHz, 32.0 GB RAM, 64-bit, x64
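To see where those minutes go, it may help to split the total into model loading, prompt processing, and generation. The following is a minimal sketch, assuming the response carries the nanosecond timing fields described in the Ollama API documentation (`total_duration`, `load_duration`, `prompt_eval_duration`, `eval_duration`, `eval_count`) and supports dict-style `.get`. The helper name `summarize_timings` is hypothetical and not part of the library.

```python
NS_PER_SECOND = 1_000_000_000


def summarize_timings(response):
    """Convert Ollama's nanosecond timing fields to seconds (hypothetical helper)."""

    def seconds(field):
        # Missing or None fields are treated as zero
        return (response.get(field) or 0) / NS_PER_SECOND

    eval_seconds = seconds("eval_duration")
    eval_count = response.get("eval_count") or 0
    return {
        "load_s": seconds("load_duration"),
        "prompt_eval_s": seconds("prompt_eval_duration"),
        "eval_s": eval_seconds,
        "total_s": seconds("total_duration"),
        # Generated tokens per second; 0.0 when no generation time was reported
        "tokens_per_s": eval_count / eval_seconds if eval_seconds else 0.0,
    }
```

Calling `summarize_timings(response)` after the `ollama.generate` call above would show whether most of the time is spent loading the model or generating the description, which are likely to call for different fixes.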
How can I speed up the process a little?
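One direction that might be worth measuring, though it is not verified to help in this setup, is shrinking the frame before sending it. This is a sketch under assumptions: `scaled_size` and `encode_downscaled` are hypothetical helper names, the 672-pixel limit and JPEG quality of 80 are arbitrary illustrative values, and the model may already resize images internally, in which case the gain could be small.

```python
def scaled_size(width, height, max_side=672):
    """Return (width, height) shrunk so the longest side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # never upscale
    new_w = max(1, round(width * max_side / longest))
    new_h = max(1, round(height * max_side / longest))
    return new_w, new_h


def encode_downscaled(frame, max_side=672, jpeg_quality=80):
    """Downscale an OpenCV frame and return it as JPEG bytes."""
    import cv2  # imported here so scaled_size can be used without OpenCV

    height, width = frame.shape[:2]
    new_size = scaled_size(width, height, max_side)
    if new_size != (width, height):
        # INTER_AREA is the interpolation OpenCV recommends for shrinking
        frame = cv2.resize(frame, new_size, interpolation=cv2.INTER_AREA)
    ok, buffer = cv2.imencode(".jpg", frame, [cv2.IMWRITE_JPEG_QUALITY, jpeg_quality])
    if not ok:
        raise RuntimeError("Could not encode the frame as JPEG")
    return buffer.tobytes()
```

The result of `encode_downscaled(frame)` would be passed in the `images` list in place of the full-size JPEG bytes, and the elapsed time compared against the original run.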