I'm a newbie in the DL/real-time object detection area and I'm trying to learn some stuff from YouTube. I watched a video (https://www.youtube.com/watch?v=DLngCtsG3bk) about real-time custom object detection with YOLOv3 and followed all the steps correctly. When I run my object-detection Python file on my laptop it runs on the CPU and I only get 2-3 FPS. Can anyone tell me how I can use my GPU to run it instead? The files I have are yolov3_training_last.weights, yolov3_testing.cfg and classes.txt. The Python file is included below, and as I said, I get low FPS when I run it. If there is any way to increase it by using the GPU, please teach me. Thank you in advance.
Python File:
import cv2
import numpy as np
import time

net = cv2.dnn.readNet('yolov3_training_last.weights', 'yolov3_testing.cfg')

classes = []
with open("classes.txt", "r") as f:
    classes = f.read().splitlines()

cap = cv2.VideoCapture(0)
font = cv2.FONT_HERSHEY_PLAIN
colors = np.random.uniform(0, 255, size=(100, 3))

# used to record the time when we processed the last frame
prev_frame_time = 0
# used to record the time at which we processed the current frame
new_frame_time = 0

while True:
    _, img = cap.read()
    height, width, _ = img.shape

    # preprocess the frame and run a forward pass through the network
    blob = cv2.dnn.blobFromImage(img, 1/255, (416, 416), (0, 0, 0), swapRB=True, crop=False)
    net.setInput(blob)
    output_layers_names = net.getUnconnectedOutLayersNames()
    layerOutputs = net.forward(output_layers_names)

    boxes = []
    confidences = []
    class_ids = []

    # collect boxes with a confidence above 0.95
    for output in layerOutputs:
        for detection in output:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            if confidence > 0.95:
                center_x = int(detection[0]*width)
                center_y = int(detection[1]*height)
                w = int(detection[2]*width)
                h = int(detection[3]*height)
                x = int(center_x - w/2)
                y = int(center_y - h/2)
                boxes.append([x, y, w, h])
                confidences.append(float(confidence))
                class_ids.append(class_id)

    # non-maximum suppression to remove overlapping boxes
    indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.2, 0.4)

    if len(indexes) > 0:
        for i in indexes.flatten():
            x, y, w, h = boxes[i]
            crop_img = img[y:y+h, x:x+w]
            # mean colour of the detected region (not used further below)
            (B, G, R) = [int(c) for c in cv2.mean(crop_img)[:3]]
            # blur the detected region in place
            roi_face = img[y:y+h, x:x+w]
            roi_face = cv2.blur(roi_face, (20, 20))
            img[y:y+h, x:x+w] = [0, 0, 0]
            img[y:y+h, x:x+w] = cv2.add(roi_face, img[y:y+h, x:x+w])
            # draw the box and label
            label = str(classes[class_ids[i]])
            confidence = str(round(confidences[i], 2))
            color = colors[i]
            cv2.rectangle(img, (x, y), (x+w, y+h), color, 2)
            cv2.putText(img, label + " " + confidence, (x, y+20), font, 1, (255, 255, 255), 2)

    font = cv2.FONT_HERSHEY_SIMPLEX
    # time when we finish processing this frame
    new_frame_time = time.time()

    # calculate the FPS: number of frames processed per second, estimated as
    # 1 / (time since the previous frame); this is only approximate
    fps = 1/(new_frame_time - prev_frame_time)
    prev_frame_time = new_frame_time

    # convert the FPS to an integer, then to a string so that putText can display it
    fps = str(int(fps))

    # put the FPS count on the frame
    cv2.putText(img, fps, (7, 70), font, 3, (100, 255, 0), 3, cv2.LINE_AA)

    cv2.imshow('Image', img)
    key = cv2.waitKey(1)
    if key == 27:
        break

cap.release()
cv2.destroyAllWindows()
I think you need to install a version of TensorFlow that supports the GPU, and then specify which device to use, either your CPU or your GPU. Your GPU also has to be able to run the program, so you need to install the related libraries. If you have an AMD GPU it will be hard, since as far as I know AMD GPUs don't support TensorFlow GPU; for Nvidia there are drivers and libraries for it. Your other option is PyTorch, which again needs the correct hardware and software. Another way is creating the model from scratch using Keras and PlaidML.
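
For example, here is a rough sketch of checking whether a GPU is visible and pinning work to it. This assumes TensorFlow 2.x and PyTorch installed as GPU (CUDA) builds on an Nvidia card; it is only an illustration of the device-selection step, not part of the question's script, so adjust it for your own setup:

import tensorflow as tf
import torch

# TensorFlow: list visible GPUs and run an op on the first one if present
gpus = tf.config.list_physical_devices('GPU')
print("TensorFlow sees GPUs:", gpus)
device_name = '/GPU:0' if gpus else '/CPU:0'
with tf.device(device_name):
    x = tf.random.uniform((1000, 1000))
    y = tf.matmul(x, x)          # executed on the chosen device
print("TF result placed on:", y.device)

# PyTorch: same idea with torch.cuda
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print("PyTorch device:", device)
a = torch.rand(1000, 1000, device=device)
b = a @ a                        # matrix multiply on the GPU if one is available
print("Torch result placed on:", b.device)

If both checks report only the CPU, the GPU build of the framework or the Nvidia drivers and libraries mentioned above are probably missing.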