I have this code for real-time action detection using MediaPipe and a deep learning model, and I want to build a Django video streaming web app that runs the detection on the frames. I tried putting the whole pipeline in a class (together with the definitions of each helper function it calls) in a separate Python script (camera.py), then imported that class into my views.py, but no frames show up when I run the server. This is the code I would like to integrate into the Django app:
import cv2
import numpy as np
import mediapipe as mp

mp_holistic = mp.solutions.holistic

sequence = []
sentence = []
predictions = []
threshold = 0.5

cap = cv2.VideoCapture(0)
# Set mediapipe model
with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
    while cap.isOpened():

        # Read feed
        ret, frame = cap.read()
        if not ret:
            break

        # Make detections
        image, results = mediapipe_detection(frame, holistic)
        print(results)

        # Draw landmarks
        draw_styled_landmarks(image, results)

        # 2. Prediction logic
        keypoints = extract_keypoints(results)
        sequence.append(keypoints)
        sequence = sequence[-30:]

        if len(sequence) == 30:
            res = model.predict(np.expand_dims(sequence, axis=0))[0]
            print(actions[np.argmax(res)])
            predictions.append(np.argmax(res))

            # 3. Viz logic
            if np.unique(predictions[-10:])[0] == np.argmax(res):
                if res[np.argmax(res)] > threshold:
                    if len(sentence) > 0:
                        if actions[np.argmax(res)] != sentence[-1]:
                            sentence.append(actions[np.argmax(res)])
                    else:
                        sentence.append(actions[np.argmax(res)])

            if len(sentence) > 5:
                sentence = sentence[-5:]

            # Viz probabilities
            image = prob_viz(res, actions, image, colors)

        cv2.rectangle(image, (0, 0), (640, 40), (245, 117, 16), -1)
        cv2.putText(image, ' '.join(sentence), (3, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)

        # Show to screen
        cv2.imshow('OpenCV Feed', image)

        # Break gracefully
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()
I would like to know what would go in camera.py, views.py, urls.py, and the HTML template. Thanks so much!
Your camera.py will not work in a Django website as-is. In your current code you fetch video directly from the device running the script, so if you run camera.py in a client-server environment it will look for a camera on the server, while what you actually want is to capture the stream from the user's browser.
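(For completeness: if this only needs to run as a local demo, where the browser and the camera are on the same machine as the Django dev server, the common pattern is a view that returns a StreamingHttpResponse with content type multipart/x-mixed-replace, yielding one JPEG-encoded frame per part, and an &lt;img&gt; tag in the template pointing at that URL. A minimal sketch of just the chunk framing, with the boundary name "frame" chosen arbitrarily:)

```python
def mjpeg_chunks(jpeg_frames):
    # Wrap each JPEG-encoded frame (bytes, e.g. from cv2.imencode)
    # in a multipart/x-mixed-replace part. In views.py you would return
    #   StreamingHttpResponse(mjpeg_chunks(...),
    #       content_type='multipart/x-mixed-replace; boundary=frame')
    for jpeg in jpeg_frames:
        yield (b"--frame\r\n"
               b"Content-Type: image/jpeg\r\n\r\n"
               + jpeg + b"\r\n")
```

This still captures from the server's camera, so it does not solve the remote-user case below.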
So, to achieve what you want, you need to set up a socket connection with the user and continuously capture frames in the browser and send them to your server, or use WebRTC for the video capture.
This is a fairly involved process that I can't fully demonstrate in code here; searching those topics should give you some insight.
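(If you go the socket/WebSocket route, one common approach is to draw the webcam stream onto a &lt;canvas&gt; in the browser and send each frame as a base64 data URL from canvas.toDataURL('image/jpeg'). On the server, a small helper, hypothetical name below, recovers the raw JPEG bytes before you hand them to cv2.imdecode and your detection pipeline:)

```python
import base64

def data_url_to_jpeg(data_url):
    # Split "data:image/jpeg;base64,<payload>" into raw JPEG bytes.
    # These bytes can then be decoded with
    # cv2.imdecode(np.frombuffer(jpeg, np.uint8), cv2.IMREAD_COLOR).
    header, _, payload = data_url.partition(",")
    if "base64" not in header:
        raise ValueError("expected a base64-encoded data URL")
    return base64.b64decode(payload)
```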