MediaPipe Pose Landmarker: how to play a sound when a certain pose is detected

First, I'll apologize and explain that I'm a UX designer, not really a developer, though I have a basic understanding of HTML, CSS, and JavaScript.

I want to create a super simple web app based on this CodePen, which uses MediaPipe Pose Landmarker:

https://codepen.io/mediapipe-preview/pen/abRLMxN

const video = document.getElementById("webcam") as HTMLVideoElement;
const canvasElement = document.getElementById(
  "output_canvas"
) as HTMLCanvasElement;
const canvasCtx = canvasElement.getContext("2d");
const drawingUtils = new DrawingUtils(canvasCtx);

// Check if webcam access is supported.
const hasGetUserMedia = () => !!navigator.mediaDevices?.getUserMedia;

// If webcam supported, add event listener to button for when user
// wants to activate it.
if (hasGetUserMedia()) {
  enableWebcamButton = document.getElementById("webcamButton");
  enableWebcamButton.addEventListener("click", enableCam);
} else {
  console.warn("getUserMedia() is not supported by your browser");
}

// Enable the live webcam view and start detection.
function enableCam(event) {
  if (!poseLandmarker) {
    console.log("Wait! poseLandmarker not loaded yet.");
    return;
  }

  if (webcamRunning === true) {
    webcamRunning = false;
    enableWebcamButton.innerText = "ENABLE PREDICTIONS";
  } else {
    webcamRunning = true;
    enableWebcamButton.innerText = "DISABLE PREDICTIONS";
  }

  // getUserMedia parameters.
  const constraints = {
    video: true
  };

  // Activate the webcam stream.
  navigator.mediaDevices.getUserMedia(constraints).then((stream) => {
    video.srcObject = stream;
    video.addEventListener("loadeddata", predictWebcam);
  });
}

let lastVideoTime = -1;
async function predictWebcam() {
  canvasElement.style.height = videoHeight;
  video.style.height = videoHeight;
  canvasElement.style.width = videoWidth;
  video.style.width = videoWidth;
  // Now let's start detecting the stream.
  if (runningMode === "IMAGE") {
    runningMode = "VIDEO";
    await poseLandmarker.setOptions({ runningMode: "VIDEO" });
  }
  let startTimeMs = performance.now();
  if (lastVideoTime !== video.currentTime) {
    lastVideoTime = video.currentTime;
    poseLandmarker.detectForVideo(video, startTimeMs, (result) => {
      canvasCtx.save();
      canvasCtx.clearRect(0, 0, canvasElement.width, canvasElement.height);
      for (const landmark of result.landmarks) {
        drawingUtils.drawLandmarks(landmark, {
          radius: (data) => DrawingUtils.lerp(data.from!.z, -0.15, 0.1, 5, 1)
        });
        drawingUtils.drawConnectors(landmark, PoseLandmarker.POSE_CONNECTIONS);
      }
      canvasCtx.restore();
    });
  }

  // Call this function again to keep predicting when the browser is ready.
  if (webcamRunning === true) {
    window.requestAnimationFrame(predictWebcam);
  }
}

Essentially, I want to use just the continuous webcam output and have it detect when the user's hand reaches their face. At that point it should play an audio track (an alarm) until the hand is taken away, at which point the alarm stops.
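The play-until-removed behaviour could be sketched roughly as below. This is a minimal sketch, not a tested integration: `AlarmLike` and `updateAlarm` are hypothetical names introduced here, and the interface only lists the members of a standard `HTMLAudioElement` that the function touches, so a real `new Audio(...)` object with `loop = true` should fit it.

```typescript
// Hypothetical minimal view of an audio element; a browser HTMLAudioElement
// has all of these members, so it can be passed in directly.
interface AlarmLike {
  readonly paused: boolean;
  currentTime: number;
  play(): unknown;
  pause(): void;
}

// Hypothetical helper: start the alarm when the hand is near the face,
// stop and rewind it when the hand goes away. Calling it on every frame
// is safe because it only acts when the state actually changes.
function updateAlarm(alarm: AlarmLike, handNearFace: boolean): void {
  if (handNearFace && alarm.paused) {
    alarm.play();
  } else if (!handNearFace && !alarm.paused) {
    alarm.pause();
    alarm.currentTime = 0; // rewind so the next alarm starts from the beginning
  }
}
```

Under these assumptions, `updateAlarm` would be called once per frame from inside the `detectForVideo` callback. Browsers may refuse to play audio before the user has interacted with the page, so starting detection from the existing button click is probably enough, but that is worth checking.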

I feel like this is probably super simple, but I can't find the documentation on how to read the output for a certain pose, or the locations of specific landmarks (the hand, for instance). With those, the alarm could simply play once the hand passes a certain point on the Y axis of the field of view (probably around 30%).
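For what it's worth, each entry in `result.landmarks` in the code above is one detected pose: an array of 33 points with normalized `x` and `y` values between 0 and 1, where `y` grows downward from the top of the image. In the Pose Landmarker layout, index 0 is the nose and indices 15 and 16 are the left and right wrists. A minimal sketch of both ideas follows; the helper names and the threshold values are assumptions that would need tuning for a given camera setup.

```typescript
// Minimal shape of one landmark (normalized image coordinates).
interface Point {
  x: number;
  y: number;
}

// Indices in the Pose Landmarker 33-point layout.
const NOSE = 0;
const LEFT_WRIST = 15;
const RIGHT_WRIST = 16;

// Hypothetical helper: true when either wrist is closer to the nose than
// maxDist (a fraction of the frame; 0.2 is a guess, not a recommended value).
function isHandNearFace(pose: Point[], maxDist: number = 0.2): boolean {
  const nose = pose[NOSE];
  if (!nose) {
    return false; // no usable pose in this frame
  }
  return [LEFT_WRIST, RIGHT_WRIST].some((i) => {
    const wrist = pose[i];
    if (!wrist) {
      return false;
    }
    const dx = wrist.x - nose.x;
    const dy = wrist.y - nose.y;
    return Math.sqrt(dx * dx + dy * dy) < maxDist;
  });
}

// Hypothetical helper for the simpler "passes a Y line" idea. yLimit is
// measured from the top of the frame, so a smaller y means higher up.
function isHandAboveLine(pose: Point[], yLimit: number = 0.3): boolean {
  return [LEFT_WRIST, RIGHT_WRIST].some((i) => {
    const wrist = pose[i];
    return wrist !== undefined && wrist.y < yLimit;
  });
}
```

Inside the `detectForVideo` callback this might be used as `isHandNearFace(result.landmarks[0] ?? [])`. The distance check has the advantage of following the face if the head moves, while the fixed line is simpler but depends on sitting in the same position.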

CONTEXT: This is meant to be a simple app to get me to stop playing with my beard while working at the computer! The field of view would just be a close-up webcam shot of me sitting at a desk.

Any help would be much appreciated!

Thanks, Gareth

I feel like I was close with this page, but wasn't sure how to translate it:

https://developers.google.com/mediapipe/solutions/vision/pose_landmarker
