What exactly is the presentation time returned by QVideoFrame::startTime() when acquiring from a webcam?


I was able to acquire a QVideoFrame through a QVideoSink from a webcam, but I cannot understand (after a bit of googling) what exactly QVideoFrame::startTime() is returning to me. According to the doc:

qint64 QVideoFrame::startTime() const

Returns the presentation time (in microseconds) when the frame should be displayed.

An invalid time is represented as -1.

What does presentation time refer to? Is it the timestamp of the frame? What should I do with the microseconds returned by this method? I compared them with QDateTime::currentMSecsSinceEpoch(), but they seem unrelated.


There are 2 best solutions below

Mohamd Imran (BEST ANSWER)

The QVideoFrame::startTime() method in Qt Multimedia returns the presentation time of a video frame: the time, in microseconds, at which the frame should be displayed. It is not the timestamp of the frame's creation or acquisition, but the moment the frame should be presented to the user.

The presentation time exists to support synchronization in multimedia applications. It ensures frames are displayed at the correct moments, keeping playback smooth and in sync.

Regarding Timestamp vs Presentation Time:

The timestamp of a frame (for example, the time when it was captured or acquired) is not necessarily the same as its presentation time. The presentation time takes into account factors such as frame rate, synchronization, and playback timing.

As for the comparison with QDateTime::currentMSecsSinceEpoch(): that function returns the number of milliseconds since the epoch (January 1, 1970, UTC), which is a general-purpose wall-clock timestamp. The presentation time from QVideoFrame::startTime(), on the other hand, is specific to the video stream and does not correspond to the current wall-clock time, which is why the two values look unrelated.

Usage:

You would typically use the presentation time to schedule the display of the frame. For example, in a video player, you would use this timestamp to determine when to show a particular frame in the video playback.
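To make the scheduling idea concrete, here is a minimal sketch in plain C++ (not a Qt API). It assumes the PTS is in microseconds relative to the start of the stream, and that you recorded a monotonic-clock instant when playback began; the helper name delayUntilPresentation is hypothetical.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: given a frame's PTS (microseconds, relative to the
// start of the stream) and the steady-clock instant at which playback began,
// compute how long to wait before presenting the frame.
std::chrono::microseconds delayUntilPresentation(
    std::int64_t ptsUs,
    std::chrono::steady_clock::time_point playbackStart)
{
    const auto target = playbackStart + std::chrono::microseconds(ptsUs);
    const auto now = std::chrono::steady_clock::now();
    if (target <= now)
        return std::chrono::microseconds(0); // frame is late: present immediately
    return std::chrono::duration_cast<std::chrono::microseconds>(target - now);
}
```

A player loop would sleep for the returned duration (or drop the frame if it is too late) before handing the frame to the renderer.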

Real-time context:

If you are capturing frames in real time from a webcam for display (for example, in a video conferencing application, live streaming, or a computer vision task), the presentation time can help you synchronize the display of frames, scheduling each one at the appropriate moment for a smoother, more consistent viewing experience.

Note that the presentation time does not necessarily correspond to the time when the frame was captured by the webcam; it relates to when the frame should be shown to the user. If you need the actual capture timestamp, use a different mechanism: one provided by the video capture API, or simply record the time yourself when you acquire each frame.

Also, in real-time applications minimizing latency is crucial. You may need to balance the use of presentation time against the real-time processing needs of your application: use the presentation time to schedule display, but make sure frames are processed and shown promptly.
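Recording your own acquisition timestamp, as suggested above, can be sketched in plain C++ (not Qt-specific). The TimedFrame struct and stampFrame function are hypothetical names for illustration:

```cpp
#include <chrono>
#include <cstdint>

// Minimal sketch: stamp each incoming frame yourself, since startTime()
// gives a presentation time, not a capture time.
struct TimedFrame {
    std::int64_t acquiredUsSinceEpoch; // wall clock, for logging/correlation
    std::int64_t acquiredMonotonicUs;  // monotonic, for latency measurement
};

TimedFrame stampFrame()
{
    using namespace std::chrono;
    TimedFrame t{};
    // Wall-clock timestamp (comparable to QDateTime::currentMSecsSinceEpoch()).
    t.acquiredUsSinceEpoch =
        duration_cast<microseconds>(system_clock::now().time_since_epoch()).count();
    // Monotonic timestamp, immune to system clock adjustments.
    t.acquiredMonotonicUs =
        duration_cast<microseconds>(steady_clock::now().time_since_epoch()).count();
    return t;
}
```

You would call stampFrame() in the slot or callback that receives each frame and store the result alongside it.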

Code example:

qint64 presentationTime = frame.startTime();
if (presentationTime != -1) {
    // A valid presentation time (in microseconds) was provided; use it to
    // schedule the frame's display, or do real-time processing/rendering here.
    displayFrame(frame); // displayFrame() is your own rendering function
} else {
    // -1 means no presentation time is available for this frame.
}

Hope those explanations help you understand the concept.

Luca Carlon

The Presentation Timestamp (PTS) represents the time at which a multimedia element should be presented to the viewer. It is needed, for example, to play the video back at the proper speed and to keep audio and video in sync.

You should use the PTS value to compute when the frame should be presented to the user (whether it is video, audio, or anything else). A frame is typically decoded before it is due to be presented; in that case you will have to keep it in a buffer somewhere, waiting for the proper time.
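The buffering step can be sketched as a small queue ordered by PTS, in plain C++. The names BufferedFrame, EarlierPts, and dueFrames are hypothetical, and the payload field stands in for the actual decoded frame data:

```cpp
#include <cstdint>
#include <queue>
#include <vector>

// Minimal sketch of a presentation buffer: decoded frames wait here,
// ordered by PTS, until their presentation time arrives.
struct BufferedFrame {
    std::int64_t ptsUs; // presentation timestamp in microseconds
    int payload;        // placeholder for the actual frame data
};

struct EarlierPts {
    bool operator()(const BufferedFrame& a, const BufferedFrame& b) const {
        return a.ptsUs > b.ptsUs; // min-heap: smallest PTS on top
    }
};

using PresentationQueue =
    std::priority_queue<BufferedFrame, std::vector<BufferedFrame>, EarlierPts>;

// Pop and return, in PTS order, all frames whose PTS has been reached.
std::vector<BufferedFrame> dueFrames(PresentationQueue& q, std::int64_t nowUs)
{
    std::vector<BufferedFrame> due;
    while (!q.empty() && q.top().ptsUs <= nowUs) {
        due.push_back(q.top());
        q.pop();
    }
    return due;
}
```

The renderer would poll dueFrames() with the current playback position and present whatever it returns, dropping frames that are already far too late.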

It is better to use a monotonic timer rather than a wall-clock date, which may be changed at any moment by the user or by the system.
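A minimal sketch of such a monotonic playback reference, using std::chrono::steady_clock (the PlaybackClock class name is hypothetical):

```cpp
#include <chrono>

// Measure elapsed playback time with a monotonic clock, so that changes to
// the system date (NTP jumps, manual adjustment) cannot disturb scheduling.
class PlaybackClock {
public:
    void start() { start_ = std::chrono::steady_clock::now(); }
    long long elapsedUs() const {
        return std::chrono::duration_cast<std::chrono::microseconds>(
            std::chrono::steady_clock::now() - start_).count();
    }
private:
    std::chrono::steady_clock::time_point start_{};
};
```

Comparing elapsedUs() against each frame's PTS gives a clock-adjustment-proof schedule, unlike QDateTime-based timing.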