Why does the RTP timestamp for video payloads use a 90 kHz clock rate?


I find that many RFCs say:

A 90 kHz clock rate MUST be used.

But I don't understand the underlying reason for this.


There are 3 best solutions below

0
On BEST ANSWER

You can find the answer in "RTP: Audio and Video for the Internet" by Colin Perkins, p. 154.

In short, the rate is chosen so that the frame rates common to the majority of formats give an integer timestamp increment per frame. The division can still leave a remainder in some cases, but it will be negligible.
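To make the arithmetic concrete, here is a minimal sketch that computes the per-frame timestamp increment at 90 kHz. The rate table and the helper name `ticks_per_frame` are my own illustration, and it assumes the NTSC rate is exactly 30000/1001 Hz rather than the rounded 29.97:

```python
from fractions import Fraction

CLOCK_RATE = 90_000  # RTP video clock rate in Hz

# Illustrative frame rates. Assumption: NTSC is exactly 30000/1001 Hz.
FRAME_RATES = {
    "24": Fraction(24),
    "25": Fraction(25),
    "29.97 (30000/1001)": Fraction(30000, 1001),
    "30": Fraction(30),
}


def ticks_per_frame(frame_rate, clock_rate=CLOCK_RATE):
    """Return the exact number of clock ticks between consecutive frames."""
    return Fraction(clock_rate) / frame_rate


increments = {name: ticks_per_frame(rate) for name, rate in FRAME_RATES.items()}
for name, inc in increments.items():
    print(f"{name:>20} Hz -> {inc} ticks per frame")

# Taking the rounded value 29.97 literally leaves a small remainder.
rounded_ntsc = ticks_per_frame(Fraction("29.97"))
print(f"literal 29.97 Hz -> {float(rounded_ntsc):.6f} ticks per frame")

# For comparison, a hypothetical 1 kHz clock does not divide evenly at 24 Hz.
coarse = ticks_per_frame(Fraction(24), clock_rate=1000)
print(f"24 Hz at 1 kHz -> {coarse} ticks per frame")
```

With the exact 30000/1001 rate the increment is a whole number of ticks. With the rounded 29.97 the result is off by a few thousandths of a tick, which is the negligible remainder mentioned above.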

0
On

The 90 kHz in RTP is derived from the presentation time stamp (PTS) in an MPEG transport stream. The PTS is used to synchronize a program's separate streams, e.g. video, audio, and subtitles.

0
On

I think this explanation in RFC 3551 is more convincing:

All of these video encodings use an RTP timestamp frequency of 90,000 Hz, the same as the MPEG presentation time stamp frequency. This frequency yields exact integer timestamp increments for the typical 24 (HDTV), 25 (PAL), and 29.97 (NTSC) and 30 Hz (HDTV) frame rates and 50, 59.94 and 60 Hz field rates. While 90 kHz is the RECOMMENDED rate for future video encodings used within this profile, other rates MAY be used. However, it is not sufficient to use the video frame rate (typically between 15 and 30 Hz) because that does not provide adequate resolution for typical synchronization requirements when calculating the RTP timestamp corresponding to the NTP timestamp in an RTCP SR packet. The timestamp resolution MUST also be sufficient for the jitter estimate contained in the receiver reports.

For most of these video encodings, the RTP timestamp encodes the sampling instant of the video image contained in the RTP data packet. If a video image occupies more than one packet, the timestamp is the same on all of those packets. Packets from different video images are distinguished by their different timestamps.

Most of these video encodings also specify that the marker bit of the RTP header SHOULD be set to one in the last packet of a video frame and otherwise set to zero. Thus, it is not necessary to wait for a following packet with a different timestamp to detect that a new frame should be displayed.
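As a rough illustration of the last two quoted paragraphs, the sketch below groups packets into frames using the shared timestamp and the marker bit. The `Packet` class and `assemble_frames` function are hypothetical names, not part of any RTP library, and the sketch assumes packets arrive in order with no loss or reordering:

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical minimal view of an RTP packet (not a real library type)."""
    seq: int
    timestamp: int  # 90 kHz ticks, identical for all packets of one frame
    marker: bool    # set on the last packet of a frame
    payload: bytes


def assemble_frames(packets):
    """Yield (timestamp, data) per frame, assuming in-order, loss-free input."""
    current_ts = None
    chunks = []
    for pkt in packets:
        if current_ts is not None and pkt.timestamp != current_ts:
            # Fallback: a timestamp change also ends the previous frame,
            # for encodings or senders that do not set the marker bit.
            yield current_ts, b"".join(chunks)
            chunks = []
        current_ts = pkt.timestamp
        chunks.append(pkt.payload)
        if pkt.marker:
            # Marker bit set: the frame is complete now, with no need to
            # wait for the next packet.
            yield current_ts, b"".join(chunks)
            current_ts, chunks = None, []
    # Any trailing packets without a marker are left unemitted, since the
    # frame cannot yet be known to be complete.


packets = [
    Packet(1, 0, False, b"a"),
    Packet(2, 0, True, b"b"),      # end of frame at timestamp 0
    Packet(3, 3000, False, b"c"),  # frame sent without a marker bit
    Packet(4, 6000, True, b"d"),
]
frames = list(assemble_frames(packets))
print(frames)
```

A real receiver would also need to handle sequence-number gaps, reordering, and timestamp wraparound, which this sketch leaves out.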