I have been studying PTS values in .mp4 media files. The PTS of each frame in the video stream can be printed with the ffmpeg CLI using
ffmpeg -hide_banner -i input.mp4 -vf "showinfo" -f null -
where input.mp4 is a placeholder for the file being inspected. For a sample .mp4 I downloaded from the internet, this gives the following output.
Press [q] to stop, [?] for help
[Parsed_showinfo_0 @ 0x3741c00] config in time_base: 1/30, frame_rate: 30/1
[Parsed_showinfo_0 @ 0x3741c00] config out time_base: 0/0, frame_rate: 0/0
[Parsed_showinfo_0 @ 0x3741c00] n: 0 pts: 0 pts_time:0 pos: 58852 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:1 type:I checksum:49058BA3 plane_checksum:[E852D7DE 07E2B7D4 EA12FBD3] mean:[75 123 124] stdev:[52.7 4.8 11.7]
[Parsed_showinfo_0 @ 0x3741c00] side data - User Data Unregistered:
[Parsed_showinfo_0 @ 0x3741c00] UUID=dc45e9bd-e6d9-48b7-962c-d820d923eeef
[Parsed_showinfo_0 @ 0x3741c00] User Data=78323634202d20636f726520313535207231302062303062636166202d20482e3236342f4d5045472d342041564320636f646563202d20436f70796c65667420323030332d32303137202d20687474703a2f2f7777772e766964656f6c616e2e6f72672f783236342e68746d6c202d206f7074696f6e733a2063616261633d31207265663d34206465626c6f636b3d313a303a3020616e616c7973653d3078333a3078313133206d653d686578207375626d653d38207073793d31207073795f72643d312e30303a302e3030206d697865645f7265663d31206d655f72616e67653d3136206368726f6d615f6d653d31207472656c6c69733d32203878386463743d312063716d3d3020646561647a6f6e653d32312c313120666173745f70736b69703d31206368726f6d615f71705f6f66667365743d2d3220746872656164733d3334206c6f6f6b61686561645f746872656164733d3520736c696365645f746872656164733d30206e723d3020646563696d6174653d3120696e7465726c616365643d3020626c757261795f636f6d7061743d302073746974636861626c653d3120636f6e73747261696e65645f696e7472613d3020626672616d65733d3320625f707972616d69643d3220625f61646170743d3220625f626961733d30206469726563743d3320776569676874623d31206f70656e5f676f703d3020776569676874703d32206b6579696e743d696e66696e697465206b6579696e745f6d696e3d3330207363656e656375743d343020696e7472615f726566726573683d302072635f6c6f6f6b61686561643d35302072633d3270617373206d62747265653d3120626974726174653d353030302072617465746f6c3d312e302071636f6d703d302e36302071706d696e3d352071706d61783d3639207170737465703d342063706c78626c75723d32302e302071626c75723d302e35207662765f6d6178726174653d35353030207662765f62756673697a653d3135303030206e616c5f6872643d6e6f6e652066696c6c65723d302069705f726174696f3d312e34302061713d313a312e303000
[Parsed_showinfo_0 @ 0x3741c00]
[Parsed_showinfo_0 @ 0x3741c00] color_range:tv color_space:bt709 color_primaries:bt709 color_trc:bt709
Output #0, null, to 'pipe:':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: mp42mp41isomavc1
encoder : Lavf59.27.100
Stream #0:0(und): Video: wrapped_avframe, yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 30 fps, 30 tbn (default)
Metadata:
creation_time : 2018-01-23T22:02:00.000000Z
handler_name : L-SMASH Video Handler
vendor_id : [0][0][0][0]
encoder : Lavc59.37.100 wrapped_avframe
Stream #0:1(und): Audio: pcm_s16le, 48000 Hz, mono, s16, 768 kb/s (default)
Metadata:
creation_time : 2018-01-23T22:02:00.000000Z
handler_name : L-SMASH Audio Handler
vendor_id : [0][0][0][0]
encoder : Lavc59.37.100 pcm_s16le
[Parsed_showinfo_0 @ 0x3741c00] n: 1 pts: 1 pts_time:0.0333333 pos: 149037 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:B checksum:2005D769 plane_checksum:[92BD4F7B 3501F48D 0CAA9352] mean:[75 124 124] stdev:[52.5 4.7 11.7]
[Parsed_showinfo_0 @ 0x3741c00] color_range:tv color_space:bt709 color_primaries:bt709 color_trc:bt709
[Parsed_showinfo_0 @ 0x3741c00] n: 2 pts: 2 pts_time:0.0666667 pos: 139805 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:B checksum:09AFB702 plane_checksum:[3E62184D 9D0A0753 8BF09762] mean:[75 124 124] stdev:[52.4 4.6 11.5]
[Parsed_showinfo_0 @ 0x3741c00] color_range:tv color_space:bt709 color_primaries:bt709 color_trc:bt709
[Parsed_showinfo_0 @ 0x3741c00] n: 3 pts: 3 pts_time:0.1 pos: 157017 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:B checksum:99F05FA9 plane_checksum:[FFA84276 7A3D6D59 0290AFCB] mean:[75 124 124] stdev:[52.2 4.5 11.3]
[Parsed_showinfo_0 @ 0x3741c00] color_range:tv color_space:bt709 color_primaries:bt709 color_trc:bt709
[Parsed_showinfo_0 @ 0x3741c00] n: 4 pts: 4 pts_time:0.133333 pos: 117259 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:P checksum:00935CD8 plane_checksum:[F81E097E 5F17005D B01452FD] mean:[74 124 124] stdev:[52.2 4.5 11.3]
[Parsed_showinfo_0 @ 0x3741c00] color_range:tv color_space:bt709 color_primaries:bt709 color_trc:bt709
[Parsed_showinfo_0 @ 0x3741c00] n: 5 pts: 5 pts_time:0.166667 pos: 197428 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:B checksum:30E77B4C plane_checksum:[393DAA75 DFA88599 E2164B2F] mean:[74 124 125] stdev:[52.3 4.4 11.0]
[Parsed_showinfo_0 @ 0x3741c00] color_range:tv color_space:bt709 color_primaries:bt709 color_trc:bt709
[Parsed_showinfo_0 @ 0x3741c00] n: 6 pts: 6 pts_time:0.2 pos: 187073 fmt:yuv420p sar:1/1 s:1920x1080 i:P iskey:0 type:B checksum:BD5C25BC plane_checksum:[CC66DD70 F4ACA5DB 955DA253] mean:[75 124 125] stdev:[52.2 4.4 10.8]
As seen above, ffmpeg reports a PTS of 0 for the first frame. However, when I derive the timestamps from the stts and ctts entries in the MP4 headers with the help of ParseTimingInfoInMp4.py, I get a different PTS for the first frame (0.0667 s), as seen below.
ftyp size 32
mvhd size 108
iods size 42
tkhd size 92
edts size 36
mdhd size 32
Trak type: b'vide'
Video Trak Number 0 found
video track timescale is 30
mdhd size 32
hdlr size 54
vmhd size 20
dinf size 36
stsd size 195
stts size 2944 ctts size 2944
0 dts = 0.0000 s, pts = 0.0667 s, diff in ms 66.67
1 dts = 0.0333 s, pts = 0.2000 s, diff in ms 166.67
2 dts = 0.0667 s, pts = 0.1333 s, diff in ms 66.67
3 dts = 0.1000 s, pts = 0.1000 s, diff in ms 0.00
4 dts = 0.1333 s, pts = 0.1667 s, diff in ms 33.33
5 dts = 0.1667 s, pts = 0.3333 s, diff in ms 166.67
6 dts = 0.2000 s, pts = 0.2667 s, diff in ms 66.67
7 dts = 0.2333 s, pts = 0.2333 s, diff in ms 0.00
8 dts = 0.2667 s, pts = 0.3000 s, diff in ms 33.33
9 dts = 0.3000 s, pts = 0.4333 s, diff in ms 133.33
10 dts = 0.3333 s, pts = 0.3667 s, diff in ms 33.33
MP4Analyser shows the following entries for stts, ctts, and edts for the video track.

The sample file I have been using can be found in Sample mp4.
Can someone please help me understand:
- Why are the PTS values shown by ffmpeg different from the PTS derived from stts and ctts?
- What is the correct process for deriving PTS from the stts, ctts and edts entries in the MP4 header?
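By default, ffmpeg offsets the timestamps so that they start from 0. Add -copyts to avoid this.
Next, the decoder reorders frames into presentation order, which differs from storage order when the stream has B-frames. The script output in the question lists samples in storage (decode) order, while showinfo prints frames in presentation order, so the two listings cannot be compared row by row.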
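The reordering can be reproduced from the numbers in the question. The following is a minimal Python sketch, not the code ffmpeg itself uses: it expands run-length (sample_count, value) entries into per-sample DTS and PTS ticks, then sorts the samples by PTS. The function names are hypothetical, and the STTS and CTTS values are back-computed from the script output in the question (timescale 30), not read from the file.

```python
def sample_timestamps(stts, ctts):
    """Expand run-length stts/ctts entries into per-sample (dts, pts) ticks.

    stts: list of (sample_count, sample_delta)
    ctts: list of (sample_count, composition_offset)
    """
    deltas = [delta for count, delta in stts for _ in range(count)]
    offsets = [off for count, off in ctts for _ in range(count)]
    timestamps = []
    dts = 0
    # zip stops at the shorter list, so both tables must cover the same samples
    for delta, offset in zip(deltas, offsets):
        timestamps.append((dts, dts + offset))
        dts += delta
    return timestamps


def presentation_order(timestamps):
    """Return storage-order sample indices sorted by PTS."""
    return sorted(range(len(timestamps)), key=lambda i: timestamps[i][1])


# First 11 samples, reconstructed from the dts/pts listing in the question.
STTS = [(11, 1)]
CTTS = [(1, 2), (1, 5), (1, 2), (1, 0), (1, 1), (1, 5),
        (1, 2), (1, 0), (1, 1), (1, 4), (1, 1)]

if __name__ == "__main__":
    ts = sample_timestamps(STTS, CTTS)
    for n, i in enumerate(presentation_order(ts)):
        dts, pts = ts[i]
        print(f"frame {n}: sample {i}, dts={dts}, pts={pts}")
```

With these values the samples are presented in the order 0, 3, 2, 4, 1, and so on. The first presented frame is sample 0, whose PTS of 2 ticks (0.0667 s) is the value that ffmpeg reports as 0.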
Edit list adjustment can get complex, so I'll address only the two simple cases. An initial dwell is applied as an offset to all timestamps. An initial edit (trimming of the start of the media) leads to negative timestamps for all samples before the edit start point.
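The two simple cases can be sketched in Python as follows. This is an illustration under stated assumptions, not a full edit list implementation: the function name is hypothetical, the initial dwell is modelled as a leading entry with media_time of -1, only the first non-empty entry is honoured, and the edit values in the demo are made up rather than read from the sample file. Segment durations are taken to be in the movie timescale and media_time in the media timescale.

```python
from fractions import Fraction


def apply_simple_edit_list(pts_ticks, media_timescale, movie_timescale, edits):
    """Map media PTS ticks to presentation times in seconds.

    edits: list of (segment_duration, media_time) pairs.
    Handles only leading empty entries plus the first real edit.
    """
    offset = Fraction(0)   # accumulated initial delay, in seconds
    start = 0              # media time at which presentation begins, in ticks
    for segment_duration, media_time in edits:
        if media_time == -1:
            # Assumption: a leading entry with media_time -1 delays everything
            offset += Fraction(segment_duration, movie_timescale)
        else:
            start = media_time
            break
    # Samples with pts < start come out negative (trimmed from the start)
    return [Fraction(p - start, media_timescale) + offset for p in pts_ticks]


if __name__ == "__main__":
    pts = [2, 6, 4, 3, 5]  # first five samples from the question, timescale 30
    # Hypothetical edit starting at media time 2 (movie timescale 1000)
    for t in apply_simple_edit_list(pts, 30, 1000, [(1000, 2)]):
        print(float(t))
```

If the file's edit list happened to start at media time 2, as in the hypothetical demo, the first sample would land at 0 s. A larger start value would push the earliest samples to negative times.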