Transcoding with the GCP Transcoder API results in a time gap

1.1k Views Asked by At

I've been trying GCP's Transcoder API and having trouble with the time randomly becoming shorter than specified in certain cases.

The specific cases are as follows:

  1. Specifying startTimeOffset and endTimeOffset (cut off 2 seconds before and after the video)
  2. fMP4 is used as container
  3. input video is mp4 with screen recording on iPad Pro

For example, if I don't specify startTimeOffset and endTimeOffset, the time will not be shortened. Also, there is no problem when MPEG2-TS is specified for container. There may be a problem with the video itself, but I haven't found a clue how to set it up.

I'm not sure whether this is a problem with the Transcoder API or with me.

The test input video: https://gofile.io/d/DUT9rr

❯ ffprobe input.mp4
ffprobe version 4.3.1 Copyright (c) 2007-2020 the FFmpeg developers
  built with Apple clang version 12.0.0 (clang-1200.0.32.28)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.3.1_8 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 1
    compatible_brands: isommp41mp42
    creation_time   : 2021-02-26T15:08:58.000000Z
  Duration: 00:02:51.15, start: 0.000000, bitrate: 551 kb/s
    Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 246 kb/s (default)
    Metadata:
      creation_time   : 2021-02-26T15:08:58.000000Z
      handler_name    : Core Media Audio
    Stream #0:1(und): Video: h264 (High) (avc1 / 0x31637661), yuvj420p(pc, bt709/bt709/iec61966-2-1), 1920x1342, 302 kb/s, 12.39 fps, 120 tbr, 600 tbn, 1200 tbc (default)
    Metadata:
      rotate          : 180
      creation_time   : 2021-02-26T15:08:58.000000Z
      handler_name    : Core Media Video
    Side data:
      displaymatrix: rotation of -180.00 degrees
#!/bin/bash -eu

cat > request.json << EOF
{
  "config": {
    "inputs": [
      {
        key: "input0"
      }
    ],
    "editList": [
      {
        "key": "atom0",
        "inputs": [
          "input0"
        ],
        "startTimeOffset": "2s",
        "endTimeOffset": "169s",
      },
    ],
    "elementaryStreams": [
      {
        "videoStream": {
          "codec": "h265",
          "heightPixels": 480,
          "bitrateBps": 1200000,
          "rateControlMode": "vbr",
          "enableTwoPass": true,
          "frameRate": 30,
          "crfLevel": 31,
          "gopDuration": "3.0s",
        },
        "key": "h265-stream0"
      },
      {
        "videoStream": {
          "codec": "h265",
          "heightPixels": 720,
          "bitrateBps": 1550000,
          "rateControlMode": "vbr",
          "enableTwoPass": true,
          "frameRate": 30,
          "crfLevel": 31,
          "gopDuration": "3.0s",
        },
        "key": "h265-stream1"
      },
      {
        "videoStream": {
          "codec": "h265",
          "heightPixels": 1080,
          "bitrateBps": 2600000,
          "rateControlMode": "vbr",
          "enableTwoPass": true,
          "frameRate": 30,
          "crfLevel": 31,
          "gopDuration": "3.0s",
        },
        "key": "h265-stream2"
      },
      {
        "audioStream": {
          "codec": "aac",
          "bitrateBps": 64000,
          "channelCount": 2,
          "channelLayout": [
            "fl",
            "fr"
          ],
          "sampleRateHertz": 48000
        },
        "key": "audio-stream0"
      },
    ],
    "muxStreams": [
      {
        "key": "media-sd",
        "fileName": "media-sd.m4s",
        "container": "fmp4",
        "elementaryStreams": [
          "h265-stream0",
        ],
        "segmentSettings": {
          "individualSegments": true
        },
      },
      {
        "key": "media-hd",
        "fileName": "media-hd.m4s",
        "container": "fmp4",
        "elementaryStreams": [
          "h265-stream1",
        ],
        "segmentSettings": {
          "individualSegments": true
        },
      },
      {
        "key": "media-fhd",
        "fileName": "media-fhd.m4s",
        "container": "fmp4",
        "elementaryStreams": [
          "h265-stream2",
        ],
        "segmentSettings": {
          "individualSegments": true
        },
      },
      {
        "key": "audio-only",
        "fileName": "audio-only.m4s",
        "container": "fmp4",
        "elementaryStreams": [
          "audio-stream0"
        ],
        "segmentSettings": {
          "individualSegments": true
        },
      },
    ],
    "manifests": [
      {
        "fileName": "manifest-h265.mpd",
        "type": "DASH",
        "muxStreams": [
          "media-sd",
          "media-hd",
          "media-fhd",
          "audio-only",
        ]
      },
    ]
  }
}
EOF

curl -X POST \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
https://transcoder.googleapis.com/v1beta1/projects/MY_PROJECT/locations/asia-east1/jobTemplates?jobTemplateId=test-template
#!/bin/bash -eu

cat > request.json << EOF
{
  "inputUri": "gs://my-bucket/input.mp4",
  "outputUri": "gs://my-bucket/output/",
  "templateId": "test-template"
}
EOF

curl -X POST \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
https://transcoder.googleapis.com/v1beta1/projects/MY_PROJECT/locations/asia-east1/jobs

The following is a ffprobe of the resulting manifest file, which is 2 seconds shorter than specified. (expected: 00:02:47.00, actual: 00:02:45.00) In this case, it's a 2-second gap, but it can be 10 seconds or 30 seconds, and it varies from video to video.

❯ ffprobe manifest-h265.mpd
ffprobe version 4.3.1 Copyright (c) 2007-2020 the FFmpeg developers
  built with Apple clang version 12.0.0 (clang-1200.0.32.28)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.3.1_8 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Input #0, dash, from 'manifest-h265.mpd':
  Duration: 00:02:45.00, start: 0.000000, bitrate: 0 kb/s
  Program 0
    Stream #0:0: Video: hevc (Main) (hvc1 / 0x31637668), yuv420p(tv, bt709/unknown/unknown), 686x480, 112 kb/s, 30 fps, 120 tbr, 10k tbn, 30 tbc
    Metadata:
      variant_bitrate : 113679
      id              : 113679
    Stream #0:1: Video: hevc (Main) (hvc1 / 0x31637668), yuv420p(tv, bt709/unknown/unknown), 1030x720, 205 kb/s, 30 fps, 120 tbr, 10k tbn, 30 tbc
    Metadata:
      variant_bitrate : 189219
      id              : 189219
    Stream #0:2: Video: hevc (Main) (hvc1 / 0x31637668), yuv420p(tv, bt709/unknown/unknown), 1544x1080, 384 kb/s, 30 fps, 120 tbr, 10k tbn, 30 tbc
    Metadata:
      variant_bitrate : 358043
      id              : 358043
    Stream #0:3: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 65 kb/s
    Metadata:
      variant_bitrate : 70245
      id              : 70245

The following is the time as specified for h264 + MPEG2-TS + Apple HLS.

❯ ffprobe manifest-h264.m3u8 | pbcopy
ffprobe version 4.3.1 Copyright (c) 2007-2020 the FFmpeg developers
  built with Apple clang version 12.0.0 (clang-1200.0.32.28)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.3.1_8 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
[hls @ 0x7fe23100f200] Opening 'h264-sd-ts.m3u8' for reading
[hls @ 0x7fe23100f200] Skip ('#EXT-X-VERSION:4')
[hls @ 0x7fe23100f200] Opening 'h264-hd-ts.m3u8' for reading
[hls @ 0x7fe23100f200] Skip ('#EXT-X-VERSION:4')
[hls @ 0x7fe23100f200] Opening 'h264-fhd-ts.m3u8' for reading
[hls @ 0x7fe23100f200] Skip ('#EXT-X-VERSION:4')
[hls @ 0x7fe23100f200] Opening 'h264-sd0000000000.ts' for reading
[hls @ 0x7fe23100f200] Opening 'h264-hd0000000000.ts' for reading
[hls @ 0x7fe23100f200] Opening 'h264-fhd0000000000.ts' for reading
Input #0, hls, from 'manifest-h264.m3u8':
  Duration: 00:02:47.00, start: 0.000000, bitrate: 0 kb/s
  Program 0
    Metadata:
      variant_bitrate : 511576
    Stream #0:0: Video: h264 (High) ([27][0][0][0] / 0x001B), yuv420p, 686x480, 120 tbr, 90k tbn, 2000k tbc
    Metadata:
      variant_bitrate : 511576
    Stream #0:1: Audio: aac (LC) ([15][0][0][0] / 0x000F), 48000 Hz, stereo, fltp
    Metadata:
      variant_bitrate : 511576
  Program 1
    Metadata:
      variant_bitrate : 793711
    Stream #0:2: Video: h264 (High) ([27][0][0][0] / 0x001B), yuv420p, 1030x720, 120 tbr, 90k tbn, 2000k tbc
    Metadata:
      variant_bitrate : 793711
    Stream #0:3: Audio: aac (LC) ([15][0][0][0] / 0x000F), 48000 Hz, stereo, fltp
    Metadata:
      variant_bitrate : 793711
  Program 2
    Metadata:
      variant_bitrate : 1305288
    Stream #0:4: Video: h264 (High) ([27][0][0][0] / 0x001B), yuv420p, 1544x1080, 120 tbr, 90k tbn, 2000k tbc
    Metadata:
      variant_bitrate : 1305288
    Stream #0:5: Audio: aac (LC) ([15][0][0][0] / 0x000F), 48000 Hz, stereo, fltp
    Metadata:
      variant_bitrate : 1305288
0

There are 0 best solutions below