When I encode videos with FFmpeg, I would like to put a JPG image before the very first video frame, because when I embed the video on a webpage with the HTML5 "video" tag, it shows the very first picture as a splash image. Alternatively, I want to encode an image to a 1-frame video and concatenate it to my encoded video. I don't want to use the "poster" property of the HTML5 "video" element.
How can I place a still image before the first frame of a video?
15.7k Views Asked by Konstantin
There are 3 best solutions below
On
The answer above works for me but in my case it took too much time to execute (perhaps because it re-encodes the entire video). I found another solution that's much faster. The basic idea is:
- Create a "video" that only has the image.
- Concatenate the above video with the original one, without re-encoding.
Create a video that only has the image:
ffmpeg -loop 1 -framerate 30 -i image.jpg -c:v libx264 -t 3 -pix_fmt yuv420p image.mp4
Note the -framerate 30 option. It has to be the same as the frame rate of the main video. Also, the image should have the same dimensions as the main video. The -t 3 option specifies the length of the video in seconds.
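One way to avoid a mismatch is to read the frame rate and frame size from the main video with ffprobe and pass them to the command above. The following is a minimal sketch, not part of the original answer: the function names parse_video_params and make_image_clip are made up for illustration, the file names are placeholders, and the scale filter is an added assumption that simply stretches the image to the video's size.

```shell
#!/bin/sh
# Sketch only: helper names and file names are illustrative assumptions.

# Read ffprobe "key=value" lines on stdin and print "WIDTH HEIGHT FPS".
parse_video_params() (
    width="" height="" fps=""
    while IFS='=' read -r key value; do
        case "$key" in
            width)        width=$value ;;
            height)       height=$value ;;
            r_frame_rate) fps=$value ;;
        esac
    done
    printf '%s %s %s\n' "$width" "$height" "$fps"
)

# make_image_clip VIDEO IMAGE OUT [SECONDS]
# Creates OUT from IMAGE using the frame rate and size of VIDEO.
make_image_clip() (
    video=$1 image=$2 out=$3 seconds=${4:-3}
    params=$(ffprobe -v error -select_streams v:0 \
        -show_entries stream=width,height,r_frame_rate \
        -of default=noprint_wrappers=1 "$video" | parse_video_params)
    # Split "WIDTH HEIGHT FPS" into the positional parameters $1 $2 $3.
    set -- $params
    [ -n "$3" ] || { echo "could not read video parameters" >&2; exit 1; }
    # Assumption: scale stretches the image; it does not preserve aspect ratio.
    ffmpeg -loop 1 -framerate "$3" -i "$image" \
        -vf "scale=$1:$2" \
        -c:v libx264 -t "$seconds" -pix_fmt yuv420p "$out"
)

# Usage:
#   make_image_clip video.mp4 image.jpg image.mp4 3
```

The frame rate is passed through as the rational that ffprobe reports (for example 30/1), which -framerate accepts.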
Convert the videos to MPEG-2 transport stream
According to the official ffmpeg documentation, only certain file formats can be concatenated using the concat protocol, including MPEG-2 transport streams. Since we have 2 MP4 videos, they can be losslessly converted to MPEG-2 TS:
ffmpeg -i image.mp4 -c copy -bsf:v h264_mp4toannexb -f mpegts image.ts
and for the main video:
ffmpeg -i video.mp4 -c copy -bsf:v h264_mp4toannexb -f mpegts video.ts
Concatenate the MPEG-2 TS files
Now use the following command to concatenate the above intermediate files:
ffmpeg -i "concat:image.ts|video.ts" -c copy -bsf:a aac_adtstoasc output.mp4
Although there are 4 commands to run, combined they're still much faster than re-encoding the entire video.
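The four commands can be chained in a small script. This is only a sketch under assumptions: the function names concat_url and prepend_image and the temporary-directory handling are not from the answer, and it presumes H.264 video and AAC audio in the MP4 files, as the bitstream filters above do.

```shell
#!/bin/sh
# Sketch: function names and temp-file layout are illustrative assumptions.

# Build the input URL for ffmpeg's concat protocol from a list of files,
# e.g. "concat:a.ts|b.ts".
concat_url() (
    url="concat:"
    sep=""
    for f in "$@"; do
        url="$url$sep$f"
        sep="|"
    done
    printf '%s\n' "$url"
)

# prepend_image IMAGE VIDEO OUTPUT [SECONDS] [FPS]
prepend_image() (
    image=$1 video=$2 output=$3 seconds=${4:-3} fps=${5:-30}
    set -e
    tmp=$(mktemp -d)
    trap 'rm -rf "$tmp"' EXIT

    # 1. Still image -> short H.264 clip (FPS must match the main video).
    ffmpeg -loop 1 -framerate "$fps" -i "$image" \
        -c:v libx264 -t "$seconds" -pix_fmt yuv420p "$tmp/image.mp4"
    # 2./3. Remux both MP4 files to MPEG-2 TS without re-encoding.
    ffmpeg -i "$tmp/image.mp4" -c copy -bsf:v h264_mp4toannexb -f mpegts "$tmp/image.ts"
    ffmpeg -i "$video" -c copy -bsf:v h264_mp4toannexb -f mpegts "$tmp/video.ts"
    # 4. Concatenate the TS files and remux the result to MP4.
    ffmpeg -i "$(concat_url "$tmp/image.ts" "$tmp/video.ts")" \
        -c copy -bsf:a aac_adtstoasc "$output"
)

# Usage:
#   prepend_image image.jpg video.mp4 output.mp4 3 30
```

The intermediate files live in a temporary directory that is removed when the function exits, so only the final output is left behind.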
On
My solution: it shows an image for 5 seconds before the video and also scales and pads the video to 1280x720. The image should have a 16:9 aspect ratio.
ffmpeg -i video.mp4 -i image.png -filter_complex "
color=c=black:size=1280x720 [temp]; \
[temp][1:v] overlay=x=0:y=0:enable='between(t,0,5)' [temp]; \
[0:v] setpts=PTS+5/TB, scale=1280x720:force_original_aspect_ratio=decrease, pad=1280:720:-1:-1:color=black [v:0]; \
[temp][v:0] overlay=x=0:y=0:shortest=1:enable='gt(t,5)' [v]; \
[0:a] asetpts=PTS+5/TB [a]" \
-map "[v]" -map "[a]" -preset veryfast output.mp4
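If the splash duration or the frame size needs to change, the filtergraph above can be generated instead of edited by hand. A rough sketch follows; build_splash_filter is a hypothetical helper, and the label names [bg], [splash] and [main] are arbitrary replacements for the labels used above.

```shell
#!/bin/sh
# Sketch: build_splash_filter is a hypothetical helper, not an ffmpeg feature.

# build_splash_filter SECONDS WIDTH HEIGHT
# Prints a filtergraph that mirrors the one above, with the splash duration
# and the frame size as parameters.
build_splash_filter() (
    s=$1 w=$2 h=$3
    # printf reuses the '%s' format for every argument, so the pieces are
    # printed back to back as one string.
    printf '%s' \
        "color=c=black:size=${w}x${h} [bg];" \
        " [bg][1:v] overlay=x=0:y=0:enable='between(t,0,${s})' [splash];" \
        " [0:v] setpts=PTS+${s}/TB," \
        " scale=${w}:${h}:force_original_aspect_ratio=decrease," \
        " pad=${w}:${h}:-1:-1:color=black [main];" \
        " [splash][main] overlay=x=0:y=0:shortest=1:enable='gt(t,${s})' [v];" \
        " [0:a] asetpts=PTS+${s}/TB [a]"
    printf '\n'
)

# Usage:
#   ffmpeg -i video.mp4 -i image.png \
#       -filter_complex "$(build_splash_filter 5 1280 720)" \
#       -map "[v]" -map "[a]" -preset veryfast output.mp4
```

Passing the graph through a double-quoted command substitution keeps the inner single quotes around the enable expressions intact.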
You can use the concat filter to do that. The exact command depends on how long you want your splash screen to be. I am pretty sure you don't want a 1-frame splash screen, which lasts about 1/25 to 1/30 of a second, depending on the video ;)
The Answer
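First, you need to get the frame rate of the video. Try
ffmpeg -i INPUT
and find the tbr value. For example, if the stream information shows 25 tbr, remember this number.
Second, you need to concatenate the image with the video. Try this command:
ffmpeg -loop 1 -framerate FPS -t SECONDS -i IMAGE \
  -t SECONDS -f lavfi -i aevalsrc=0 \
  -i INPUTVIDEO \
  -filter_complex '[0:0] [1:0] [2:0] [2:1] concat=n=2:v=1:a=1' \
  [OPTIONS] OUTPUT
If your video doesn't have audio, leave out the silence input (-t SECONDS -f lavfi -i aevalsrc=0) and concatenate the video streams only.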
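For the no-audio case, the command plausibly takes the form sketched below. This is an assumption pieced together from the option breakdown in this answer, not a copy of the original command: the silence input is dropped and concat gets a=0. The sketch also shows one way to pull the tbr value from step 1 out of ffmpeg's output with sed. The function names extract_tbr and splash_no_audio are made up for illustration.

```shell
#!/bin/sh
# Sketch: function names are hypothetical; the no-audio command is an
# assumption derived from the option breakdown in this answer.

# Read `ffmpeg -i INPUT` output on stdin and print the first tbr value found.
extract_tbr() {
    sed -n 's/.* \([0-9.][0-9.]*\) tbr.*/\1/p' | head -n 1
}

# splash_no_audio FPS SECONDS IMAGE INPUTVIDEO OUTPUT
# Assumes stream 0 of INPUTVIDEO is the video stream.
splash_no_audio() {
    ffmpeg -loop 1 -framerate "$1" -t "$2" -i "$3" \
        -i "$4" \
        -filter_complex '[0:0] [1:0] concat=n=2:v=1:a=0' \
        "$5"
}

# Usage (ffmpeg prints stream information to stderr, hence 2>&1):
#   fps=$(ffmpeg -i input.mp4 2>&1 | extract_tbr)
#   splash_no_audio "$fps" 5 splash.jpg input.mp4 output.mp4
```

The sed pattern only matches plain numeric values such as 25 or 29.97, so anything ffmpeg prints in another form would have to be entered by hand.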
- FPS = the tbr value from step 1
- SECONDS = duration you want the image to be shown
- IMAGE = the image name
- INPUTVIDEO = the original video name
- [OPTIONS] = optional encoding parameters (such as -vcodec libx264 or -b:a 160k)
- OUTPUT = the output video file name
How Does This Work?
Let's split the command line I used:
- -loop 1 -framerate FPS -t SECONDS -i IMAGE: this basically means: open the image, and loop over it to make it a video that lasts SECONDS seconds at FPS frames per second. It needs to have the same FPS as the input video because the concat filter we will use later has a restriction on it.
- -t SECONDS -f lavfi -i aevalsrc=0: this means: generate silence for SECONDS (0 means silence). You need silence to fill up the time for the splash image. This isn't needed if the original video doesn't have audio.
- -i INPUTVIDEO: open the video itself.
- -filter_complex '[0:0] [1:0] [2:0] [2:1] concat=n=2:v=1:a=1': this is the best part. You open file 0 stream 0 (the image-video), file 1 stream 0 (the silence audio), file 2 streams 0 and 1 (the real input video and audio), and concatenate them together. The options n, v, and a mean that there are 2 segments, 1 output video, and 1 output audio.
- [OPTIONS] OUTPUT: this just means to encode the video to the output file name. If you are using HTML5 streaming, you'd probably want to use -c:v libx264 -crf 23 -c:a libfdk_aac (or -c:a libfaac) -b:a 128k for H.264 video and AAC audio.
Further information
- The image2 demuxer, which is the core of the magic behind -loop 1.
- The concat filter documentation is also helpful.