When producing H.264 frames and decoding them using pyAV, packets are parsed from the frame data only when the parse method is invoked twice.
Consider the following test H.264 input, created using:
ffmpeg -f lavfi -i testsrc=duration=10:size=1280x720:rate=30 -f image2 -vcodec libx264 -bsf h264_mp4toannexb -force_key_frames source -x264-params keyint=1:scenecut=0 "frame-%4d.h264"
Now, using pyAV to parse the first frame:
import av
codec = av.CodecContext.create('h264', 'r')
with open('/path/to/frame-0001.h264', 'rb') as file_handler:
    chunk = file_handler.read()
packets = codec.parse(chunk)  # This line needs to be invoked twice to parse packets
packets remains empty unless the last line is invoked again (packets = codec.parse(chunk)).
Also, for various real-life examples that I cannot characterize, decoding frames from packets seems to require several decode invocations as well:
packet = packets[0]
frames = codec.decode(packet) # This line needs to be invoked 2-3 times to actually receive frames.
Does anyone know anything about this inconsistent behavior of pyAV?
(Using Python 3.8.12 on macOS Monterey 12.3.1, ffmpeg 4.4.1, pyAV 9.0.2)
This is expected PyAV behavior. Not only that, it is the expected behavior of the underlying libav. One packet does not guarantee a frame, and multiple packets may be needed before a frame is produced. This is apparent in FFmpeg's video decoder example: if the decoder needs more packets to form a frame, it reports the EAGAIN error.
[edit]
Actually, the above example is not a good illustration, as it just exits on EAGAIN. To retrieve a frame, it should rather continue on EAGAIN and keep feeding packets.
[edit]
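On the PyAV side, the same "keep feeding until a frame comes out" idea can be sketched as below. This is only an illustration, not code from PyAV or FFmpeg: the helper name decode_all is hypothetical, and it assumes (per the PyAV documentation) that codec.decode() returns a possibly empty list of frames and that passing None flushes the decoder's buffers.

```python
def decode_all(codec, packets):
    """Feed every packet to the decoder, then flush it.

    `codec` is assumed to behave like an av.CodecContext opened for reading.
    `decode_all` is a hypothetical helper name used for illustration only.
    """
    frames = []
    for packet in packets:
        # An empty list is normal here: the decoder may still be
        # buffering and waiting for more packets before emitting a frame.
        frames.extend(codec.decode(packet))
    # Assumption: decode(None) flushes frames still held by the decoder.
    frames.extend(codec.decode(None))
    return frames
```

With this shape, the caller never has to guess how many times decode() must be invoked; frames simply arrive later than the packets that produced them.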
PyAV's codec.parse()
The decoding sometimes needing additional calls is a fairly well-known fact, but the parser needing to flush is less common. Here is the difference between PyAV and FFmpeg:
PyAV parses the input data with av_parser_parse2() [ref]. It reads until the input data is 100% consumed; note that it does not call av_parser_parse2 at the end of the buffer (which makes sense, as the input data may be only a part of the stream data).
In contrast, FFmpeg does not call av_parser_parse2 directly and uses parse_packet, which handles the similar situation differently: it also calls av_parser_parse2 to flush the stream after the input data stream is exhausted. So, you need to do the same in PyAV: after all your frames are read, call codec.parse() one last time to flush the last packet.
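A minimal sketch of that final flush, assuming (as the PyAV documentation describes) that calling codec.parse() with no argument, or with None, flushes the parser's buffer. The helper name parse_and_flush is hypothetical, and the commented usage reuses the placeholder path from the question.

```python
def parse_and_flush(codec, chunks):
    """Parse each chunk of the byte stream, then flush the parser.

    `codec` is assumed to behave like an av.CodecContext opened for reading.
    `parse_and_flush` is a hypothetical helper name used for illustration only.
    """
    packets = []
    for chunk in chunks:
        # May return fewer packets than expected: the parser keeps the
        # trailing data until it sees the start of the next packet.
        packets.extend(codec.parse(chunk))
    # Assumption: parse() with no input flushes the last buffered packet.
    packets.extend(codec.parse())
    return packets

# Usage (requires PyAV and an H.264 file):
#   import av
#   codec = av.CodecContext.create('h264', 'r')
#   with open('/path/to/frame-0001.h264', 'rb') as file_handler:
#       packets = parse_and_flush(codec, [file_handler.read()])
```

Flushing only once, after the last chunk, matters: flushing after every chunk would cut a packet short whenever a chunk boundary falls in the middle of a frame.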