I'm looking for Python libraries that can extract metadata from video files in formats such as MP4, MKV, AVI, WebM, and MPG.
The data I mainly want to extract are the title, description, comment, and captions/subtitles.
I've tried ffmpeg-python, following this guide: https://www.thepythoncode.com/article/extract-media-metadata-in-python
and TinyTag, following this one: https://www.geeksforgeeks.org/access-metadata-of-various-audio-and-video-file-formats-using-python-tinytag-library/
As far as I can tell, ffmpeg-python's probe() function returned the most data, but its output does not contain the title, description, or comment, and closed_captions is simply '0', which I assume is the source track.
Thank you for any help provided.
You can use `ffprobe` to get the metadata.

For the subtitles/closed captions, you need to read the subtitle streams with `ffmpeg`.

Then you can use a library like `webvtt-py` to parse the subtitle data. (I don't have firsthand experience with it, so try it yourself.)

One caveat, though: if your video is a DVD rip, its subtitle streams (`dvd_subtitle`) are bitmaps, not text, and FFmpeg cannot convert them to text.
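
Here is a rough sketch of the two steps above, assuming `ffprobe` and `ffmpeg` are installed and on your PATH. The helper names (`ffprobe_command`, `extract_tags`, `subtitle_streams`, `ffmpeg_subtitle_command`, `probe_file`) are made up for illustration. In ffprobe's JSON output the container-level tags normally sit under `format` → `tags`, but which tags are present and how they are capitalized depends on the file, so treat the lookups as a starting point rather than a guarantee.

```python
import json
import subprocess


def ffprobe_command(path):
    """Build an ffprobe call that prints container and stream info as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-print_format", "json",
        "-show_format", "-show_streams",
        path,
    ]


def extract_tags(probe, wanted=("title", "description", "comment")):
    """Pick the wanted container-level tags from parsed ffprobe output.

    Tag names are compared case-insensitively (e.g. MKV files often use
    upper-case names). Missing tags come back as None.
    """
    tags = probe.get("format", {}).get("tags", {})
    lowered = {key.lower(): value for key, value in tags.items()}
    return {name: lowered.get(name) for name in wanted}


def subtitle_streams(probe):
    """Return (index, codec_name) for every subtitle stream, in file order.

    A codec_name of "dvd_subtitle" means a bitmap stream that cannot be
    converted to text.
    """
    return [
        (stream["index"], stream.get("codec_name"))
        for stream in probe.get("streams", [])
        if stream.get("codec_type") == "subtitle"
    ]


def ffmpeg_subtitle_command(path, sub_number, out_path):
    """Build an ffmpeg call that writes the n-th subtitle stream as WebVTT."""
    return [
        "ffmpeg", "-y", "-i", path,
        "-map", f"0:s:{sub_number}",  # n-th subtitle stream of the first input
        "-f", "webvtt",
        out_path,
    ]


def probe_file(path):
    """Run ffprobe and parse its JSON output (needs ffprobe on PATH)."""
    result = subprocess.run(
        ffprobe_command(path), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)
```

You would call `probe_file("video.mkv")` (a placeholder file name), pass the result to `extract_tags` and `subtitle_streams`, and then run `subprocess.run(ffmpeg_subtitle_command("video.mkv", 0, "subs.vtt"), check=True)` for a text-based stream. The resulting file should then be readable with `webvtt.read("subs.vtt")` from webvtt-py, where each caption has `start`, `end`, and `text` attributes.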