Calculating the FPS shift between two videos


I have two videos. One is a reaction video to a TV series; the other is the episode being watched. The episode is 23.97 fps and the reaction video itself is 30 fps, but the episode as it plays inside the reaction video matches neither. By trying values manually, I can see that it corresponds to somewhere between 25 fps and 26 fps. If I can make the fps of the episode equal to the fps of the copy they watch, all my problems are solved, but I don't know how to calculate this difference in Python. If the reaction video ended at the same time as the episode, I could synchronize them using the durations. Unfortunately I can't, because in some reaction videos they listen to the ending music, in others they don't, and sometimes they even cut it off halfway.

My best approach so far was to calculate the sample-rate shift over the first minute of the two videos by cross-correlating their audio tracks (the reaction video contains the same audio as the episode). But I don't know much about sample rates and audio. I don't even know what exactly I found, so I can't calculate an approximate fps difference from it and add that difference to 23.97.

Can I convert the sample-rate difference I found into an approximate fps difference? Or is there a better approach?
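As a rough way to frame that conversion: if the episode inside the reaction video plays `factor` times faster than the original file, the equivalent frame rate is the original fps multiplied by `factor`. The sketch below assumes that two moments can be located in both files (for example, from lags measured at two different points). The function names and the timestamps in the example are hypothetical, not measurements from the videos.

```python
def speed_factor(ep_t1, ep_t2, re_t1, re_t2):
    """Playback speed of the episode inside the reaction video.

    ep_t1, ep_t2: times (s) of two moments in the original episode.
    re_t1, re_t2: times (s) of the same two moments in the reaction video.
    """
    return (ep_t2 - ep_t1) / (re_t2 - re_t1)


def effective_fps(base_fps, factor):
    """Frame rate the episode would need in order to play `factor` times faster."""
    return base_fps * factor


# Hypothetical example: 600 s of episode take only 576 s in the reaction video.
factor = speed_factor(10.0, 610.0, 25.0, 601.0)
print(factor, effective_fps(24000 / 1001, factor))
```

The same ratio applies to the audio: if the soundtrack in the reaction video behaves like the episode audio resampled by some ratio, that ratio is the speed factor, so no separate conversion between "sample-rate shift" and "fps shift" should be needed.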

I would be very happy if someone can guide me.

The simple cross-correlation I use is as follows:

```python
import numpy as np
import librosa
from scipy.signal import correlate
import noisereduce as nr
import soundfile as sf

def normalize_audio(audio1, audio2):
    # Scale both signals by the same factor so the louder one peaks at 1.0
    max_value = max(np.max(np.abs(audio1)), np.max(np.abs(audio2)))
    normalized_audio1 = audio1 / max_value
    normalized_audio2 = audio2 / max_value
    return normalized_audio1, normalized_audio2

# librosa.load resamples to 22050 Hz mono by default, so both rates print as 22050
reaction_audio, sr_reaction = librosa.load("./Audios/reaction_audio.wav")
episode_audio, sr_episode = librosa.load("./Audios/episode_audio.wav")
reaction_audio, episode_audio = normalize_audio(reaction_audio, episode_audio)
print(sr_reaction, sr_episode)
#reaction_audio = nr.reduce_noise(y=reaction_audio, sr=sr_reaction)
#episode_audio = nr.reduce_noise(y=episode_audio, sr=sr_episode)

#sf.write('./Audios/reaction_audio_fixed.wav', reaction_audio, sr_reaction)
#sf.write('./Audios/episode_audio_fixed.wav', episode_audio, sr_episode)

cross_corr = correlate(reaction_audio, episode_audio, mode='full')

episode_fps = 23.976023976023978
max_corr_index = np.argmax(cross_corr)
# In 'full' mode, index len(episode_audio) - 1 corresponds to zero lag
lag_samples = max_corr_index - (len(episode_audio) - 1)
lag_seconds = lag_samples / sr_reaction
```
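A single cross-correlation peak like the one above gives one time offset and implicitly assumes both recordings play at the same speed. One possible alternative, sketched here under assumptions, is to try a range of candidate speed factors: speed up a coarse loudness envelope of the episode by each factor and keep the factor whose normalized correlation peak against the reaction envelope is strongest. All function names are hypothetical, the 10 ms frame length is an arbitrary choice, only NumPy is used, and the episode clip is assumed to be shorter than the reaction clip. Voices talking over the episode will weaken the peak, so this is a starting point rather than a tested solution.

```python
import numpy as np

def rms_envelope(audio, sr, frame_s=0.01):
    """Coarse loudness envelope: one RMS value per frame_s seconds."""
    hop = int(sr * frame_s)
    n_frames = len(audio) // hop
    frames = audio[: n_frames * hop].reshape(n_frames, hop)
    return np.sqrt(np.mean(frames ** 2, axis=1))

def speed_up(signal, factor):
    """Resample signal so it plays `factor` times faster (linear interpolation)."""
    n_out = int(len(signal) / factor)
    src_pos = np.arange(n_out) * factor
    return np.interp(src_pos, np.arange(len(signal)), signal)

def peak_correlation(long_sig, short_sig):
    """Best normalized correlation of short_sig at any position fully inside long_sig."""
    a = long_sig - long_sig.mean()
    b = short_sig - short_sig.mean()
    m = len(b)
    nfft = 1 << (len(a) + m - 1).bit_length()
    # FFT-based cross-correlation; keep only lags where b lies entirely inside a
    corr = np.fft.irfft(np.fft.rfft(a, nfft) * np.conj(np.fft.rfft(b, nfft)), nfft)
    corr = corr[: len(a) - m + 1]
    # Energy of every length-m window of a, via a cumulative sum
    csum = np.concatenate(([0.0], np.cumsum(a ** 2)))
    win_energy = csum[m:] - csum[:-m]
    score = corr / (np.sqrt(win_energy * np.sum(b ** 2)) + 1e-9)
    k = int(np.argmax(score))
    return float(score[k]), k

def estimate_speed(reaction_env, episode_env, factors):
    """Return (factor, offset_frames, score) for the best-matching candidate factor."""
    best = (None, None, -np.inf)
    for f in factors:
        score, offset = peak_correlation(reaction_env, speed_up(episode_env, f))
        if score > best[2]:
            best = (float(f), offset, score)
    return best
```

With envelopes built by `rms_envelope` from the two audio arrays, `estimate_speed(reaction_env, episode_env, np.linspace(1.0, 1.1, 101))` would return a candidate factor, and `episode_fps * factor` would then be the frame rate to try. The offset is in envelope frames, so multiplying it by the frame length gives seconds.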