SRT Subtitle Processing Using Autosub & Python: Trying to remove blank subtitle entries


I am trying to make a bot that uploads short-form YouTube content: it takes a clip of Minecraft stock footage and layers a Reddit story over it. I am using the Reddit API (through praw) to grab the story, moviepy to clip the footage and stitch everything together (including the subtitles), and gTTS for the voiceover. The subtitles are generated from the voiceover with autosub. My problem is that whenever I run my code, I get the following error on line 43 (where I build the SubtitlesClip from the generated SRT file):

Exception has occurred: TypeError
cannot unpack non-iterable NoneType object
  File "C:\Users\tript\OneDrive\Documents\Coding Projects\Auto_YT_Shorts_Gen\Untitled-1.py", line 43, in <module>
    sub = SubtitlesClip("voiceover.en.srt", generator_output) # IMPROVEMENT: SORT THRU SRT LINES AND CHECK FOR BLANKS AND REMOVE THEM
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: cannot unpack non-iterable NoneType object

What confuses me is that I have followed the docs closely and it still produces an error. Here is my code:

from moviepy.editor import *
from gtts import gTTS
from moviepy.video.tools.subtitles import SubtitlesClip
from moviepy.video.io.VideoFileClip import VideoFileClip
from mutagen.mp3 import MP3
import random, os, praw, requests

r = praw.Reddit('bot1')

subs = ['AmItheAsshole', 'confession', 'AskReddit']

def getText():
    sub = r.subreddit(random.choice(subs))
    posts = sub.hot(limit=100)
    ran_post_num = random.randint(0, 99)  # hot(limit=100) yields at most 100 posts, indexed 0..99
    for i, post in enumerate(posts):
        if i == ran_post_num:
            return post.selftext

def getRanVidStartEndPoints(total_duration, segment_duration):
    random_start = random.uniform(0, total_duration - segment_duration)
    random_end = random_start + segment_duration
    return random_start, random_end

extracted_text = getText()
print(extracted_text)

# Retry until the post has body text; link posts come back with an empty selftext,
# and the "a" check is a rough filter for posts with no usable words.
while not extracted_text or "a" not in extracted_text:
    extracted_text = getText()
    print(extracted_text)

tts = gTTS(extracted_text, lang="en")
tts.save("voiceover.mp3")

command = r'C:\Users\tript\Downloads\autosub-0.5.7-alpha-win-x64-pyinstaller\autosub_pyinstaller\autosub.exe -i "c:\Users\tript\OneDrive\Documents\Coding Projects\Auto_YT_Shorts_Gen\voiceover.mp3" -o "c:\Users\tript\OneDrive\Documents\Coding Projects\Auto_YT_Shorts_Gen" -S en -y'
os.system(command)

generator_output = lambda txt: TextClip(txt, font='Arial', fontsize=72, color='white', method="subtitle", size=my_clip.size-100)
print(generator_output)

sub = SubtitlesClip("voiceover.en.srt", generator_output) # IMPROVEMENT: SORT THRU SRT LINES AND CHECK FOR BLANKS AND REMOVE THEM
my_clip = VideoFileClip("sample.mp4")
voiceover_audio = AudioFileClip("voiceover.mp3")

audio = MP3("voiceover.mp3")
total_duration = my_clip.duration
segment_duration = audio.info.length

random_start, random_end = getRanVidStartEndPoints(total_duration, segment_duration)

video = CompositeVideoClip([my_clip.subclip(random_start, random_end),
                            sub.set_position(('center', 'center')).set_duration(random_end - random_start)])
video = video.set_audio(voiceover_audio)

video.write_videofile("sample_test.mp4", audio_codec='aac', codec='libx264', fps=my_clip.fps)

I am a beginner in Python and unsure how to solve this, so I would appreciate any help. I do know that if I manually remove the empty subtitle entries from the SRT file, the error goes away and the rest of my code works fine. My initial approach was to find an empty text line and delete the timestamp and subtitle number above it, but every attempt at this has failed. Here is the SRT subtitle file:

1
00:00:00,070 --> 00:00:06,370
I'm working a temporary job over Christmas at some children's playroom event and were forced to

2
00:00:07,000 --> 00:00:09,640
Hand out these leaflets to customers

3
00:00:09,880 --> 00:00:15,300
None of them ever wanna take them so we don't give them out often but our management said they

4
00:00:15,890 --> 00:00:19,550
Woodcutter pay if we had leftover leaflets each week

5
00:00:19,810 --> 00:00:25,840


6
00:00:26,420 --> 00:00:27,700
Meet the quota

You can see that subtitle number 5 is what I call an "empty subtitle entry": it has a number and a timestamp but no text. Removing these entries stops the error; I just don't know how to automate that with Python. As I said before, I am fairly new to Python, so any help would be appreciated.
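For reference, the cleanup described above (drop every entry that has a timestamp but no text, then renumber the rest) could be sketched roughly as follows. This is only an illustration under some assumptions: the names TIMING, remove_empty_entries and clean_srt_file are made up for this example, the SRT file is assumed to be UTF-8, and entries are assumed to be separated by blank lines as in the file shown above.

```python
import re

# Matches an SRT timing line such as "00:00:19,810 --> 00:00:25,840".
TIMING = re.compile(r"\d+:\d+:\d+[,.]\d+\s*-->\s*\d+:\d+:\d+[,.]\d+")


def remove_empty_entries(srt_text):
    """Return srt_text without text-less entries, renumbered from 1."""
    entries = []    # (timing_line, text_lines) pairs, in file order
    current = None  # text_lines of the entry being read, or None between entries
    for line in srt_text.splitlines():
        stripped = line.strip()
        if TIMING.fullmatch(stripped):
            current = []
            entries.append((stripped, current))
        elif not stripped:
            current = None  # a blank line ends the current entry
        elif current is not None:
            current.append(stripped)
        # Anything else (the old index numbers) is dropped; new numbers are assigned below.

    kept = [(timing, text) for timing, text in entries if text]
    blocks = [
        "\n".join([str(number), timing, *text])
        for number, (timing, text) in enumerate(kept, start=1)
    ]
    # Exactly one blank line between entries and one after the last entry.
    return "\n\n".join(blocks) + "\n\n" if blocks else ""


def clean_srt_file(path, encoding="utf-8"):
    """Rewrite the SRT file at path in place without its empty entries (encoding is assumed)."""
    with open(path, encoding=encoding) as f:
        cleaned = remove_empty_entries(f.read())
    with open(path, "w", encoding=encoding) as f:
        f.write(cleaned)
```

In the script above, this would presumably be called as clean_srt_file("voiceover.en.srt") after the autosub command has finished and before the SubtitlesClip is created. Renumbering is a design choice rather than a requirement: it keeps the entry numbers consecutive once entry 5 is gone.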
