How to process .ogg audio in segments from within a .py script run through lxterminal on Debian?


My objective is to create a silence file from any given .ogg audio file, selecting only the silent sections. I have readied a rudimentary script to accomplish it. The problem is that the pydub approach (from pydub import AudioSegment) appears, as far as my very limited programming knowledge can tell, to pull the entire .ogg file into RAM, bringing the system to a standstill.

The problem doesn't become apparent for small audio files, but becomes glaring for large files, e.g., podcasts of around 1 hour or more.
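As a rough illustration of why long files hurt, the size of fully decoded audio can be estimated from its duration. This is only a back-of-the-envelope sketch: the 44.1 kHz, stereo, 16-bit figures are assumed typical values, not measured from any particular file, and decoded_size_bytes is a hypothetical helper name.

```python
def decoded_size_bytes(seconds, rate=44100, channels=2, sample_width=2):
    """Size of uncompressed PCM audio (assumed: 44.1 kHz, stereo, 16-bit)."""
    return seconds * rate * channels * sample_width


one_hour = decoded_size_bytes(60 * 60)
print(f"One hour of decoded audio: about {one_hour / 1e6:.0f} MB")
```

Any intermediate copies made while slicing and concatenating would come on top of that figure.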

Would I be better off using from oggvideotools import OggFile rather than pydub? Is this at all possible? My system does not accept that import line.

My system is installed from the ISO of Official Debian GNU/Linux Live 11.6.0 lxde 2022-12-17T11:46

# apt list --installed | grep -i '^python'

python-apt-common/oldstable,now 2.2.1 all [installed,automatic]

python-matplotlib-data/oldstable,now 3.3.4-1 all [installed,automatic]

python3-apt/oldstable,now 2.2.1 amd64 [installed,automatic]

python3-attr/oldstable,now 20.3.0-1 all [installed,automatic]

python3-automat/oldstable,now 20.2.0-1 all [installed,automatic]

python3-bcrypt/oldstable,now 3.1.7-4 amd64 [installed,automatic]

python3-brlapi/oldstable,now 6.3+dfsg-1+deb11u1 amd64 [installed,automatic]

python3-bs4/oldstable,now 4.9.3-1 all [installed,automatic]

python3-cairo/oldstable,now 1.16.2-4+b2 amd64 [installed,automatic]

python3-certifi/oldstable,now 2020.6.20-1 all [installed,automatic]

python3-cffi-backend/oldstable,now 1.14.5-1 amd64 [installed,automatic]

python3-chardet/oldstable,now 4.0.0-1 all [installed,automatic]

python3-click/oldstable,now 7.1.2-1 all [installed,automatic]

python3-colorama/oldstable,now 0.4.4-1 all [installed,automatic]

python3-constantly/oldstable,now 15.1.0-2 all [installed,automatic]

python3-cryptography/oldstable,now 3.3.2-1 amd64 [installed,automatic]

python3-cups/oldstable,now 2.0.1-4+b1 amd64 [installed,automatic]

python3-cupshelpers/oldstable,now 1.5.14-1 all [installed,automatic]

python3-cycler/oldstable,now 0.10.0-3 all [installed,automatic]

python3-dateutil/oldstable,now 2.8.1-6 all [installed,automatic]

python3-dbus/oldstable,now 1.2.16-5 amd64 [installed,automatic]

python3-debian/oldstable,now 0.1.39 all [installed,automatic]

python3-debianbts/oldstable,now 3.1.0 all [installed,automatic]

python3-distro/oldstable,now 1.5.0-1 all [installed,automatic]

python3-geoip/oldstable,now 1.3.2-3+b3 amd64 [installed,automatic]

python3-gi-cairo/oldstable,now 3.38.0-2 amd64 [installed,automatic]

python3-gi/oldstable,now 3.38.0-2 amd64 [installed,automatic]

python3-hamcrest/oldstable,now 1.9.0-3 all [installed,automatic]

python3-html5lib/oldstable,now 1.1-3 all [installed,automatic]

python3-httplib2/oldstable,now 0.18.1-3 all [installed,automatic]

python3-hyperlink/oldstable,now 19.0.0-2 all [installed,automatic]

python3-ibus-1.0/oldstable,now 1.5.23-2 all [installed,automatic]

python3-idna/oldstable,now 2.10-1 all [installed,automatic]

python3-incremental/oldstable,now 17.5.0-1 all [installed,automatic]

python3-kiwisolver/oldstable,now 1.3.1-1+b1 amd64 [installed,automatic]

python3-ldb/oldstable,oldstable-security,now 2:2.2.3-2~deb11u2 amd64 [installed,automatic]

python3-libtorrent/oldstable,now 1.2.9-0.3 amd64 [installed,automatic]

python3-libvoikko/oldstable,now 4.3-1 all [installed,automatic]

python3-louis/oldstable,now 3.16.0-1 all [installed,automatic]

python3-lxml/oldstable,oldstable-security,now 4.6.3+dfsg-0.1+deb11u1 amd64 [installed,automatic]

python3-mako/oldstable,now 1.1.3+ds1-2 all [installed,automatic]

python3-markupsafe/oldstable,now 1.1.1-1+b3 amd64 [installed,automatic]

python3-matplotlib/oldstable,now 3.3.4-1 amd64 [installed]

python3-minimal/oldstable,now 3.9.2-3 amd64 [installed,automatic]

python3-numpy/oldstable,now 1:1.19.5-1 amd64 [installed,automatic]

python3-olefile/oldstable,now 0.46-3 all [installed,automatic]

python3-openssl/oldstable,now 20.0.1-1 all [installed,automatic]

python3-pil/oldstable,oldstable-security,now 8.1.2+dfsg-0.3+deb11u1 amd64 [installed,automatic]

python3-pkg-resources/oldstable,now 52.0.0-4 all [installed,automatic]

python3-pyasn1-modules/oldstable,now 0.2.1-1 all [installed,automatic]

python3-pyasn1/oldstable,now 0.4.8-1 all [installed,automatic]

python3-pyatspi/oldstable,now 2.38.1-1 all [installed,automatic]

python3-pycurl/oldstable,now 7.43.0.6-5 amd64 [installed,automatic]

python3-pydub/oldstable,now 0.24.1-1 all [installed]

python3-pygame/oldstable,now 1.9.6+dfsg-4+b1 amd64 [installed,automatic]

python3-pyparsing/oldstable,now 2.4.7-1 all [installed,automatic]

python3-pysimplesoap/oldstable,now 1.16.2-3 all [installed,automatic]

python3-pyxattr/oldstable,now 0.7.2-1+b1 amd64 [installed,automatic]

python3-rencode/oldstable,now 1.0.6-1+b3 amd64 [installed,automatic]

python3-reportbug/oldstable,now 7.10.3+deb11u1 all [installed,automatic]

python3-requests/oldstable,now 2.25.1+dfsg-2 all [installed,automatic]

python3-scour/oldstable,now 0.38.2-1 all [installed,automatic]

python3-service-identity/oldstable,now 18.1.0-6 all [installed,automatic]

python3-setproctitle/oldstable,now 1.2.1-1+b1 amd64 [installed,automatic]

python3-six/oldstable,now 1.16.0-2 all [installed,automatic]

python3-smbc/oldstable,now 1.0.23-1+b1 amd64 [installed,automatic]

python3-soupsieve/oldstable,now 2.2.1-1 all [installed,automatic]

python3-speechd/oldstable,now 0.10.2-2+deb11u2 all [installed,automatic]

python3-talloc/oldstable,now 2.3.1-2+b1 amd64 [installed,automatic]

python3-tk/oldstable,now 3.9.2-1 amd64 [installed,automatic]

python3-twisted-bin/oldstable,now 20.3.0-7+deb11u1 amd64 [installed,automatic]

python3-twisted/oldstable,now 20.3.0-7+deb11u1 all [installed,automatic]

python3-uno/oldstable,oldstable-security,now 1:7.0.4-4+deb11u7 amd64 [installed,automatic]

python3-urllib3/oldstable,now 1.26.5-1~exp1 all [installed,automatic]

python3-webencodings/oldstable,now 0.5.1-2 all [installed,automatic]

python3-xdg/oldstable,now 0.27-2 all [installed,automatic]

python3-zope.interface/oldstable,now 5.2.0-1 amd64 [installed,automatic]

python3.9-minimal/oldstable,now 3.9.2-1 amd64 [installed,automatic]

python3.9/oldstable,now 3.9.2-1 amd64 [installed,automatic]

python3/oldstable,now 3.9.2-3 amd64 [installed,automatic]

I am an amateur user, not in a computer-related profession, so I am not competent to write efficient programs. I would like this program to run even with as little as 10 MB of free RAM.

In the abstract, I can visualise reading the .ogg file chunk by chunk, bringing 5 ms of data at a time into RAM, and checking each chunk against a pre-determined silence threshold, say -65 dB. If the chunk is 'silent', append it to the file "output_silence.wav" on the HDD; otherwise, drop the chunk. In reality, this isn't working the way I intended with an .ogg audio file of 50+ MB. The "output_silence.wav" file is extracted correctly, but only after putting the system under extreme stress and bringing it to a near standstill.

Where is my mistake?

How could this problem be overcome? Could oggvideotools help me accomplish this objective better than pydub?

Could the code be run, the problem be identified and an explicit solution be given, please?

I wrote a simple script to extract the silent portions (background noise only) from an audio file. The necessary values are entered by hand within the script file itself: (1) the input file name, (2) the pre-determined silence level in dB, and (3) the name of the new output file.

I wanted a customised code/script that could be run on an audio file.

I want to give the name of the master file, which is to remain untouched and be accessed in read-only mode. The code should open the named file and read it from the beginning in small chunks of 4096 bytes or 500 milliseconds to limit memory use. Working from the start of the file, it should select each segment whose level is below the silence level in dB that I set in the script, copy that segment, and append it to a temporary file. This select, copy and append cycle should continue section by section, in order of increasing time, to the very end of the input file. Each copied silence segment must be appended after the previous one, in sequence, with no later segment overwriting an earlier one.

The final file on the hard disk is to be named "output_silence.wav".

from pydub import AudioSegment

def extract_silence(input_file, output_file, silence_threshold):
    sound = AudioSegment.from_file(input_file)

    chunk_size = 500  # Size of chunks to be processed (in milliseconds)
    position = 0

    silent_audio = AudioSegment.empty()

    while position < len(sound):
        chunk = sound[position : position + chunk_size]

        if chunk.dBFS < silence_threshold:
            silent_audio += chunk

        position += chunk_size

    silent_audio.export(output_file, format="wav")

# Replace these variables with your desired input and output file names
# input_file = "file_name.ogg"
input_file = "input_file.ogg"
output_file = "output_silence.wav"
silence_threshold = -65  # Adjust the silence threshold as needed

extract_silence(input_file, output_file, silence_threshold)
print("Silent parts extraction complete.")

It works correctly, but it slows the system down to a near halt and fills up the RAM: about 2 GB for a 50 MB .ogg file. The entire 50+ MB file is converted into an uncompressed format in RAM and then processed, the temporary silence file is also built in RAM, and only at the end is the output silence file written to the HDD. I don't want entire files to be uncompressed, processed or created in RAM. I would like to use RAM minimally and use the HDD wherever possible.
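One possible way to keep memory use low is sketched below under stated assumptions, not as a tested fix. The idea is to let the ffmpeg command-line tool (which pydub normally relies on to decode non-WAV formats) decode the .ogg to raw 16-bit PCM on a pipe, read that pipe one chunk at a time, and append only the quiet chunks to the WAV file on disk. The function names chunk_dbfs, copy_silent_chunks and extract_silence are hypothetical, and the 44100 Hz stereo output is an assumption: ffmpeg is asked to convert to that format, which may resample the source.

```python
import array
import math
import subprocess
import sys
import wave


def chunk_dbfs(pcm_bytes):
    """RMS level of 16-bit little-endian PCM in dBFS (-inf for pure zeros)."""
    samples = array.array("h")
    samples.frombytes(pcm_bytes)
    if sys.byteorder == "big":
        samples.byteswap()  # the stream is little-endian
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms / 32768) if rms > 0 else float("-inf")


def copy_silent_chunks(pcm_in, wav_out, threshold_db, rate, channels, chunk_ms=500):
    """Read raw PCM chunk by chunk; append chunks below the threshold to wav_out.

    Only one chunk is held in memory at a time. Returns the number kept.
    """
    frame_bytes = 2 * channels  # 16-bit samples, one per channel
    chunk_bytes = rate * chunk_ms // 1000 * frame_bytes
    kept = 0
    with wave.open(wav_out, "wb") as out:
        out.setnchannels(channels)
        out.setsampwidth(2)
        out.setframerate(rate)
        while True:
            chunk = pcm_in.read(chunk_bytes)
            # Drop a trailing partial frame, if any, at end of stream.
            chunk = chunk[: len(chunk) - len(chunk) % frame_bytes]
            if not chunk:
                break
            if chunk_dbfs(chunk) < threshold_db:
                out.writeframes(chunk)  # written straight to disk
                kept += 1
    return kept


def extract_silence(input_file, output_file, threshold_db, rate=44100, channels=2):
    """Assumes ffmpeg is installed; rate and channels are assumed values."""
    cmd = ["ffmpeg", "-nostdin", "-v", "error", "-i", input_file, "-vn",
           "-f", "s16le", "-acodec", "pcm_s16le",
           "-ar", str(rate), "-ac", str(channels), "-"]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        return copy_silent_chunks(proc.stdout, output_file, threshold_db,
                                  rate, channels)


# Usage with the file names from the script above (requires ffmpeg):
# extract_silence("input_file.ogg", "output_silence.wav", -65)
```

With the assumed format, a 500 ms chunk is 88,200 bytes, so the Python process holds well under 1 MB of audio at any moment, and the wave module patches the final length into the header when the file is closed. The level calculation is intended to mirror the RMS-based dBFS idea used in the script above, but it is written out by hand here and may differ slightly from pydub's own value.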
