I wanted to implement a pause/resume feature using sounddevice. I found out that this library doesn't have a dedicated method for this, unlike other libraries such as PyGame. For my specific case PyGame didn't work, since I was reading the audio data from a binary .atm file, which is just a made-up extension our professor forced us to use. The approach still applies to ordinary audio files, though.
So, how can you achieve this? First, I use scipy.io.wavfile to read the sample rate and the data, like this:
from scipy.io import wavfile

samplerate, data = wavfile.read('audio.wav')
With sounddevice you can use sd.play(data, samplerate) to play the audio. To pause and resume, you can use this code:
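import time
import sounddevice as sd

# Initial state (assumed; the original snippet left this out): nothing is playing yet.
playback_state = "paused"
current_position = 0      # index into data where playback resumes
start_timestamp = 0.0     # time.time() when the current playback segment started
accumulated_time = 0.0    # total seconds actually played so far

def toggle_playback():
    """Toggles audio playback between playing and paused states."""
    global playback_state, current_position, start_timestamp, accumulated_time
    if playback_state == "playing":
        sd.stop()
        end_timestamp = time.time()
        # Add the length of the segment that just played to the running total.
        accumulated_time += end_timestamp - start_timestamp
        # Convert seconds played into a sample index.
        current_position = int(samplerate * accumulated_time)
        playback_state = "paused"
        play_pause_button.config(text="Resume")
    else:  # Audio is paused
        # Play only the part of the array that has not been heard yet.
        sd.play(data[current_position:], samplerate)
        start_timestamp = time.time()
        playback_state = "playing"
        play_pause_button.config(text="Pause")

# ... (Rest of your code)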
In my case I'm using tkinter with a simple interface, and I have a button that toggles between resume and pause, but you can split the if statement into two functions and call them as needed.
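If you prefer separate pause and resume functions, here is a minimal sketch of how that split could look. The class name PausablePlayer and its play, stop and clock parameters are my own invention for illustration, not part of sounddevice; the idea is that you pass in sd.play and sd.stop, which also makes the logic testable without an audio device. It uses time.monotonic instead of time.time, and it assumes the same time-based position estimate as the code above.

```python
import time


class PausablePlayer:
    """Pause/resume wrapper around a play/stop pair such as sd.play and sd.stop."""

    def __init__(self, data, samplerate, play, stop, clock=time.monotonic):
        self.data = data
        self.samplerate = samplerate
        self._play = play          # assumed to be called as play(data, samplerate)
        self._stop = stop          # assumed to be called with no arguments
        self._clock = clock        # injectable so the logic can run without audio
        self.position = 0          # index of the next sample to play
        self._started_at = None    # None while paused

    @property
    def playing(self):
        return self._started_at is not None

    def resume(self):
        """Start playing from the stored position (does nothing if already playing)."""
        if self.playing:
            return
        self._play(self.data[self.position:], self.samplerate)
        self._started_at = self._clock()

    def pause(self):
        """Stop playback and advance the position by the estimated samples played."""
        if not self.playing:
            return
        self._stop()
        elapsed = self._clock() - self._started_at
        played = int(self.samplerate * elapsed)
        # Clamp so the position never runs past the end of the array.
        self.position = min(len(self.data), self.position + played)
        self._started_at = None
```

With sounddevice this would be used as player = PausablePlayer(data, samplerate, play=sd.play, stop=sd.stop), and then player.resume() and player.pause() can be wired to two separate buttons.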
Basically, what I'm doing is tracking the time when the audio starts playing and when it stops. I subtract the two to get the elapsed time, add it to the time already played, and then calculate the index (current_position) into the data array by multiplying that total by samplerate, which is the number of samples per second of the audio file. This array is probably huge; in my case a 16-second audio has a shape of (705600, 2), which works out to 44100 samples per second. Then, when hitting resume, I just pass the array from current_position onward.
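To make the arithmetic concrete, here is a small worked example using the numbers from my file. The helper resume_index is a hypothetical name I made up for this illustration; the clamping to the array length is an extra safety step that my original code doesn't have.

```python
def resume_index(samplerate, elapsed_seconds, total_samples):
    """Return the sample index to resume from, clamped to the end of the array."""
    index = int(samplerate * elapsed_seconds)
    return min(index, total_samples)


# 705600 samples over 16 seconds implies 44100 samples per second.
samplerate = 705600 // 16
print(samplerate)                               # 44100

# Pausing after 2.5 seconds of playback lands on sample 110250.
print(resume_index(samplerate, 2.5, 705600))    # 110250

# An elapsed time longer than the audio is clamped to the array length.
print(resume_index(samplerate, 20.0, 705600))   # 705600
```

The slice data[110250:] is then what gets passed to sd.play on resume. Keep in mind this is only an estimate based on wall-clock time, so it can be off by a small amount depending on output latency.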
Hope this can help someone!
I tried other libraries, but only found PyGame, and I couldn't get it to accept the two-dimensional array containing the audio data.