How can I remove distortion introduced by librosa griffin lim?


I'm doing:

import librosa
import scipy.signal
import soundfile as sf

# samples, sample_rate, nperseg and overlap are defined earlier
D = librosa.stft(samples, n_fft=nperseg,
                 hop_length=overlap, win_length=nperseg,
                 window=scipy.signal.windows.hamming)

spect, _ = librosa.magphase(D)

audio_signal = librosa.griffinlim(spect, n_iter=1024,
                                  win_length=nperseg, hop_length=overlap,
                                  window=scipy.signal.windows.hamming)
print(audio_signal, audio_signal.shape)
sf.write('test.wav', audio_signal, sample_rate)

This introduces noticeable distortion in the reconstructed audio signal. What can I do to reduce it?

There are 2 best solutions below

Use a window function that is symmetric about its midpoint, i.e. a zero-phase window. In this case, you can use the Hann window, a raised-cosine window that tapers to zero at its endpoints (the Hamming window, by contrast, has non-zero endpoints of about 0.08).

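import librosa
import scipy.signal
import soundfile as sf

# hop_length is the step between frames in samples, not the overlap
D = librosa.stft(samples, n_fft=nperseg,
                 hop_length=overlap, win_length=nperseg,
                 window=scipy.signal.windows.hann)

# keep the magnitude only; Griffin-Lim re-estimates the phase
spect, _ = librosa.magphase(D)

audio_signal = librosa.griffinlim(spect, n_iter=1024,
                                  win_length=nperseg, hop_length=overlap,
                                  window=scipy.signal.windows.hann)
print(audio_signal, audio_signal.shape)
sf.write('test.wav', audio_signal, sample_rate)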
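
To see the endpoint difference between the two windows concretely, here is a minimal standard-library sketch. The helper names (cosine_window, hann, hamming, overlap_add_ripple) are my own and are not librosa or SciPy functions, and the window length of 16 with a hop of a quarter window is just an assumption for illustration. It also checks how evenly hop-shifted copies of a window add up, both for the symmetric form that scipy.signal.windows.hann returns by default and for the periodic form. As far as I know, librosa builds the periodic form when the window is given by name (window='hann'), so that may be worth trying as well.

```python
import math


def cosine_window(n, a0, periodic=False):
    """Generalized cosine window: a0 - (1 - a0) * cos(2*pi*k / denom)."""
    # Symmetric windows divide by n - 1, periodic ones by n.
    denom = n if periodic else n - 1
    return [a0 - (1.0 - a0) * math.cos(2.0 * math.pi * k / denom)
            for k in range(n)]


def hann(n, periodic=False):
    return cosine_window(n, 0.5, periodic)


def hamming(n, periodic=False):
    return cosine_window(n, 0.54, periodic)


def overlap_add_ripple(window, hop):
    """Spread (max - min) of the summed hop-shifted windows in steady state."""
    n = len(window)
    # At offset r within a hop, the overlapping frames contribute
    # window[r], window[r + hop], window[r + 2 * hop], ...
    sums = [sum(window[i] for i in range(r, n, hop)) for r in range(hop)]
    return max(sums) - min(sums)


if __name__ == "__main__":
    n, hop = 16, 4  # illustrative values only
    print("Hann endpoint:   ", hann(n)[0])
    print("Hamming endpoint:", hamming(n)[0])
    print("ripple, symmetric Hann:", overlap_add_ripple(hann(n), hop))
    print("ripple, periodic Hann: ",
          overlap_add_ripple(hann(n, periodic=True), hop))
```

This only isolates the behaviour of the window. Griffin-Lim still has to estimate the phase, so some reconstruction error can remain whichever window is used.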

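Griffin-Lim only estimates the phase that was discarded, so some distortion is inherent to it. For higher-quality reconstruction, you should use a neural-network-based vocoder such as WaveNet.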