Generate good-quality MFCCs for an audio signal of 0.01 seconds


I'm using 16000 Hz mono WAV files for my CNN project. Here is the code I use to generate the MFCCs:

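import librosa
import numpy as np

# Load at the file's native sampling rate (16000 Hz for these files)
signal, sr = librosa.load('test.wav', sr=None)
# MFCCs of the first 160 samples (0.01 s), averaged over the time frames
mfccs = np.mean(librosa.feature.mfcc(y=signal[0:160], sr=sr, n_fft=160,
                                     n_mfcc=20, n_mels=50).T, axis=0)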

The problem is that when I use these MFCCs with my TensorFlow model for testing and prediction, all the predictions are almost equal to 0.99. With a longer input signal, e.g. y=signal[0:1600] (0.1 seconds), the predictions are much better.
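One factor that may contribute, offered as a guess rather than a diagnosis: the call above does not set hop_length, so librosa falls back to its default of 512 samples. Assuming the default centered STFT and an even n_fft, the number of frames is 1 + len(y) // hop_length, which the small sketch below works out for both input lengths. The helper name n_frames is made up for this illustration and is not part of librosa.

```python
def n_frames(n_samples: int, hop_length: int = 512) -> int:
    """Frame count of a centered STFT with an even n_fft (assumed setup)."""
    return 1 + n_samples // hop_length


# Compare the two input lengths from the question at the default hop length.
for n in (160, 1600):
    print(n, "samples ->", n_frames(n), "frame(s)")
# 160 samples -> 1 frame(s)
# 1600 samples -> 4 frame(s)
```

Under that assumption, np.mean over the time axis of the 160-sample input averages a single frame, so the resulting vector is just that one frame's coefficients. In addition, n_fft=160 at 16000 Hz gives FFT bins spaced 100 Hz apart, which is coarse for 50 mel bands.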

How do I generate good-quality MFCCs for an input signal of 0.01 seconds, i.e. signal[0:160]?

Is it possible to write a function that takes three parameters (y, sr, and numpoints) and generates the best possible MFCCs by calculating the remaining parameters automatically?

def mfcc(y: np.ndarray, sr: int, numpoints: int):
    # calculate the remaining parameters here
    return librosa.feature.mfcc(...)
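A rough sketch of what such a function could look like is shown here. There is no single "best" setting, so every rule in it is an assumed heuristic rather than a recommendation from librosa: the window spans the whole excerpt, the hop is a quarter of the window so that a short excerpt still yields several frames, and the number of mel bands is capped relative to n_fft. The names choose_mfcc_params and mfcc_auto are hypothetical. The keyword arguments passed to librosa.feature.mfcc (n_fft, hop_length, n_mels, n_mfcc) are existing librosa parameters.

```python
def choose_mfcc_params(numpoints: int) -> dict:
    """Derive MFCC parameters from the excerpt length (assumed heuristics)."""
    if numpoints < 4:
        raise ValueError("numpoints is too small to analyse")
    n_fft = numpoints                          # one window covers the excerpt
    hop_length = max(1, numpoints // 4)        # 75% overlap between frames
    n_mels = max(8, min(40, n_fft // 8))       # fewer bands for short windows
    n_mfcc = min(20, n_mels)                   # cannot exceed the band count
    return {"n_fft": n_fft, "hop_length": hop_length,
            "n_mels": n_mels, "n_mfcc": n_mfcc}


def mfcc_auto(y, sr: int, numpoints: int):
    """Return an (n_frames, n_mfcc) array for the first numpoints samples."""
    import librosa  # imported here so the helper above has no dependencies

    params = choose_mfcc_params(numpoints)
    # Keep the time axis instead of averaging it away; a CNN can use it.
    return librosa.feature.mfcc(y=y[:numpoints], sr=sr, **params).T


print(choose_mfcc_params(160))
```

For numpoints=160 this selects n_fft=160, hop_length=40, n_mels=20 and n_mfcc=20. A call such as mfcc_auto(signal, sr, 160) would then return one row per frame. Whether these values improve the predictions would have to be checked against the model.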
