GMM and MFCC for language identification


I am new to machine learning. I am trying to implement an audio language detection system based on the MFCCs, deltas, delta-deltas, and Mel spectrum coefficients of an audio file. These features are extracted using librosa, which returns a 2D matrix of MFCCs for each file. The problem is that I want to train a Gaussian Mixture Model on them. scikit-learn expects input of shape (n_samples, n_features), but after collecting the output of librosa.feature.mfcc() for all files I have a 3D array of shape (n_samples, n_mfcc, n_time). How can I provide a 3D input to a GMM?

Also, is there a way to feed all four of the features mentioned above into the model together?
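To make the shape mismatch concrete, here is a minimal sketch of one common way to reconcile it, not a definitive solution: treat every time frame as one sample and stack the four feature types along the coefficient axis. The helper name `frames_to_samples` and the sizes (13 MFCCs, 40 Mel bands, 120 and 95 frames) are assumptions made for illustration, and random arrays stand in for real librosa output such as `librosa.feature.mfcc`, `librosa.feature.delta`, and `librosa.feature.melspectrogram`.

```python
import numpy as np


def frames_to_samples(feature_blocks):
    """Stack per-file features into one (n_frames_total, n_features) array.

    feature_blocks: list with one entry per audio file; each entry is a list
    of 2D arrays shaped (n_coeffs, n_time) that share the same n_time.
    (Hypothetical helper, not part of librosa or scikit-learn.)
    """
    per_file = []
    for blocks in feature_blocks:
        stacked = np.vstack(blocks)   # (total coefficients, n_time)
        per_file.append(stacked.T)    # (n_time, n_features): one row per frame
    return np.vstack(per_file)        # frames from all files, concatenated


# Synthetic stand-ins for librosa output (assumed sizes, random values).
rng = np.random.default_rng(0)
n_mfcc, n_mels = 13, 40
files = []
for n_time in (120, 95):              # two files of different length
    mfcc = rng.normal(size=(n_mfcc, n_time))
    delta = rng.normal(size=(n_mfcc, n_time))
    delta2 = rng.normal(size=(n_mfcc, n_time))
    mel = rng.normal(size=(n_mels, n_time))
    files.append([mfcc, delta, delta2, mel])

X = frames_to_samples(files)
print(X.shape)  # (215, 79): 120 + 95 frames, 3 * 13 + 40 features
```

An array shaped like `X` matches the (n_samples, n_features) layout that `sklearn.mixture.GaussianMixture.fit` accepts. A frequently used setup, assumed here rather than taken from the question, is to fit one GMM per language on that language's frames and then assign a test file to the language whose model gives the highest `score` on the file's frames.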
