Sidekit. What is feature_list in sidekit.EM_split()?


I am new to using SIDEKIT for speaker recognition, and I have run into a problem with the feature_list argument. I pass feature_list=ubm_list, but what should the list contain? The documentation says it is a list of feature files to train a GMM with. But what is supposed to be in the feature files?

ubm=sk.Mixture()
ubm_list="/home/david/Documents/development_set/anthonyschaller-20071221-/list"
ubm.EM_split(features_server=server,feature_list=ubm_list,
            distrib_nb=1024,iterations=(1,2,2,4,4,4,4,8,8,8,8,8,8),
            num_thread=10,llk_gain=0.01,save_partial=False,ceil_cov=10,
            floor_cov=1e-2)

There is 1 solution below


Yes, you are right: ubm_list is a list of feature files, which probably have the .h5 extension. So your ubm_list should be:

import os

feat_dir = "/home/david/Documents/development_set/anthonyschaller-20071221-/list"
ubm_list = os.listdir(feat_dir)
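
Note that os.listdir returns every entry in the directory, extension included. If the features server resolves each entry through a filename template such as "{}.h5" (an assumption; check how `server` was configured), the entries probably need to be bare names without the extension. A minimal sketch of that filtering, where `list_shows` is a hypothetical helper and not part of SIDEKIT:

```python
import os

def list_shows(feat_dir, ext=".h5"):
    """Return the sorted base names of regular files in feat_dir that end with ext."""
    shows = []
    for name in os.listdir(feat_dir):
        path = os.path.join(feat_dir, name)
        # Skip sub-directories and files with another extension.
        if os.path.isfile(path) and name.endswith(ext):
            shows.append(name[:-len(ext)])  # strip the extension
    return sorted(shows)
```

With this, `ubm_list = list_shows(feat_dir)` would give names like "S01" instead of "S01.h5".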

As for your second question, feat_dir should contain feature files in HDF5 format (files with the .h5 extension). You can open one of these files with the h5py module to explore it. I did that with one of mine, and here is what I found:

>>> import h5py
>>>
>>> hf = h5py.File('/media/anwar/SIDEKIT-1.3/feat/S01.h5', 'r')
>>> hf.keys()
<KeysViewHDF5 ['S01.wav', 'compression']>

>>> # explore the second key 'compression'
>>> k2 = hf.get('compression')
>>> type(k2)
<class 'h5py._hl.dataset.Dataset'>

>>> #explore the first key 'S01.wav'
>>> k1 = hf.get('S01.wav')
>>> k1.keys()
<KeysViewHDF5 ['cep', 'cep_header', 'cep_mean', 'cep_min_range', 'cep_std',
 'energy', 'energy_header', 'energy_mean', 'energy_min_range', 'energy_std',
 'fb', 'fb_header', 'fb_mean', 'fb_min_range', 'fb_std', 'vad']>
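
To see the shape of each stored array without stepping through the keys by hand, the same exploration can be sketched as a small function. `describe` is a hypothetical helper name; it relies only on h5py's `visititems`, and the dataset names it reports depend on how the features were extracted:

```python
import h5py

def describe(path):
    """Return {dataset_path: (shape, dtype)} for every dataset in an HDF5 file."""
    found = {}

    def visit(name, obj):
        # visititems calls this for every group and dataset; keep datasets only.
        if isinstance(obj, h5py.Dataset):
            found[name] = (obj.shape, str(obj.dtype))

    with h5py.File(path, "r") as hf:
        hf.visititems(visit)
    return found
```

For a file laid out like the one above, the returned keys would look like "S01.wav/cep" and "S01.wav/vad".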

The documentation of a former version describes all the information I mentioned above, with minor changes.