Finetune Wav2vec2 for downstream speech classification


I want to fine-tune a wav2vec2 model by adding custom layers of my own on top of it for a downstream task. Is there an easy way to do this, such as instantiating the model without pretrained weights, adding my layers in a PyTorch module, and then training the whole model for speech classification on my own dataset? Basically, I want a model that is (wav2vec2 + my_layers) and I want to train it on my data.

I looked through the available GitHub repositories, but most of them do not train the whole model: they only extract embeddings from the pretrained model and fine-tune the downstream model. I want to train the upstream model as well. I also tried toolkits such as s3prl, but I ran into several errors because many files were involved, and since I am relatively new to speech classification, I am not sure how to fix them.
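
To make the (wav2vec2 + my_layers) idea concrete, here is a minimal sketch of the kind of model I have in mind. It assumes the Hugging Face `transformers` library, which is not something named above, and the class name `Wav2Vec2Classifier`, the helper `train_step`, the head sizes, and the mean pooling are all hypothetical choices for illustration. Passing a config builds the encoder with random weights; passing a checkpoint name loads pretrained ones instead.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import Wav2Vec2Config, Wav2Vec2Model


class Wav2Vec2Classifier(nn.Module):
    """wav2vec2 encoder plus a custom head, trained end to end (illustrative name)."""

    def __init__(self, num_classes, config=None, pretrained_name=None):
        super().__init__()
        if pretrained_name is not None:
            # Load pretrained upstream weights (downloads the checkpoint).
            self.encoder = Wav2Vec2Model.from_pretrained(pretrained_name)
        else:
            # No pretrained weights: random initialisation from a config.
            if config is None:
                config = Wav2Vec2Config()
            self.encoder = Wav2Vec2Model(config)
        hidden = self.encoder.config.hidden_size
        # "my_layers": an assumed two-layer classification head.
        self.head = nn.Sequential(
            nn.Linear(hidden, 256),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(256, num_classes),
        )

    def forward(self, input_values):
        # input_values: (batch, samples) raw waveform, assumed unpadded.
        frames = self.encoder(input_values).last_hidden_state  # (batch, frames, hidden)
        pooled = frames.mean(dim=1)  # simple mean pooling over time
        return self.head(pooled)  # (batch, num_classes) logits


def train_step(model, optimizer, waveforms, labels):
    """One optimisation step that updates the encoder and the head together."""
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(waveforms), labels)
    loss.backward()  # gradients flow into the upstream wav2vec2 weights too
    optimizer.step()
    return loss.item()
```

Since the optimizer would be built from `model.parameters()`, nothing is frozen and the upstream is trained along with the head. With padded batches the mean pooling should probably take an attention mask into account, which this sketch ignores. As far as I can tell, `transformers` also ships a ready-made `Wav2Vec2ForSequenceClassification`, but I would like to control the layers on top myself.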
