I'm a beginner, and I recently pretrained a language model from scratch on my own data. I want to evaluate it on the GLUE benchmark, but a few things confuse me.
I used a RoBERTa model, which is trained with an MLM (masked language modeling) objective. I noticed that the GLUE tasks are classification tasks. Does this mean I need to add a classification head on top of my pretrained model for each task, and fine-tune it on that task's training set?
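To make my question concrete, here is a minimal PyTorch sketch of what I understand a "classification head" to be — a small module on top of the encoder's first-token (`<s>`/[CLS]) hidden state. The hidden size, sequence length, and label count below are just placeholders, not values from my actual model:

```python
import torch
import torch.nn as nn

class ClassificationHead(nn.Module):
    """Maps the encoder's <s>/[CLS] hidden state to task logits."""
    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.out_proj = nn.Linear(hidden_size, num_labels)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        cls = hidden_states[:, 0]        # first-token representation
        x = torch.tanh(self.dense(cls))
        return self.out_proj(x)          # logits, shape (batch, num_labels)

# Placeholder sizes: batch=4, seq_len=128, hidden=768, 2 labels
head = ClassificationHead(hidden_size=768, num_labels=2)
logits = head(torch.randn(4, 128, 768))
print(logits.shape)  # torch.Size([4, 2])
```

Is my understanding right that during fine-tuning this head is trained jointly with (or on top of) the pretrained encoder, replacing the MLM head?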
I find it quite time-consuming to fine-tune on each GLUE task separately. Is there any workaround?
Thanks!