How can I train distilBERT more efficiently on my large text classification task?


I've been thrown into the deep end a bit with a task at work. I need to use DistilBERT for a multi-class text classification problem, but here's the kicker: the dataset is gigantic - we're talking millions of samples!

I've been messing around with it, and DistilBERT does seem to do the job well. However, training takes forever. So, here are my dilemmas:

- Model Training: How can I make DistilBERT handle this beast of a dataset more efficiently? Does anyone have experience tweaking the training strategy, batch size, learning rate, etc.?
- Hardware Constraints: Are there any hardware tricks to pull off? Is splurging on a fancy GPU the only way, or are there options I don't know about?
- Inference Speed: I also need the model to classify new data quickly after training. What are my options?
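For context on what I've tried so far on the training side: mixed-precision training plus gradient accumulation (so a large effective batch fits in limited GPU memory). A minimal PyTorch sketch of that loop is below - the `nn.Linear` model is just a stand-in for `DistilBertForSequenceClassification`, and the batch sizes and learning rate are illustrative, not tuned values:

```python
import torch
from torch import nn

# Stand-in for a DistilBERT classifier; the loop structure is the same
# for a real transformers model (swap in its forward pass and inputs).
model = nn.Linear(32, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

use_cuda = torch.cuda.is_available()
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)  # no-op on CPU
accum_steps = 4  # effective batch = accum_steps x micro-batch size

model.train()
for step in range(8):  # stand-in for iterating over a DataLoader
    x = torch.randn(16, 32)            # micro-batch of 16 feature vectors
    y = torch.randint(0, 4, (16,))     # fake class labels
    with torch.autocast("cuda" if use_cuda else "cpu", enabled=use_cuda):
        # Scale the loss so accumulated gradients average correctly.
        loss = nn.functional.cross_entropy(model(x), y) / accum_steps
    scaler.scale(loss).backward()      # gradients accumulate across micro-batches
    if (step + 1) % accum_steps == 0:
        scaler.step(optimizer)         # one optimizer update per accum_steps
        scaler.update()
        optimizer.zero_grad()
```

Mixed precision roughly doubles throughput on recent GPUs, and gradient accumulation lets you keep a large effective batch without a bigger card.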
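On the inference-speed point, the one thing I've experimented with is post-training dynamic quantization, which shrinks the linear layers to INT8 for faster CPU inference with no retraining. A sketch (again with a toy `nn.Sequential` standing in for the fine-tuned DistilBERT):

```python
import torch
from torch import nn

# Stand-in for a fine-tuned DistilBERT classifier head; the same call
# works on a real transformers model because it targets nn.Linear layers.
model = nn.Sequential(nn.Linear(768, 768), nn.ReLU(), nn.Linear(768, 4))
model.eval()

# Dynamic INT8 quantization: weights stored as int8, activations
# quantized on the fly - smaller model, faster CPU matmuls.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

with torch.no_grad():
    logits = quantized(torch.randn(1, 768))
pred = logits.argmax(dim=-1)
```

Is this the right direction, or is exporting to ONNX (or similar) the more usual route for serving?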

Any help would be a lifesaver!

