I am trying to learn how to use BERT. Here is my code:
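from sklearn.datasets import fetch_20newsgroups
from sentence_transformers import SentenceTransformer
# Load the raw text of every post in the 20 Newsgroups dataset.
data = fetch_20newsgroups(subset='all')['data']
# Encode each post with a pretrained DistilBERT sentence-embedding model.
model = SentenceTransformer('distilbert-base-nli-mean-tokens')
embeddings = model.encode(data, show_progress_bar=True)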
The problem is that the encoding step is incredibly slow: it would take 24-48 hours to complete.
I have a notebook with an M1 Pro chip running macOS. What can be done to speed up the process?
Thank you
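
One direction that may be worth trying, offered as a rough sketch and not a measured fix: by default the model may be running on the CPU, and PyTorch 1.12 or later exposes the Apple Silicon GPU as the "mps" device. The helper names `pick_device` and `encode_on_best_device` and the batch size of 64 are assumptions made for illustration; whether this helps depends on the installed PyTorch and sentence-transformers versions.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return a PyTorch device string, preferring a GPU backend over the CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"  # Apple Silicon GPU via Metal Performance Shaders
    return "cpu"


def encode_on_best_device(texts, model_name="distilbert-base-nli-mean-tokens",
                          batch_size=64):
    """Hypothetical helper: encode texts on the fastest device available."""
    # Imported lazily so pick_device can be used without these packages.
    import torch
    from sentence_transformers import SentenceTransformer

    device = pick_device(
        torch.cuda.is_available(),
        torch.backends.mps.is_available(),
    )
    model = SentenceTransformer(model_name, device=device)
    # batch_size=64 is an assumed starting point; tune it to the memory available.
    return model.encode(texts, batch_size=batch_size, show_progress_bar=True)


# Usage (downloads the model on first run):
# embeddings = encode_on_best_device(data)
```

Timing the call on a small slice first, for example `data[:500]`, gives a quick estimate of the total run time before committing to the full dataset.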