I have used BERT with HuggingFace and PyTorch, together with DataLoader and Serializer, for training and evaluation. Below is the code for that:
! pip install transformers==3.5.1
import numpy as np
import torch
from transformers import AutoModel, BertTokenizerFast

# Load the pretrained BERT encoder and its matching fast tokenizer
bert = AutoModel.from_pretrained('bert-base-uncased')
tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased')

def textToTensor(text, labels=None, paddingLength=30):
    # Tokenize, then pad or truncate every sentence to paddingLength tokens
    tokens = tokenizer.batch_encode_plus(text.tolist(), max_length=paddingLength, padding='max_length', truncation=True)
    text_seq = torch.tensor(tokens['input_ids'])
    text_mask = torch.tensor(tokens['attention_mask'])
    text_y = None
    if isinstance(labels, np.ndarray):  # labels are optional; convert them only when given
        text_y = torch.tensor(labels.tolist())
    return text_seq, text_mask, text_y
from torch.utils.data import DataLoader, TensorDataset

# test_df, model and device are assumed to be defined already (not shown)
text = test_df['text'].values
seq, mask, _ = textToTensor(text, paddingLength=35)
data = TensorDataset(seq, mask)
dataloader = DataLoader(data, batch_size=1)
for step, batch in enumerate(dataloader):
    batch = [t.to(device) for t in batch]
    sent_id, mask = batch
    with torch.no_grad():
        print(np.argmax(model(sent_id, mask).detach().cpu().numpy(), 1))
It gives me a numpy array as a result, and since batch_size=1 and no Serializer is used in this one, each result is a single-element array holding the predicted class.
I have two questions:
Are the results strictly in the same order as the index of df['text']?
How can I get the prediction for a single sentence like "Hello make my prediction. I am a single sentence"?
Can someone please help me make a single prediction?
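
To make the second question concrete, here is a minimal sketch of the kind of helper I have in mind. It is not a definitive implementation. The name predict_sentences is hypothetical, and it assumes that tokenizer behaves like the BertTokenizerFast above and that model is a torch.nn.Module taking (input_ids, attention_mask) and returning one row of class scores per sentence, as in my loop. It skips the DataLoader and tokenizes the sentence directly, wrapping a bare string in a list because batch_encode_plus expects a batch.

```python
import numpy as np
import torch


def predict_sentences(sentences, tokenizer, model, device="cpu", max_length=35):
    """Return one predicted class index per sentence, in input order.

    Hypothetical helper: `tokenizer` and `model` are assumed to match the
    ones used in the question; nothing here is part of a library API.
    """
    # Accept a bare string as well as a list of strings.
    if isinstance(sentences, str):
        sentences = [sentences]

    # Same tokenization settings as textToTensor above.
    tokens = tokenizer.batch_encode_plus(
        list(sentences),
        max_length=max_length,
        padding="max_length",
        truncation=True,
    )
    input_ids = torch.tensor(tokens["input_ids"]).to(device)
    attention_mask = torch.tensor(tokens["attention_mask"]).to(device)

    model.eval()  # switch off dropout for inference
    with torch.no_grad():
        scores = model(input_ids, attention_mask)

    # Row i of `scores` is assumed to belong to sentences[i];
    # argmax over axis 1 picks the class for each row.
    return np.argmax(scores.cpu().numpy(), axis=1)
```

I would then expect to call it as predict_sentences("Hello make my prediction. I am a single sentence", tokenizer, model, device) and read the class from element 0 of the returned array.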