Use a pre-trained transformer model to embed (word, definition) pairs


I'm looking to embed words, given their definitions, with a pre-trained transformer model. For instance, 'Orange : the color between yellow and red on the spectrum of visible light' should not get the same embedding as 'Orange : a large round juicy citrus fruit with a tough bright reddish-yellow rind'.
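
To make the goal concrete, here is a minimal sketch of the behaviour I am after, using sentence-transformers purely as an illustration (all-MiniLM-L6-v2 is just an example checkpoint, not a recommendation):

from sentence_transformers import SentenceTransformer, util

st_model = SentenceTransformer('all-MiniLM-L6-v2')  # example checkpoint only
texts = [
    "Orange : the color between yellow and red on the spectrum of visible light",
    "Orange : a large round juicy citrus fruit with a tough bright reddish-yellow rind",
]
embeddings = st_model.encode(texts, normalize_embeddings=True)
# The two senses of 'Orange' should end up far apart (cosine similarity well below 1)
print(util.cos_sim(embeddings[0], embeddings[1]))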

The context of this question is generating candidates for named-entity disambiguation.

I have tried BertModel, but I got CUDA out-of-memory errors even with a batch size of 1. RobertaModel does not run into the same problem, but I have read that its raw embeddings are not as effective unless it is fine-tuned on a downstream task.
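
A note on the OOM: without torch.no_grad(), the forward pass stores activations for backpropagation, which can exhaust GPU memory even at batch size 1, so the snippet below now wraps the forward pass accordingly. For completeness, here is a sketch of how a batch could be prepared (the checkpoint name and max_length are just example values):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('roberta-base')  # example checkpoint
texts = ["Orange : the color between yellow and red on the spectrum of visible light"]
# Truncation keeps very long definitions from inflating the sequence length
batch = tokenizer(texts, padding=True, truncation=True, max_length=256,
                  return_tensors='pt')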

Is there a particular transformer model you would recommend?

PS: The code I am using (here, model is a RobertaModel):

import torch

# Move the tokenized batch to the GPU
inputs = {key: value.to('cuda') for key, value in batch.items() if isinstance(value, torch.Tensor)}

# Run the forward pass without tracking gradients (saves memory at inference time)
with torch.no_grad():
    outputs = model(**inputs)

# Take the embedding of the first token (<s> / [CLS]) as the sequence embedding
cls_embedding = outputs.last_hidden_state[:, 0, :]
cls_embedding = cls_embedding.to('cpu')

# L2-normalize the embedding
normalized_embedding = torch.nn.functional.normalize(cls_embedding, p=2, dim=1)
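
I have also read that, for RoBERTa without fine-tuning, mean pooling over the token embeddings (ignoring padding) often works better than the first-token embedding, so I may switch to something like this (a sketch, reusing inputs and outputs from the snippet above):

# Mean pooling over non-padding tokens (an alternative to the CLS embedding)
mask = inputs['attention_mask'].unsqueeze(-1)            # (batch, seq_len, 1)
summed = (outputs.last_hidden_state * mask).sum(dim=1)   # zero out padding, then sum
mean_embedding = summed / mask.sum(dim=1).clamp(min=1)   # divide by real token counts
mean_embedding = torch.nn.functional.normalize(mean_embedding.to('cpu'), p=2, dim=1)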