How do you get a single embedding vector for each word from RoBERTa?


As you may know, RoBERTa (like BERT and similar models) has its own tokenizer, and sometimes a given word comes back as several sub-word tokens, e.g. embeddings » embed, #dings.

Due to the nature of the task I am working on, I need a single representation for each word. How do I get it?

CLARIFICATION:

sentence: "embeddings are good" --> 3 word tokens given
output: [embed,#dings,are,good] --> 4 tokens are out

When I feed a sentence to pre-trained RoBERTa, I get back encoded tokens, but in the end I need one representation per word. What is the solution? Summing the embed and #dings token embeddings point-wise?
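For illustration, here is a minimal sketch (assuming the Hugging Face transformers tokenizer for roberta-base, which is not named in the question) that reproduces the mismatch between word count and token count:

```python
# Minimal sketch: show that the RoBERTa tokenizer can return more
# sub-word tokens than there are words in the sentence.
from transformers import RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
print(tokenizer.tokenize("embeddings are good"))
# More tokens than the 3 words may come back; the exact split of
# "embeddings" depends on the BPE vocabulary.
```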


There is 1 answer below.


I'm not sure there is a standard practice, but what I have seen others do is simply take the average of the sub-token embeddings. Example: https://arxiv.org/abs/2006.01346, Section 2.3, line 4.
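For reference, a minimal sketch of that averaging, assuming the Hugging Face transformers library and roberta-base (this is not the code from the cited paper):

```python
# Average each word's sub-token embeddings into one vector per word.
import torch
from transformers import RobertaTokenizerFast, RobertaModel

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaModel.from_pretrained("roberta-base")

encoded = tokenizer("embeddings are good", return_tensors="pt")
with torch.no_grad():
    hidden = model(**encoded).last_hidden_state[0]  # (num_tokens, hidden_size)

# word_ids() maps each token position to the index of the word it came from
# (None for the <s> / </s> special tokens); it requires the fast tokenizer.
word_ids = encoded.word_ids()

buckets = {}
for pos, wid in enumerate(word_ids):
    if wid is None:
        continue
    buckets.setdefault(wid, []).append(hidden[pos])

# One vector per original word: the mean of its sub-token embeddings.
word_vectors = [torch.stack(v).mean(dim=0) for _, v in sorted(buckets.items())]
print(len(word_vectors))  # 3 words -> 3 vectors
```

If you prefer summing (as suggested in the question), replace `.mean(dim=0)` with `.sum(dim=0)`; averaging just keeps the magnitude comparable across words with different numbers of sub-tokens.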