Keep alignments in Named Entity Recognition tasks after cleaning text

229 Views Asked by RobinHood At 03 April 2021 at 12:22

I am working on a Named Entity Recognition (NER) task in which the entities are annotated in BRAT format (.txt + .ann). I clean the texts with some regular expressions before passing them to my model, but whenever I modify the text I have to shift the entity offsets in the annotations so that they still match. That direction is relatively straightforward, and afterwards I can use my NLP model to classify the different entity classes. The problem is the reverse direction: once I have the model's predictions, I need to re-align the recognized entities with the original text, i.e. convert offsets in the cleaned text back to the offsets the text had before the regular expressions were applied. Is there a way to keep track of the original offsets while cleaning the text?
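To make the requirement concrete, here is a minimal sketch of one possible approach, not a tested solution: apply each substitution with `re.finditer` and carry along, for every character of the cleaned text, the span of the original text it came from. The function names `clean_with_alignment` and `to_original`, the two cleaning rules, and the sample sentence are all hypothetical and only there for illustration. The sketch assumes character offsets with an exclusive end (as in BRAT standoff annotations) and skips empty regex matches.

```python
import re


def clean_with_alignment(text, rules):
    """Apply (pattern, replacement) rules in order.

    Returns the cleaned text and a list `spans` where spans[i] is the
    (start, end) range in the ORIGINAL text that produced cleaned char i.
    """
    # Before any cleaning, every character maps to itself.
    spans = [(i, i + 1) for i in range(len(text))]
    for pattern, repl in rules:
        pieces, new_spans, last = [], [], 0
        for m in re.finditer(pattern, text):
            if m.start() == m.end():
                continue  # assumption: empty matches are ignored
            # Copy the untouched stretch before the match, with its spans.
            pieces.append(text[last:m.start()])
            new_spans.extend(spans[last:m.start()])
            # Every replacement character points at the whole matched
            # region, expressed in original-text coordinates.
            replacement = m.expand(repl)
            source = (spans[m.start()][0], spans[m.end() - 1][1])
            pieces.append(replacement)
            new_spans.extend([source] * len(replacement))
            last = m.end()
        # Copy whatever follows the last match.
        pieces.append(text[last:])
        new_spans.extend(spans[last:])
        text, spans = "".join(pieces), new_spans
    return text, spans


def to_original(spans, start, end):
    """Map a non-empty [start, end) span in the cleaned text to the original."""
    return spans[start][0], spans[end - 1][1]


# Hypothetical cleaning rules: drop "[1]"-style markers, collapse whitespace.
rules = [(r"\[\d+\]\s*", ""), (r"\s+", " ")]
original = "John   Smith [1] lives in\n\nParis."
cleaned, spans = clean_with_alignment(original, rules)

# Pretend the model predicted "Paris" at these offsets in the cleaned text.
pred_start = cleaned.index("Paris")
pred_end = pred_start + len("Paris")
orig_start, orig_end = to_original(spans, pred_start, pred_end)
print(repr(cleaned), (orig_start, orig_end), repr(original[orig_start:orig_end]))
```

Because the spans are composed rule by rule, the mapping stays valid however many substitutions are chained. One caveat of this design: an entity that starts or ends inside replaced text is mapped to the whole region that was replaced, so the recovered span can be slightly wider than the prediction.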
