How to convert Flair (Hugging Face) NER output to a DataFrame


I am new to Hugging Face and am working with the Flair NER module, which gives me the output below:

from flair.data import Sentence
from flair.models import SequenceTagger

# load tagger
tagger = SequenceTagger.load("flair/ner-german-large")

# make example sentence
sentence = Sentence("George Washington ging nach Washington")

# predict NER tags
tagger.predict(sentence)

# print sentence
print(sentence)

# print predicted NER spans
print('The following NER tags are found:')
# iterate over entities and print
for entity in sentence.get_spans('ner'):
    print(entity)

Output

Span [1,2]: "George Washington"   [− Labels: PER (1.0)]
Span [5]: "Washington"   [− Labels: LOC (1.0)]

How can I convert this output into a DataFrame with columns such as 'Token' (the entity text) and 'Token_Type' (e.g. 'ORG' or 'PER')?

The sentence object is of type flair.data.Sentence.

There is 1 solution below.

Each entity yielded by sentence.get_spans('ner') in your code is a flair.data.Span, which exposes several properties you can use, including text, tag, and score (see the source of the Span class at https://github.com/flairNLP/flair/blob/master/flair/data.py). Collect those properties into a list of dicts and pass the list to pandas:

import pandas as pd

entities = []

for entity in sentence.get_spans('ner'):
    entities.append({
        'text': entity.text,
        'type': entity.tag,
        'score': entity.score
    })

print(pd.DataFrame(entities))

>>>
                text type     score
0  George Washington  PER  0.999997
1         Washington  LOC  0.999996
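
To get the exact column names asked for in the question, the same idea can be wrapped in a small helper. This is only a sketch: spans_to_dataframe and FakeSpan are hypothetical names introduced here, and FakeSpan is a stand-in for flair.data.Span so that the example runs without downloading a model. It assumes each span exposes text, tag, and score, as in the answer above; the sample values are copied from the output shown there.

```python
from collections import namedtuple

import pandas as pd

# Hypothetical stand-in for flair.data.Span so this sketch runs offline.
# With Flair installed, pass sentence.get_spans('ner') instead.
FakeSpan = namedtuple("FakeSpan", ["text", "tag", "score"])


def spans_to_dataframe(spans):
    """Build a DataFrame with 'Token', 'Token_Type' and 'Score' columns."""
    rows = [
        {"Token": span.text, "Token_Type": span.tag, "Score": span.score}
        for span in spans
    ]
    # Naming the columns explicitly keeps the layout stable
    # even when no entities were found (empty list of rows).
    return pd.DataFrame(rows, columns=["Token", "Token_Type", "Score"])


# Sample values taken from the output shown above.
spans = [
    FakeSpan("George Washington", "PER", 0.999997),
    FakeSpan("Washington", "LOC", 0.999996),
]
df = spans_to_dataframe(spans)
print(df)
```

With a real tagger, the call would be spans_to_dataframe(sentence.get_spans('ner')), provided the installed Flair version still exposes tag and score on Span.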