I first trained my NER model with spaCy, reaching a micro F1 score of 64.7% (8 classes). As a next step I wanted to train Flair, hoping for better results. Of course the spaCy-format data had to be converted to a proper Flair corpus with some custom code (a sketch of the conversion is included after the corpus sample below).
Info about the input data: "Corpus: 4037 train + 840 dev + 448 test sentences"
In the training set: 'Kultur' (1512), 'Erreger' (1376), 'Mittel' (1083), 'Auftreten' (583), 'Zeit' (285), 'Witterung' (238), 'BBCH_Stadium' (214), 'Ort' (161)
In the test set: 'Erreger' (390), 'Mittel' (311), 'Kultur' (221), 'BBCH_Stadium' (148), 'Auftreten' (54), 'Witterung' (54), 'Ort' (53), 'Zeit' (40)
The corpus looks like this:
Der O
Schwerpunkt O
der O
Unkrautbekämpfung O
in O
Kartoffeln S-Kultur
liegt O
im O
Vorauflauf S-BBCH_Stadium
. O

Sind O
die O
mechanischen O
Maßnahmen O
abgeschlossen O
, O
kann O
die O
erste O
Herbizidbehandlung O
auf O
gut O
abgesetzten O
Dämmen O
, O
je O
nach O
Produkt O
bis O
kurz B-BBCH_Stadium
vor I-BBCH_Stadium
dem I-BBCH_Stadium
Durchstoßen E-BBCH_Stadium
der O
Kartoffeln S-Kultur
( O
kvD O
) O
, O
erfolgen O
. O
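The spaCy-to-Flair conversion was done with custom code along these lines (a minimal sketch, assuming spaCy v3, one sentence per Doc, and an illustrative path 'train.spacy'; not my exact code):

import spacy
from spacy.tokens import DocBin
from spacy.training import offsets_to_biluo_tags

nlp = spacy.blank("de")
docs = DocBin().from_disk("train.spacy").get_docs(nlp.vocab)  # path is illustrative

def biluo_to_bioes(tag):
    # Flair's column format writes E-/S- where spaCy's BILUO scheme writes L-/U-
    if tag.startswith("L-"):
        return "E-" + tag[2:]
    if tag.startswith("U-"):
        return "S-" + tag[2:]
    return tag  # B-, I- and O stay the same

with open("bb_train.txt", "w", encoding="utf-8") as out:
    for doc in docs:
        spans = [(e.start_char, e.end_char, e.label_) for e in doc.ents]
        # misaligned entity spans come back as "-" and need fixing upstream
        tags = [biluo_to_bioes(t) for t in offsets_to_biluo_tags(doc, spans)]
        for token, tag in zip(doc, tags):
            out.write(f"{token.text} {tag}\n")
        out.write("\n")  # blank line between sentences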
The training code:
from flair.data import Corpus
from flair.datasets import ColumnCorpus
from flair.embeddings import TokenEmbeddings, WordEmbeddings, StackedEmbeddings, FlairEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer
from typing import List
import gensim
import time
start_time = time.time()
columns = {0: 'text', 1: 'ner'}
data_path = "path/to/data"
# initializing the corpus
corpus: Corpus = ColumnCorpus(data_path, columns,
                              train_file='bb_train.txt',
                              # dev_file='bb_test_sm_sm.txt',
                              test_file='bb_test_sm_sm.txt',
                              )
# tag to predict
tag_type = 'ner'
tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)
# convert the word2vec binary to gensim format so Flair's WordEmbeddings can load it
word_vectors = gensim.models.KeyedVectors.load_word2vec_format('german.model', binary=True)
word_vectors.save('german.model.gensim')
german_embedding = WordEmbeddings('german.model.gensim')
# init forward and backward Flair embeddings for German
flair_embedding_forward = FlairEmbeddings('de-forward')
flair_embedding_backward = FlairEmbeddings('de-backward')
embedding_types: List[TokenEmbeddings] = [
    german_embedding,
    flair_embedding_forward,
    flair_embedding_backward,
]
embeddings: StackedEmbeddings = StackedEmbeddings(embeddings=embedding_types)
tagger: SequenceTagger = SequenceTagger(hidden_size=256,
                                        embeddings=embeddings,
                                        tag_dictionary=tag_dictionary,
                                        tag_type=tag_type,
                                        use_crf=True)
trainer: ModelTrainer = ModelTrainer(tagger, corpus)
trainer.train('resources/taggers/ner_bb',
              learning_rate=0.01,
              mini_batch_size=64,
              max_epochs=5,
              )
print(f"It took {time.time() - start_time}")
The loss log is:
EPOCH TIMESTAMP BAD_EPOCHS LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1
1 10:42:23 0 0.0100 42.28642028570175 28.403223037719727 0.0197 0.0748 0.0312
2 10:43:48 0 0.0100 17.928552985191345 14.348283767700195 0.3089 0.0312 0.0567
3 10:45:10 0 0.0100 10.604630261659622 13.98863697052002 0.3089 0.0312 0.0567
4 10:46:36 1 0.0100 10.26459190249443 13.614569664001465 0.3579 0.0279 0.0518
5 10:47:55 2 0.0100 9.987788125872612 13.339178085327148 0.3333 0.0164 0.0313
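Is there a quick way to verify the corpus? I assume a sanity check along these lines (using the objects from the script above) should confirm that the BIOES tags were parsed:

print(corpus)  # should report 4037 train + 840 dev + 448 test sentences
print(tag_dictionary.get_items())  # should list O plus B-/I-/E-/S- variants of all 8 classes
print(corpus.train[0].to_tagged_string('ner'))  # one training sentence with its gold tags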
Why is the score so low? The model doesn't seem to learn anything; I also tried 10 epochs and got the same results.
Do I need to tune some parameters? Is something wrong with my corpus?
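For example, would the settings from the Flair NER tutorial (learning_rate=0.1, mini_batch_size=32, many more epochs with learning-rate annealing) behave differently? Something like:

trainer.train('resources/taggers/ner_bb',
              learning_rate=0.1,   # tutorial value; I used 0.01
              mini_batch_size=32,  # tutorial value; I used 64
              max_epochs=150,      # LR annealing needs far more than 5 epochs
              )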
Thanks in advance if you have experience with this.