I get the following error message when I try to add examples to my CRF training: `the numbers of items and labels differ: |x| = 14, |y| = 7`.
Training the CRF on my data the first time works fine. My X is a list of lists of dictionaries (one feature dictionary per word), and my y is a list of label lists.
from itertools import chain
import sklearn_crfsuite

X = [sent2features(s) for s in result]
y = [sent2labels(s) for s in result]
X_initial = X[0:10]
y_initial = y[0:10]
X_pool = X[10:1000]
y_pool = y[10:1000]
X_test = X[1000:2000]
y_test = y[1000:2000]
liste_aplatie = list(chain.from_iterable(y))
classe_unique = set(liste_aplatie)
# Initialize the CRF model
crf = sklearn_crfsuite.CRF(
    algorithm='lbfgs',
    c1=0.1,
    c2=0.1,
    max_iterations=20,
    all_possible_transitions=True
)
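To make the data layout concrete, here is a toy sketch of what X and y look like, plus a small check that every sentence has as many labels as tokens. The feature names, labels, and the helper `mismatched_sentences` are made up for illustration; they are not taken from my real `sent2features` / `sent2labels`.

```python
# Toy data with hypothetical feature names and labels (illustration only).
X_toy = [
    [  # sentence 0: three tokens, one feature dict per token
        {"word.lower": "paris", "is_title": True},
        {"word.lower": "is", "is_title": False},
        {"word.lower": "nice", "is_title": False},
    ],
    [  # sentence 1: one token
        {"word.lower": "hello", "is_title": True},
    ],
]
y_toy = [
    ["B-LOC", "O", "O"],  # one label per token of sentence 0
    ["O"],                # one label per token of sentence 1
]


def mismatched_sentences(X, y):
    """Return the indices of sentences whose token and label counts differ."""
    return [
        i for i, (xseq, yseq) in enumerate(zip(X, y))
        if len(xseq) != len(yseq)
    ]


# An empty list means every sentence lines up with its labels.
print(mismatched_sentences(X_toy, y_toy))
```

Running the same kind of check on the full X and y is one way to rule out a length mismatch in the data itself.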
Next, I want to do some active learning, so I use the modAL Python library, whose `teach` function is meant to train on new examples (selected here by `query_idx`). That is where I get the error message. Do you have a solution?
```python
from modAL.models import ActiveLearner

learner = ActiveLearner(
    estimator=wrapper(),
    X_training=X_initial, y_training=y_initial)
query_idx = learner.estimator.uncertainty_sampling(X_pool, n_instances=10)
for elt in query_idx:
    learner.teach(
        X=X_pool[elt], y=y_pool[elt], only_new=True
    )
```
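For reference, here is a minimal sketch of the shapes involved in the loop above, using toy data (the feature names, labels, and the helper `pairs_seen_by_fit` are hypothetical). It does not call modAL or sklearn_crfsuite; it only mimics a fit loop that walks X and y as lists of sentences, which is how I understand `sklearn_crfsuite.CRF.fit` to work. Whether `wrapper` and `teach` behave the same way is an assumption, since their internals are not shown here.

```python
# Toy pool with hypothetical features and labels (illustration only).
X_pool_toy = [
    [{"word.lower": "paris"}, {"word.lower": "is"}, {"word.lower": "nice"}],
    [{"word.lower": "hello"}, {"word.lower": "world"}],
]
y_pool_toy = [
    ["B-LOC", "O", "O"],
    ["O", "O"],
]


def pairs_seen_by_fit(X, y):
    """Mimic a fit loop that treats X and y as lists of sentences."""
    return list(zip(X, y))


elt = 0

# Indexing with a single integer yields ONE sentence, so the loop walks it
# token by token and pairs a feature dict with a label string.
single = pairs_seen_by_fit(X_pool_toy[elt], y_pool_toy[elt])
print(type(single[0][0]).__name__, type(single[0][1]).__name__)

# Wrapping the sentence in a list keeps the "list of sentences" layout,
# so the loop sees one (token list, label list) pair of equal length.
wrapped = pairs_seen_by_fit([X_pool_toy[elt]], [y_pool_toy[elt]])
print(len(wrapped), len(wrapped[0][0]), len(wrapped[0][1]))
```

In the first case each "sentence" the loop sees is a single feature dictionary and its "labels" are the characters of one label string, so their lengths have no reason to agree. That might be related to the `|x|` versus `|y|` mismatch in the error, but I have not confirmed it.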