I am currently trying to implement an NER model using the sklearn_crfsuite library.
The training code is simply:
import sklearn_crfsuite

for repeat in range(10):
    # A fresh model is created on every repeat
    crf = sklearn_crfsuite.CRF(
        algorithm='lbfgs',
        c1=0.1,
        c2=0.1,
        max_iterations=100,
        all_possible_transitions=True,
        verbose=True
    )
    crf.fit(X_train, y_train)
    pred_list = crf.predict(X_test)
The code trains the model ten times. My goal is to observe 10 different scores and average them into a final score. However, every repeat gives the same score, even though I reinitialize the model in each loop.
The question is: how do I properly set a random seed so that each repeat gives different results?
NOTE: Shuffling the training data in each loop still gives the same results. Only after I changed the training algorithm from 'lbfgs' (gradient descent using the L-BFGS method) to 'l2sgd' (stochastic gradient descent with L2 regularization) did I start to obtain different results.
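A possible explanation for the observation in the note, sketched on a toy problem: a full-batch method sums the gradient over all training examples, so reordering the data leaves the result unchanged, while SGD updates one example at a time and therefore depends on the order. The one-parameter model, the data, and the function names below are made up for illustration and are not CRFsuite's code. The sketch also assumes the optimizer always starts from the same initial weights.

```python
# Toy model y = w * x with squared loss (hypothetical data, not an NER task).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def full_batch_gd(data, lr=0.01, steps=50):
    """Gradient descent where every step uses the gradient over ALL examples."""
    w = 0.0  # fixed starting point (assumption)
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def one_epoch_sgd(data, lr=0.01):
    """One pass of SGD: the weight is updated after each single example."""
    w = 0.0  # same fixed starting point
    for x, y in data:
        w -= lr * 2 * (w * x - y) * x
    return w


data = list(zip(xs, ys))
reordered = list(reversed(data))  # same examples, different order

batch_a, batch_b = full_batch_gd(data), full_batch_gd(reordered)
sgd_a, sgd_b = one_epoch_sgd(data), one_epoch_sgd(reordered)

print("full batch:", batch_a, batch_b)  # equal up to floating-point rounding
print("SGD:       ", sgd_a, sgd_b)      # differ, because update order matters
```

If this is what happens here, shuffling alone cannot produce different scores with 'lbfgs'; the training or test data itself has to change between repeats.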
You are probably not looking for a random seed; you are probably looking for cross-validation:
You can find the full documentation here.
If you want to run 10 different iterations, you can use:
and you will get the best parameters.
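The idea of getting 10 different scores and averaging them can be sketched as a plain 10-fold cross-validation loop. This is a minimal, library-free illustration and not necessarily the code the answer originally showed: `k_fold_indices`, `cross_val_scores`, and the toy `majority_label_scorer` are hypothetical names introduced here, and the demo data stands in for real `X_train` / `y_train` sequences.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]     # k roughly equal folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_val_scores(X, y, fit_and_score, k=10, seed=0):
    """Call fit_and_score once per fold and collect the k scores."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k, seed):
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        X_te = [X[i] for i in test_idx]
        y_te = [y[i] for i in test_idx]
        scores.append(fit_and_score(X_tr, y_tr, X_te, y_te))
    return scores


def majority_label_scorer(X_tr, y_tr, X_te, y_te):
    """Toy stand-in for a real model: predict the most common training label."""
    majority = max(set(y_tr), key=y_tr.count)
    return sum(1 for label in y_te if label == majority) / len(y_te)


# Hypothetical demo data: 50 items, every fifth one labelled "PER".
X_demo = list(range(50))
y_demo = ["PER" if i % 5 == 0 else "O" for i in range(50)]

scores = cross_val_scores(X_demo, y_demo, majority_label_scorer, k=10)
print(scores)
print("mean score:", mean(scores))
```

For the CRF case, `fit_and_score` would presumably create a new `sklearn_crfsuite.CRF`, fit it on the training fold, predict on the held-out fold, and return whichever metric you report. Because every fold trains on different data, the scores differ even with the deterministic 'lbfgs' algorithm.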