Reusing parameters when training a new stanford-pos model

I am training a new model with:

java -mx1g edu.stanford.nlp.tagger.maxent.MaxentTagger -props myPropertiesFile.prop

Suppose the model file specified in myPropertiesFile.prop already exists. Is the new model trained from scratch, or does training start from the existing parameters? Can I control which of these happens?

Some context:

I would like to first train the tagger on a very large corpus of less accurately tagged data and then continue training on a much smaller corpus of accurately tagged data (a so-called warm start).

There is 1 best solution below

It will build a new model from scratch. To the best of my knowledge, there is no functionality for training the model on one data set and then continuing training on a different one. You could possibly modify the code to accept initial features and weights and start training from there, but it is not set up to do this easily.
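To make the idea of "start training from existing weights" concrete, here is a minimal sketch of a warm start on a toy logistic-regression model. This is not Stanford tagger code and uses none of its classes: the class name WarmStartSketch, the train and accuracy helpers, and the tiny data sets are all made up for illustration. The only point is that the training routine takes an initial weight vector, so a second call can pick up where the first one stopped.

```java
import java.util.Arrays;

// Toy warm-start illustration. Hypothetical code, NOT part of the Stanford tagger.
class WarmStartSketch {

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    static double dot(double[] w, double[] x) {
        double s = 0.0;
        for (int j = 0; j < w.length; j++) s += w[j] * x[j];
        return s;
    }

    /** Runs SGD epochs of binary logistic regression, starting from a copy of init. */
    static double[] train(double[] init, double[][] xs, int[] ys, int epochs, double lr) {
        double[] w = Arrays.copyOf(init, init.length); // never mutate the caller's weights
        for (int e = 0; e < epochs; e++) {
            for (int i = 0; i < xs.length; i++) {
                double err = ys[i] - sigmoid(dot(w, xs[i]));
                for (int j = 0; j < w.length; j++) w[j] += lr * err * xs[i][j];
            }
        }
        return w;
    }

    /** Fraction of examples whose predicted label (score > 0) matches the gold label. */
    static double accuracy(double[] w, double[][] xs, int[] ys) {
        int correct = 0;
        for (int i = 0; i < xs.length; i++) {
            int pred = dot(w, xs[i]) > 0 ? 1 : 0;
            if (pred == ys[i]) correct++;
        }
        return (double) correct / xs.length;
    }

    // Each row is {feature, constant bias input}.
    // "Large, noisy" corpus: two labels are deliberately wrong.
    static final double[][] NOISY_X =
        {{-2, 1}, {-1.5, 1}, {-1, 1}, {-0.5, 1}, {0.5, 1}, {1, 1}, {1.5, 1}, {2, 1}};
    static final int[] NOISY_Y = {0, 0, 0, 1, 1, 1, 0, 1};
    // "Small, accurate" corpus.
    static final double[][] CLEAN_X = {{-2, 1}, {-1, 1}, {1, 1}, {2, 1}};
    static final int[] CLEAN_Y = {0, 0, 1, 1};

    public static void main(String[] args) {
        double[] cold = new double[2]; // all-zero weights: training from scratch
        double[] pretrained = train(cold, NOISY_X, NOISY_Y, 50, 0.5);
        // Warm start: continue from the pretrained weights on the clean data.
        double[] warm = train(pretrained, CLEAN_X, CLEAN_Y, 50, 0.5);
        System.out.println("pretrained:   " + Arrays.toString(pretrained));
        System.out.println("warm-started: " + Arrays.toString(warm));
        System.out.println("accuracy on clean data: " + accuracy(warm, CLEAN_X, CLEAN_Y));
    }
}
```

Doing the same in the tagger would mean loading the saved feature set and weights and handing them to the optimizer as its starting point. As far as I know that hook does not exist out of the box, which is why the code would have to be modified.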