Is it possible to update an existing text classification model in TensorFlow?

I have been performing text classification with TensorFlow. I would like to know whether the model can be updated with new data as I acquire it, so that I do not have to retrain it from scratch each time. Also, since I mostly deal with customer data, the number of classes may grow over time. Is it possible to update the existing model with data that contains additional classes by using the existing checkpoints?

You are asking two different questions, so I will answer them separately:

1) Yes, you can continue training with the new data you have acquired. This is straightforward: restore the model exactly as you do now when you use it, but instead of running an inference tensor such as outputs or prediction, run the optimizer operation. This translates into the following code:

import tensorflow as tf  # TensorFlow 1.x API

# build_model, load_new_data, batchify, new_data_path and epochs are your own definitions
model = build_model()  # builds the model graph; it must match the checkpoint
saver = tf.train.Saver()
with tf.Session() as session:
    saver.restore(session, "/path/to/model.ckpt")
    # keep training on the new data (the session must still be open here)
    data_x, data_y = load_new_data(new_data_path)
    for epoch in range(1, epochs + 1):
        total_loss = 0.0
        num_examples = 0
        for b_x, b_y in batchify(data_x, data_y):
            _, loss = session.run([model.opt, model.loss],
                                  feed_dict={model.input: b_x, model.input_y: b_y})
            # assumes model.loss is the mean loss over the batch
            total_loss += loss * len(b_x)
            num_examples += len(b_x)
        print("epoch %d - loss: %.2f" % (epoch, total_loss / num_examples))

Note that you need to know the names of the operations defined by the model in order to run the optimizer (model.opt) and the loss op (model.loss), so that you can train and monitor the loss during training.
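
The batchify helper used in the loop above is not defined in this answer. A minimal sketch of what it could look like is below, assuming data_x and data_y are equal-length sequences (lists or arrays) and that the last batch is allowed to be smaller. The batch_size parameter and its default are my own assumptions, not something from the original code.

```python
def batchify(data_x, data_y, batch_size=32):
    """Yield (inputs, labels) mini-batches; the last batch may be smaller."""
    if len(data_x) != len(data_y):
        raise ValueError("data_x and data_y must have the same length")
    for start in range(0, len(data_x), batch_size):
        end = start + batch_size
        # Slicing past the end is safe: it just returns the remaining items.
        yield data_x[start:end], data_y[start:end]


# Tiny usage example with made-up data: 5 examples, batches of 2.
batches = list(batchify(["a", "b", "c", "d", "e"], [0, 1, 0, 1, 1], batch_size=2))
```

Because the last batch can be smaller than the others, it is safer to average the loss over the number of examples than over the number of batches.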

2) Changing the number of labels is a bit more complicated. If your network is a single feed-forward layer, there is not much you can reuse: the dimensionality of the weight matrix depends on the number of classes, so you need to retrain everything from scratch. On the other hand, if you have a multi-layer network (e.g. an LSTM followed by a dense layer that does the classification), you can restore the weights of the old model for the lower layers and train only the last layer from scratch. To do that, I recommend reading this answer: https://stackoverflow.com/a/41642426/4186749
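
To make the "restore everything except the last layer" idea concrete, here is a rough sketch in plain Python of the selection step only: deciding which checkpoint variables can be reused in the new graph. It assumes the checkpoint contents are given as (name, shape) pairs, which is the format returned by tf.train.list_variables, and that the classification layer's variables share a name prefix. The function name, the "output" prefix and the variable names in the example are all hypothetical.

```python
def restorable_names(checkpoint_vars, graph_vars, skip_prefixes=("output",)):
    """Return the names of variables that can be restored from the old checkpoint.

    checkpoint_vars: iterable of (name, shape) pairs from the old checkpoint.
    graph_vars: dict mapping variable name -> shape in the NEW graph.
    skip_prefixes: name prefixes of the classification layer (an assumption).
    """
    names = []
    for name, shape in checkpoint_vars:
        if name.startswith(tuple(skip_prefixes)):
            continue  # classification layer: train it from scratch
        # Reuse a variable only if the new graph has it with the same shape.
        if name in graph_vars and list(graph_vars[name]) == list(shape):
            names.append(name)
    return names


# Made-up example: the old model had 5 classes, the new one has 8.
old_checkpoint = [
    ("lstm/kernel", [428, 512]),
    ("lstm/bias", [512]),
    ("output/kernel", [128, 5]),
    ("output/bias", [5]),
]
new_graph = {
    "lstm/kernel": [428, 512],
    "lstm/bias": [512],
    "output/kernel": [128, 8],
    "output/bias": [8],
}
reusable = restorable_names(old_checkpoint, new_graph)
```

In TensorFlow 1.x, the variables whose names end up in this list can be passed as var_list to a tf.train.Saver that is used only for restoring, while the remaining variables are initialized as usual before training. The dict for the new graph can be built from v.op.name and v.shape.as_list() over tf.global_variables().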