How to detect if textsum training is overfitting?


I am using TensorFlow 0.9 and training with the TextSum model. I have about 1.3 million articles that I scraped and have been training against them for about a week now. The average training loss was about 1.75 to 2.1. I decided to stop and run eval, since my understanding is that the eval average loss should be close to what I get during training. When I ran the eval, I saw an average loss of 2.6 to 2.9. I was wondering what I should expect to see from this run.
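
For context, here is a minimal sketch of how I am comparing the two numbers, assuming a TF 1.x-style `tf.train.summary_iterator` and that the run writes a scalar summary tagged `running_avg_loss` (textsum's usual tag; verify against your own summaries, and note the log directories below are hypothetical):

```python
# Sketch: read the latest running-average loss from the train and eval
# event logs and report the gap between them.
import glob
import tensorflow as tf

def last_scalar(log_dir, tag='running_avg_loss'):
    """Return the most recent value of `tag` across the event files in log_dir."""
    value, step = None, -1
    for events_file in glob.glob(log_dir + '/events.out.tfevents.*'):
        for event in tf.train.summary_iterator(events_file):
            for v in event.summary.value:
                if v.tag == tag and event.step > step:
                    value, step = v.simple_value, event.step
    return value

train_loss = last_scalar('log_root/train')  # hypothetical log directories
eval_loss = last_scalar('log_root/eval')
if train_loss is not None and eval_loss is not None:
    print('train: %.3f  eval: %.3f  gap: %.3f'
          % (train_loss, eval_loss, eval_loss - train_loss))
```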

Am I interpreting this training/eval comparison correctly? I am somewhat new to deep learning and am using this project as a way to learn; from other reading, this seems like a fairly large spread between the two numbers.

Is there a standard tolerance for the difference in average loss when evaluating against a separate dataset? At this point I'm not sure whether I should keep training or stop here and figure out how to get the model running in TensorFlow Serving. I don't want to overfit the model, but from an academic standpoint, suppose I did overfit during training. What would I need to do to "fix" it? Do I simply gather more articles and feed that data in as new training data, or is the model essentially broken and unusable?

1 Answer

Answered by Alexandre Passos

There is no guarantee that the eval loss will match the training loss. Overfitting is hard to measure, but if the eval loss increases as training proceeds, that is a clear sign of it.
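
A minimal sketch of that check, assuming you record the eval loss after each evaluation run (the history values and the `patience` threshold below are illustrative, not from the textsum code):

```python
# Sketch: flag overfitting when the eval loss has risen for `patience`
# consecutive evaluations while training continues.
def is_overfitting(eval_losses, patience=3):
    """True if the last `patience` steps of eval loss are strictly increasing."""
    if len(eval_losses) <= patience:
        return False
    recent = eval_losses[-(patience + 1):]
    return all(b > a for a, b in zip(recent, recent[1:]))

# Example: eval loss recorded after each eval run
history = [2.9, 2.8, 2.7, 2.75, 2.8, 2.85]
print(is_overfitting(history))  # True -- three consecutive increases
```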