I'm running textsum decoding on a small test set (5 examples), but both the reference and decode files are already thousands of lines long. Is there a reason decoding runs seemingly indefinitely? Is it processing the same set of examples repeatedly? Are later outputs supposed to be better than earlier ones?
Would love some intuition on this; I haven't been able to find a clear explanation.
Yes, your intuition is correct: the same inputs are decoded repeatedly. You can, however, limit it to a single pass. I did this a while ago by modifying seq2seq_attention_decode.py at the point where the output is written to file. Since I was feeding only one input, I made it stop after producing a single output.
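As a rough illustration of the change (this is not the actual textsum code; the loop structure, names, and delay constant here are assumptions), the idea is to cap the number of decode passes instead of letting the loop sleep and re-decode forever:

```python
import time

DECODE_LOOP_DELAY_SECS = 60  # assumed delay between passes in the stock loop


class DummyDecoder:
    """Stand-in for the textsum beam-search decoder (hypothetical)."""

    def decode_once(self):
        # In textsum this step decodes every example in the test set and
        # appends reference/decode lines to the output files.
        print("decoded all test examples once")


def decode_loop(decoder, max_passes=1):
    """The stock behavior is an effectively endless loop; capping
    max_passes at 1 writes each example's output exactly once."""
    passes = 0
    while passes < max_passes:
        decoder.decode_once()
        passes += 1
        if passes < max_passes:
            time.sleep(DECODE_LOOP_DELAY_SECS)


decode_loop(DummyDecoder(), max_passes=1)
```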
Is there a reason decoding runs seemingly indefinitely?: My intuition is that it is expected to produce a different summary on different passes. Since decoding can run on a machine other than the one doing the training, each pass can pick up newly written model checkpoints and therefore produce different output. It was probably intended as a way to monitor how the output changes as training continues.
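To make that monitoring idea concrete, here is a hedged sketch (the names `get_latest_checkpoint`, `log_root`, and the glob-based lookup are illustrative assumptions, not textsum's actual mechanism, which uses TensorFlow's checkpoint helpers) of a loop that reloads the newest checkpoint before each decode pass, so later outputs reflect a more-trained model:

```python
import glob
import os
import time


def get_latest_checkpoint(log_root):
    """Return the most recently written checkpoint path, or None.

    Illustrative only: a real implementation would use TensorFlow's
    checkpoint-state utilities rather than a filesystem glob.
    """
    candidates = glob.glob(os.path.join(log_root, "model.ckpt-*"))
    return max(candidates, key=os.path.getmtime) if candidates else None


def monitoring_decode_loop(log_root, decode_fn, delay_secs=60):
    """Restore the newest checkpoint before each pass, so the same small
    test set yields different (hopefully improving) summaries while
    training is still writing checkpoints."""
    last_seen = None
    while True:
        ckpt = get_latest_checkpoint(log_root)
        if ckpt is not None and ckpt != last_seen:
            decode_fn(ckpt)  # decode the whole test set with this checkpoint
            last_seen = ckpt
        time.sleep(delay_secs)
```

Under this reading, the thousands of lines in the reference and decode files are just the same 5 examples written once per pass, and later outputs should (on average) improve only while training is still producing new checkpoints.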