How to log the graph from fairseq to tensorboard


I got the impression from reading tutorials that this would be easy: slap in a call to add_graph().

The first tricky part was working out where to put the call. I've gone with fairseq_cli/train.py, at the top of the enumerate() loop in train():

for i, samples in enumerate(progress):

    if i == 0:
        # Output graph for tensorboard
        writer = progress._writer("")  # the "" is the tag argument
        writer.add_graph(trainer._model, samples)
        writer.flush()

I'm passing --tensorboard-logdir mydir/ to fairseq-train. That wraps the progress bar (SimpleProgressBar, or whatever logging format you use) in a TensorboardProgressBarWrapper, so I'm trying to reuse its writer instance. (Maybe that is my mistake?)
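As an alternative to reaching into the wrapper's private _writer(), one thing I could try is opening my own SummaryWriter pointed at a subdirectory of the same log dir. This is just a sketch (the "mydir/graph" path is illustrative, mirroring the --tensorboard-logdir above), but it decouples the graph dump from fairseq's internal logging plumbing:

```python
# Sketch: open an independent SummaryWriter instead of reusing
# fairseq's internal progress-bar writer. Path is illustrative.
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="mydir/graph")
writer.add_scalar("sanity/check", 1.0, 0)  # quick check the writer works
writer.close()
```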

I can get the model out of the trainer object, and the other thing add_graph() needs is some data, which is why I've put the call inside the loop above: so I can use samples. I've also tried samples[0], with exactly the same error message.

Which is:

Tracer cannot infer type of ({'id': tensor([743216, 642485,  92182, 793806, 494734, 275334,  53282, 449572,   1758,
  ...
  20734, 469070, 678489, 473213]), 'nsentences': 832, 'ntokens': 29775, 'net_input': {'src_tokens': tensor([[  225,    30,   874,  ...,  1330,    84,     2],
  [  442,   734,  7473,  ...,    38,     5,     2],
  ...,
  [ 6238, 11411,   428,  ..., 10387,     5,     2]]), 'src_lengths': tensor([21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21,
  ...
  21, 21, 21, 21]), 'prev_output_tokens': tensor([[   2,   22,  133,  ..., 4445,   46,    1],
  ...,
  [   2,   22,   59,  ...,  112,    6,    1]])}, 'target': tensor([[ 22, 133, 203,  ...,  46,   2,   1],
  [132, 429,  40,  ...,  46,   2,   1],
  ...,
  [ 22,  59, 177,  ...,   6,   2,   1]])},)
:Dictionary inputs to traced functions must have consistent type. Found Tensor and int
Error occurs, No graph saved
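As far as I can tell, add_graph() traces the model with torch.jit.trace, and the tracer can't handle a dict whose values mix Tensors with plain Python ints, which is exactly what a fairseq batch is ('nsentences' and 'ntokens' are ints). A minimal stand-in (the Toy class below is mine, not fairseq code) reproduces the idea: the mixed dict is rejected, while a tuple of tensor-only positional arguments traces fine:

```python
import torch
import torch.nn as nn

class Toy(nn.Module):
    # Same three-tensor signature as the fairseq model's forward()
    def forward(self, src_tokens, src_lengths, prev_output_tokens):
        return src_tokens + prev_output_tokens

model = Toy()

# A fairseq batch is a dict mixing Tensors with plain ints
# ('nsentences', 'ntokens'); the tracer refuses such dicts outright:
try:
    torch.jit.trace(model, ({"src_tokens": torch.zeros(2, 4), "nsentences": 2},))
except Exception as err:
    print("tracer rejected the mixed dict:", err)

# Tensor-only positional arguments trace cleanly:
args = (torch.zeros(2, 4), torch.tensor([4, 4]), torch.zeros(2, 4))
traced = torch.jit.trace(model, args)
```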

I'm hoping someone understands that error message better than I do. Tips on better places to put the add_graph() call, or better ways to get a visual representation of my model, are also welcome. (PyTorchViz hasn't been updated in over two years, and apparently won't work with the latest PyTorch.)

Tensorboard logging of the training is working, by the way.

ADDITIONAL

I poked around some more and found that it is going to call forward() on my model, which (ignoring optional args) looks like:

def forward(self, src_tokens, src_lengths, prev_output_tokens):

so I tried manually creating some dummy data:

dummy_data = {'src_tokens': torch.zeros((1, 256)), 'src_lengths': torch.zeros((1, 256)), 'prev_output_tokens': torch.zeros((1, 256))}
writer.add_graph(trainer._model, dummy_data, verbose=True)

That fails with:

TypeError: forward() missing 2 required positional arguments: 'src_lengths' and 'prev_output_tokens'

And if I use kwargs-style:

dummy_data = {'src_tokens': torch.zeros((1, 256)), 'src_lengths': torch.zeros((1, 256)), 'prev_output_tokens': torch.zeros((1, 256))}
writer.add_graph(trainer._model, **dummy_data, verbose=True)

it fails with:

TypeError: add_graph() got an unexpected keyword argument 'src_tokens'

I'm starting to think this may be a limitation of add_graph(), and that it only works with a simple def forward(self, x)?
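Actually, add_graph()'s second parameter (input_to_model) accepts a tuple of tensors, which it unpacks as positional arguments to forward(), so a multi-argument forward() should be traceable. The kwargs attempt fails because add_graph() itself doesn't take **kwargs, and the dict attempt fails because the dict was passed as a single positional argument. One more catch: token IDs need dtype=torch.long, or the embedding lookup will choke on floats. A self-contained sketch with a toy stand-in model (ToySeq2Seq is mine, not a fairseq class, but it has the same forward signature):

```python
import torch
import torch.nn as nn
from torch.utils.tensorboard import SummaryWriter

class ToySeq2Seq(nn.Module):
    """Stand-in with the same forward() signature as the fairseq model."""
    def __init__(self, vocab=100, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.out = nn.Linear(dim, vocab)

    def forward(self, src_tokens, src_lengths, prev_output_tokens):
        enc = self.embed(src_tokens).mean(dim=1, keepdim=True)
        dec = self.embed(prev_output_tokens) + enc
        return self.out(dec)

model = ToySeq2Seq()
dummy = (
    torch.zeros(1, 256, dtype=torch.long),    # src_tokens: IDs must be long
    torch.full((1,), 256, dtype=torch.long),  # src_lengths
    torch.zeros(1, 256, dtype=torch.long),    # prev_output_tokens
)
writer = SummaryWriter(log_dir="graph_demo")
writer.add_graph(model, dummy)  # tuple -> positional args to forward()
writer.close()
```

If this pattern holds for the real model, replacing ToySeq2Seq with trainer._model (and matching the real vocab size) might be all that's needed.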
