We have a custom torch.autograd.Function z(x, t) that computes an output y in a way not amenable to direct automatic differentiation, and we have derived the Jacobians of the operation with respect to its inputs x and t, so we can implement the backward method.
However, the operation involves making several internal calls to a neural network, which we have implemented for now as a stack of torch.nn.Linear objects wrapped in net, a torch.nn.Module. Mathematically, these layers are parameterized by t.
Is there any way that we can have net itself be an input to the forward method of z? Then, from our backward, we would return the list of products of the upstream gradient Dy with the parameter Jacobians dydt_i, one for each of the parameters t_i that are children of net (in addition to Dy*dydx, although x is data and does not need gradient accumulation).
Or do we really need to take t instead (actually a list of individual t_i) and reconstruct the action of all the Linear layers in net internally in z.forward?
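For concreteness, here is a minimal sketch of that second option for a single Linear-like step, using a toy y = W @ x + b as a stand-in for the real black-box operation and its hand-derived Jacobians (the class `Z`, the toy operation, and the shapes are purely illustrative):

```python
import torch

# Toy stand-in for the real operation: y = W @ x + b, with the Jacobian
# products written out by hand in backward. Because the parameters are
# passed to forward explicitly, autograd accumulates their gradients for us.
class Z(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, W, b):
        y = W @ x + b                     # imagine this comes from a black-box routine
        ctx.save_for_backward(x, W)
        return y

    @staticmethod
    def backward(ctx, Dy):
        x, W = ctx.saved_tensors
        grad_x = W.t() @ Dy               # Dy * dy/dx (could be None, since x is data)
        grad_W = torch.outer(Dy, x)       # Dy * dy/dW
        grad_b = Dy                       # Dy * dy/db
        return grad_x, grad_W, grad_b

# Passing net's parameters explicitly lets autograd fill in their .grad:
net = torch.nn.Linear(4, 3)
x = torch.randn(4)
y = Z.apply(x, net.weight, net.bias)
y.sum().backward()                        # populates net.weight.grad and net.bias.grad
```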
I guess you could create a custom functor that inherits from `torch.autograd.Function` and make the `forward` and `backward` methods non-static (i.e. remove the `@staticmethod` decorators), so that `net` can be an attribute of your functor; that would look roughly like the sketch below. This will yield a warning that you are using a deprecated legacy way of creating autograd functions (because of the non-static methods), and you need to be extra careful about zeroing the gradients in `net` after having backpropagated. So not really convenient, but I am not aware of any better way to have a stateful autograd function.
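A minimal sketch of that legacy, stateful pattern, again using a toy y = W @ x + b step with a single `nn.Linear` standing in for the real black-box operation (the class name `ZStateful` and the toy math are just placeholders); since the non-static API is deprecated, newer PyTorch releases may refuse to run it rather than only warn:

```python
import torch

# Legacy-style sketch: forward/backward are instance methods, so the functor
# can hold net as an attribute. This is the deprecated API the warning refers to.
class ZStateful(torch.autograd.Function):
    def __init__(self, net):
        super().__init__()
        self.net = net                          # the stateful part

    def forward(self, x):                       # no @staticmethod, no ctx
        W, b = self.net.weight, self.net.bias
        y = W.detach() @ x + b.detach()         # stands in for the black-box routine
        self.x = x                              # stash what backward needs
        return y

    def backward(self, Dy):
        W, b = self.net.weight, self.net.bias
        # Push Dy * dy/dW and Dy * dy/db into net's .grad fields by hand,
        # since the parameters are not formal inputs of this function.
        grad_W = torch.outer(Dy, self.x)
        grad_b = Dy
        W.grad = grad_W if W.grad is None else W.grad + grad_W
        b.grad = grad_b if b.grad is None else b.grad + grad_b
        return W.detach().t() @ Dy              # gradient w.r.t. x

# usage (legacy call style; may warn or error depending on PyTorch version):
# net = torch.nn.Linear(4, 3)
# z = ZStateful(net)
# y = z(torch.randn(4))
# y.sum().backward()
```

Because backward writes into `net`'s `.grad` fields directly, you must zero them (e.g. via the optimizer's `zero_grad()`) between iterations, which is exactly the caveat mentioned above.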