When I pass the maximum activation value from the previous layer to normalize the input of the ReLU in the next layer, I get the runtime error below. When I pass a fixed value instead, it works without any error.
  File "/usr/local/lib/python3.7/dist-packages/torch/autograd/__init__.py", line 175, in backward
    allow_unreachable=True, accumulate_grad=True)  # Calls into the C++ engine to run the backward pass
RuntimeError: Trying to backward through the graph a second time (or directly access saved
tensors after they have already been freed). Saved intermediate values of the graph are freed
when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to
backward through the graph a second time or if you need to access saved tensors after calling
backward.
As the code below shows, I pass the argument prev_layer_max from the previous layer and get the error:
class th_norm_ReLU(nn.Module):
    def __init__(self, modify):
        super(th_norm_ReLU, self).__init__()
        self.therelu = F.relu

    def forward(self, input, prev_layer_max):
        output = input * (prev_layer_max / input.max())
        norm_output = self.therelu(output)
        return norm_output
But if I use a fixed value instead of the passed prev_layer_max argument (in the code below I set it to 1), it works normally without any error:
    def forward(self, input, prev_layer_max=1):
        output = input * (1 / input.max())
        norm_output = self.therelu(output)
The training loop is as follows:
for epoch in range(params.epochs):
    running_loss = 0
    start_time = time.time()
    for i, (images, labels) in enumerate(train_loader):
        model.train()
        model.zero_grad()
        optimizer.zero_grad()
        labels.to(device)
        images = images.float().to(device)
        outputs = model(images, epoch)
        loss = criterion(outputs.cpu(), labels)
        running_loss += loss.item()
        loss.backward()
        optimizer.step()
Here is the forward method of the model, where I record the maximum activation of each layer in a list (thresh_list):
    def forward(self, input, epoch):
        x = self.conv1(input)
        x = self.relu(x, 1)
        self.thresh_list[0] = max(self.thresh_list[0], x.max())  # to get the max activation
        x = self.conv_dropout(x)
        x = self.conv2(x)
        x = self.relu(x, self.thresh_list[0])
        self.thresh_list[1] = max(self.thresh_list[1], x.max())
        x = self.pool1(x)
        x = self.conv_dropout(x)
        x = self.conv3(x)
        x = self.relu(x, self.thresh_list[1])
        self.thresh_list[2] = max(self.thresh_list[2], x.max())
The ReLU module I call is:
    self.relu = th_norm_ReLU(True)
and the th_norm_ReLU module is shown above.
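
One plausible reading of the error, sketched under assumptions: x.max() returns a tensor that is still attached to the autograd graph of the current iteration, so a value kept in thresh_list across iterations can pull the previous (already freed) graph into the next backward pass. The sketch below is not the model from this post. TinyNet, norm_relu, run_two_steps and the detach_max flag are hypothetical names, the layers are small nn.Linear stand-ins, and torch.maximum replaces Python's max() so the behaviour does not depend on which value is larger. It compares storing the maximum as-is with storing x.max().detach().

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyNet(nn.Module):
    """Hypothetical two-layer stand-in for the model in the question."""

    def __init__(self, detach_max):
        super().__init__()
        self.fc1 = nn.Linear(4, 4)
        self.fc2 = nn.Linear(4, 2)
        self.detach_max = detach_max
        # Running maximum of the first layer's activations.
        self.thresh = torch.tensor(0.0)

    @staticmethod
    def norm_relu(x, prev_layer_max):
        # Same scaling as th_norm_ReLU.forward in the question.
        return F.relu(x * (prev_layer_max / x.max()))

    def forward(self, x):
        x = self.norm_relu(self.fc1(x), 1)
        layer_max = x.max()
        if self.detach_max:
            # Cut the stored value out of this iteration's graph.
            layer_max = layer_max.detach()
        # Assumption: torch.maximum stands in for Python's max() here.
        self.thresh = torch.maximum(self.thresh, layer_max)
        return self.norm_relu(self.fc2(x), self.thresh)


def run_two_steps(detach_max):
    """Return True if two backward passes succeed, False on RuntimeError."""
    torch.manual_seed(0)
    model = TinyNet(detach_max)
    data = torch.randn(8, 4)
    try:
        for _ in range(2):
            model.zero_grad()
            model(data).sum().backward()
    except RuntimeError:
        return False
    return True


if __name__ == "__main__":
    print("without detach:", run_two_steps(False))
    print("with detach:   ", run_two_steps(True))
```

If this reading applies to the model above, storing x.max().detach() (or x.max().item()) in thresh_list would keep the recorded maxima as plain constants. That also means no gradient flows through prev_layer_max, which matches the fixed-value case that already works.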