I've got a bunch of equations using sums, multiplications, and min(x,0) or max(x,0) that yield a single result (one output, 18 inputs).
I'm trying to have an NN model in PyTorch learn these so I can generate results quickly.
I generated 30k random X-Y pairs in Excel (just using RND()*100-50 for X and calculating Y). I loaded the pairs with pandas and wrote an NN with ReLU (which I hoped would handle the non-linearity). Here's the net:
class MyModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.flatten = nn.Flatten()  # Flatten input data
        self.hidden_layer = nn.Sequential(
            nn.Linear(input_size, hidden_size),
            nn.Linear(input_size, hidden_size),
            nn.ReLU(),
            nn.BatchNorm1d(hidden_size),
            nn.Linear(input_size, hidden_size),
            nn.Linear(input_size, hidden_size),
            nn.ReLU()
        )
        self.output_layer = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = self.flatten(x)
        x = self.hidden_layer(x)
        x = self.hidden_layer(x)
        x = self.hidden_layer(x)
        output = self.output_layer(x)
        return output
Sizes are 18 for the input and hidden layers and 1 for the output.
It can't converge and is left with quite a big error. I thought that would be a simple task for an NN to learn, given that set of equations; there's no noise or anything. What can I do to make this work?
Your nn.Sequential setup doesn't make sense. nn.Sequential runs the model modules in the order listed. Yours:

- Has linear layers back to back, which is redundant, since the composition of two linear layers is still a linear layer.
- Has sizes that don't line up. The first layer maps an input of size input_size to hidden_size, but your second layer expects an input of size input_size. This works for you currently because you are using the same size for input and hidden, but it will throw an error if that is ever not the case.

You want something like this (roughly; exact widths are up to you, the sizes below just follow your 18/18/1 setup):
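self.hidden_layer = nn.Sequential(
    # block 1: linear -> relu -> batchnorm
    nn.Linear(input_size, hidden_size),
    nn.ReLU(),
    nn.BatchNorm1d(hidden_size),
    # block 2: note the input size is now hidden_size, not input_size
    nn.Linear(hidden_size, hidden_size),
    nn.ReLU(),
    nn.BatchNorm1d(hidden_size),
)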
That example has two blocks of linear/relu/batchnorm. You can add more if you want.
Your forward method is also weird.

First, make sure nn.Flatten is doing what you expect. Check the input/output shapes to be sure.

Second, you apply the same block of layers three times. If you want more layers, add them to the nn.Sequential block instead of passing the activations through the same layers three times; see the full sketch below.
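Putting it together, a minimal sketch, assuming your 18/18/1 sizes and that the input already arrives as a flat (batch, 18) tensor (in which case nn.Flatten is a no-op you can drop):

import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        # Stack the depth here, instead of reusing one block in forward().
        self.hidden_layer = nn.Sequential(
            nn.Linear(input_size, hidden_size),
            nn.ReLU(),
            nn.BatchNorm1d(hidden_size),
            nn.Linear(hidden_size, hidden_size),
            nn.ReLU(),
            nn.BatchNorm1d(hidden_size),
        )
        self.output_layer = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = self.hidden_layer(x)  # a single pass through the stack
        return self.output_layer(x)

model = MyModel(18, 18, 1)
out = model(torch.randn(32, 18))  # sanity check: out.shape == (32, 1)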