Questions about programming a CNN with PyTorch


I'm pretty new to programming CNNs, so I'm a little lost. I'm working on a part of the code where I'm asked to implement a fully-connected network to classify the digits. It should contain 1 hidden layer with 20 units, and I should use the ReLU activation function on the hidden layer.

class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.fc1 = ... 
        
        self.fc2 = nn.Sequential(
            nn.Linear(500,10),
            nn.Softmax(dim = 1)
            )
        
    def forward(self, x):
        x = x.view(x.size(0),-1)
        x = self.fc1(x)
        x = self.fc2(x)
        return x

The dots are the part to fill in. I'm thinking of this line:

self.fc1 = nn.Linear(20, 500)

But I don't know if it's correct. Could someone help me, please? Also, I don't understand at all what the Softmax function does, so if someone could explain it, I'd appreciate it. Thank you so much!

P.S. This is the code to load the data:

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

batch_size = 64
trainset = datasets.MNIST('./data', train=True, download=True, transform=transforms.ToTensor())
train_loader = DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=1)
testset = datasets.MNIST('./data', train=False, download=True, transform=transforms.ToTensor())
test_loader = DataLoader(testset, batch_size=batch_size, shuffle=False, num_workers=1)

There are 2 best solutions below

4
On

The model code you posted gives fc2 500 input features, which implies a hidden layer with 500 units, so I am assuming you meant 20 units for the input. Under that assumption, the code would be:

self.fc1 = nn.Sequential(
    nn.Linear(20, 500),
    nn.ReLU()
    )
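As a quick check of that layer, here is a minimal sketch that pushes a dummy batch through it. The input tensor is made-up random data with 20 features per sample, matching the assumption above; it is not taken from the question. Note that the flattened MNIST images from the question's loader would have 1 * 28 * 28 = 784 features, so in_features would need to be 784 for that data.

```python
import torch
import torch.nn as nn

# Assumption from this answer: each input sample has 20 features.
fc1 = nn.Sequential(
    nn.Linear(20, 500),  # in_features=20, out_features=500
    nn.ReLU(),
)

# Dummy batch of 64 samples with 20 features each (made-up data).
dummy = torch.randn(64, 20)
hidden = fc1(dummy)

print(hidden.shape)                # torch.Size([64, 500])
print(bool((hidden >= 0).all()))   # True, since ReLU clamps negatives to zero
```

The last dimension of the input has to equal in_features, otherwise nn.Linear raises a shape mismatch error.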

Coming to the next part of your question: since you are working with the MNIST dataset and have a softmax function, I am assuming you are trying to predict the digit shown in each image. Your neural network performs multiplications and additions in each layer, and you end up with 10 numbers in the output layer. You then have to interpret these 10 numbers to decide which of the 10 digits is in the image.

One way to do this would be to select the unit which has the maximum value. For example if the 10th unit has the maximum value among all units, then we conclude that the digit is '9'. If the 2nd unit has the maximum value, then we conclude that the digit is '1'.
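A minimal sketch of picking the unit with the maximum value, assuming the 10 output values arrive as one row of a tensor. The numbers are made up for illustration.

```python
import torch

# Made-up raw outputs for one image, one value per digit 0-9.
outputs = torch.tensor([[0.1, 2.3, -1.0, 0.4, 0.0, 1.2, -0.5, 0.3, 0.9, 3.1]])

# argmax over dim=1 returns the index of the largest value in each row.
predicted_digit = outputs.argmax(dim=1)
print(predicted_digit)  # tensor([9]): the 10th unit corresponds to digit 9
```

Indices start at 0, which is why the 10th unit maps to the digit 9 and the 2nd unit to the digit 1.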

This works, but a better way is to convert the value of each unit into the probability that the corresponding digit is in the image, and then choose the digit with the highest probability. This has mathematical advantages that help in defining a better loss function.

Softmax is what helps us to convert the values to probabilities. On applying softmax, all the values lie in the range (0, 1) and they sum up to 1.
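The properties described above can be checked with a short sketch. The input values are made up, and torch.softmax is used here as the functional counterpart of the nn.Softmax(dim=1) layer in the question.

```python
import torch

# Made-up raw outputs for one image, one value per digit 0-9.
logits = torch.tensor([[0.1, 2.3, -1.0, 0.4, 0.0, 1.2, -0.5, 0.3, 0.9, 3.1]])

# Softmax exponentiates each value and divides by the row sum of exponentials.
probs = torch.softmax(logits, dim=1)

# Every probability lies strictly between 0 and 1.
in_range = bool(((probs > 0) & (probs < 1)).all())

# The probabilities in each row add up to 1 (up to floating-point error).
row_sum = probs.sum(dim=1)

# Softmax preserves ordering, so the most likely digit is unchanged.
same_prediction = bool(torch.equal(probs.argmax(dim=1), logits.argmax(dim=1)))

print(in_range, row_sum, same_prediction)
```

One thing to keep in mind: if the network is trained with nn.CrossEntropyLoss, that loss expects unnormalized scores and applies log-softmax internally, so in that case the Softmax layer would typically be left out of the model.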

If you are interested in deep learning and the math behind it, I would suggest checking out Andrew Ng's course on deep learning.

0
On

You did not mention the shape of your data, so I'll assume the shape of a batch returned by datasets.MNIST with your loader.

Data shape: torch.Size([64, 1, 28, 28])

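import torch.nn as nn

class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        # Hidden layer: 784 flattened pixels -> 20 units, followed by ReLU.
        self.fc1 = nn.Sequential(
            nn.Linear(1*28*28, 20),
            nn.ReLU())

        # Output layer: 20 hidden units -> 10 digit classes.
        self.fc2 = nn.Sequential(
            nn.Linear(20, 10),
            nn.Softmax(dim = 1))

    def forward(self, x):
        # Flatten (batch, 1, 28, 28) to (batch, 784).
        x = x.view(x.size(0), -1)
        x = self.fc1(x)
        x = self.fc2(x)
        return x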

The first argument of nn.Linear is the number of input features (in_features), while the second is the number of units in the layer (out_features).

For self.fc1, the number of input features is the product of the dimensions of your data excluding the batch dimension, which is 1 * 28 * 28 = 784. And as per your post, the second argument should be 20 (20 units).
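To illustrate where that number comes from, here is a small sketch using a dummy tensor with the assumed batch shape (64, 1, 28, 28). The tensor is filled with zeros and stands in for a real MNIST batch.

```python
import math
import torch

# Stand-in for one batch from the loader: (batch, channels, height, width).
batch = torch.zeros(64, 1, 28, 28)

# Same flattening as in forward(): keep the batch dimension, merge the rest.
flat = batch.view(batch.size(0), -1)

# Product of all dimensions except the batch dimension.
in_features = math.prod(batch.shape[1:])

print(flat.shape)   # torch.Size([64, 784])
print(in_features)  # 784
```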

The shape of the output from self.fc1 (which is also the input to self.fc2) will then be (batch size, 20).

For self.fc2, the number of input features will be 20, while the number of units (which is also the number of digit classes) will be 10.
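The shapes described above can be traced step by step. This is a sketch with the layer sizes from the explanation (784 to 20 to 10) and a random dummy batch in place of real MNIST data; the variable names are chosen for illustration only.

```python
import torch
import torch.nn as nn

# Layers sized as described: 784 inputs -> 20 hidden units -> 10 outputs.
fc1 = nn.Sequential(nn.Linear(1 * 28 * 28, 20), nn.ReLU())
fc2 = nn.Sequential(nn.Linear(20, 10), nn.Softmax(dim=1))

# Random stand-in for one batch of 64 grayscale 28x28 images.
x = torch.rand(64, 1, 28, 28)

x = x.view(x.size(0), -1)
print(x.shape)    # torch.Size([64, 784])

h = fc1(x)
print(h.shape)    # torch.Size([64, 20])

out = fc2(h)
print(out.shape)  # torch.Size([64, 10])
```

If the out_features of one layer does not equal the in_features of the next, the forward pass fails with a shape mismatch error, so a trace like this is a cheap way to catch the problem before training.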