Linear autoencoder using PyTorch


How do we build a simple linear autoencoder and train it using torch.optim optimisers?

How do I do this using autograd (.backward()), optimising the MSE loss, and then learn the values of the weights and biases in the encoder and the decoder (i.e. 3 parameters in the encoder and 4 in the decoder)? The data should be randomized, and each run of learning should start from random weights and biases, such as:

wEncoder = torch.randn(D,1, requires_grad=True)
wDecoder = torch.randn(1,D, requires_grad=True)
bEncoder = torch.randn(1, requires_grad=True)
bDecoder = torch.randn(1,D, requires_grad=True)

The target optimizer is SGD with learning rate 0.01, no momentum, and 1000 steps (from a random start). How do we then plot the loss versus epochs (steps)?

I tried this but the losses are the same for every epoch.

D = 2
x = torch.rand(100,D)
x[:,0] = x[:,0] + x[:,1]
x[:,1] = 0.5*x[:,0] + x[:,1]

loss_fn = nn.MSELoss()
optimizer = optim.SGD([x[:,0],x[:,1]], lr=0.01)
losses = []
for epoch in range(1000):
    running_loss = 0.0
    inputs = x_reconstructed
    targets = x
    loss=loss_fn(inputs,targets)
    loss.backward(retain_graph=True)
    optimizer.step()
    optimizer.zero_grad()
    running_loss += loss.item() 
    epoch_loss = running_loss / len(data)
    losses.append(running_loss)

There are 2 answers below.

Accepted Answer

This example should get you going. Please see code comments for further explanation:

import torch


# Use torch.nn.Module to create models
class AutoEncoder(torch.nn.Module):
    def __init__(self, features: int, hidden: int):
        # Necessary to initialize nn.Module internals (parameter/submodule registration)
        super().__init__()
        self.encoder = torch.nn.Linear(features, hidden)
        self.decoder = torch.nn.Linear(hidden, features)

    def forward(self, X):
        return self.decoder(self.encoder(X))

    def encode(self, X):
        return self.encoder(X)

# Random data
data = torch.rand(100, 4)
model = AutoEncoder(4, 10)
# Pass model.parameters() so the optimizer updates the weights
# and biases of both the encoder and the decoder
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.MSELoss()

# Per-epoch losses are gathered
# MSELoss with the default 'mean' reduction averages the squared error over all elements of the batch
losses = []
for epoch in range(1000):
    reconstructed = model(data)
    loss = loss_fn(reconstructed, data)
    # No need to retain_graph=True as you are not performing multiple passes
    # of backpropagation
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

    losses.append(loss.item())
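
Since the question also asks how to plot the loss versus epochs (steps), the gathered losses list can be plotted directly once the loop finishes. A minimal sketch, assuming matplotlib is installed:

import matplotlib.pyplot as plt

plt.plot(losses)
plt.xlabel("Epoch (step)")
plt.ylabel("MSE loss")
plt.title("Training loss of the linear autoencoder")
plt.show()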

Please note that a linear autoencoder is roughly equivalent to a PCA decomposition, which is more efficient to compute.

You should probably use a non-linear autoencoder unless this is simply for learning purposes.
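
If you want to stay closer to the question and learn the explicit tensors wEncoder, wDecoder, bEncoder and bDecoder directly (3 parameters in the encoder, 4 in the decoder), the same training loop can be written without nn.Linear. This is only a minimal sketch, assuming D = 2 and the synthetic data from the question; the key fix is that the optimizer must receive the weight and bias tensors, not the data:

import torch

D = 2
x = torch.rand(100, D)
x[:, 0] = x[:, 0] + x[:, 1]
x[:, 1] = 0.5 * x[:, 0] + x[:, 1]

# Random start for every run: 3 encoder parameters (2 weights + 1 bias)
# and 4 decoder parameters (2 weights + 2 biases)
wEncoder = torch.randn(D, 1, requires_grad=True)
wDecoder = torch.randn(1, D, requires_grad=True)
bEncoder = torch.randn(1, requires_grad=True)
bDecoder = torch.randn(1, D, requires_grad=True)

loss_fn = torch.nn.MSELoss()
# Optimize the parameters, not the data
optimizer = torch.optim.SGD([wEncoder, wDecoder, bEncoder, bDecoder], lr=0.01)

losses = []
for epoch in range(1000):
    z = x @ wEncoder + bEncoder                 # encode: (100, 1)
    x_reconstructed = z @ wDecoder + bDecoder   # decode: (100, D)
    loss = loss_fn(x_reconstructed, x)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())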

Another Answer

We could simply use nn.Sequential() too, e.g., with the following code snippet:

import torch
encoded_dim = 32
encoder = torch.nn.Sequential(
                      torch.nn.Flatten(),
                      torch.nn.Linear(28*28, 256),
                      torch.nn.Sigmoid(),
                      torch.nn.Linear(256, 64),
                      torch.nn.Sigmoid(),
                      torch.nn.Linear(64, encoded_dim)
)
decoder = torch.nn.Sequential(
                      torch.nn.Linear(encoded_dim, 64),
                      torch.nn.Sigmoid(),
                      torch.nn.Linear(64, 256),
                      torch.nn.Sigmoid(),
                      torch.nn.Linear(256, 28*28),
                      torch.nn.Sigmoid(),
                      torch.nn.Unflatten(1, (28,28))
)
autoencoder = torch.nn.Sequential(encoder, decoder)
autoencoder 
# Sequential(
# (0): Sequential(
#  (0): Flatten(start_dim=1, end_dim=-1)
#  (1): Linear(in_features=784, out_features=256, bias=True)
#  (2): Sigmoid()
#  (3): Linear(in_features=256, out_features=64, bias=True)
#  (4): Sigmoid()
#  (5): Linear(in_features=64, out_features=32, bias=True)
#  )
# (1): Sequential(
#  (0): Linear(in_features=32, out_features=64, bias=True)
#  (1): Sigmoid()
#  (2): Linear(in_features=64, out_features=256, bias=True)
#  (3): Sigmoid()
#  (4): Linear(in_features=256, out_features=784, bias=True)
#  (5): Sigmoid()
#  (6): Unflatten(dim=1, unflattened_size=(28, 28))
# )
#)
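
As a quick sanity check (not part of the original answer), a random batch shaped like squeezed MNIST images can be pushed through the model to confirm that the input shape is preserved and the pieces are wired correctly:

batch = torch.rand(16, 28, 28)       # dummy batch of 16 "images"
print(encoder(batch).shape)          # torch.Size([16, 32])
print(autoencoder(batch).shape)      # torch.Size([16, 28, 28])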

Example training with MNIST data

Load data (MNIST) with torchvision:

import torchvision

train_loader = torch.utils.data.DataLoader(
                    torchvision.datasets.MNIST('./data', train=True, download=True,
                             transform=torchvision.transforms.Compose([
                               torchvision.transforms.ToTensor(),
                               # ...
                             ])),
                    batch_size=64, shuffle=True)
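
As a quick check (not in the original answer), one batch can be pulled from the loader to confirm the expected shapes before training:

x, y = next(iter(train_loader))
print(x.shape)   # torch.Size([64, 1, 28, 28])
print(y.shape)   # torch.Size([64])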

Now, let's train the autoencoder model. The optimizer used here is Adam, although SGD could be used as well:

loss_fn = torch.nn.BCELoss()
optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3, weight_decay=1e-5)
for epoch in range(10):
    for idx, (x, _) in enumerate(train_loader):
      x = x.squeeze()
      x = x / x.max()
      x_pred = autoencoder(x) # forward pass
      loss = loss_fn(x_pred, x)
      if idx % 1024 == 0:
        print(epoch, loss.item())
      optimizer.zero_grad()
      loss.backward()         # backward pass
      optimizer.step()

# epoch  loss
# 0 0.702496349811554
# 1 0.24611620604991913
# 2 0.20603498816490173
# 3 0.1827092468738556
# 4 0.1805819869041443
# 5 0.16927748918533325
# 6 0.17275433242321014
# 7 0.15827134251594543
# 8 0.1635081171989441
# 9 0.15693898499011993

The following animation shows the reconstruction of a few randomly selected images by the autoencoder at different epochs; notice how the reconstruction of the MNIST digits improves as the number of epochs increases:

[Animation: autoencoder reconstructions of randomly selected MNIST digits at different epochs]
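
A rough sketch of how such reconstruction snapshots could be produced after (or during) training, assuming matplotlib and the train_loader defined above:

import matplotlib.pyplot as plt

x, _ = next(iter(train_loader))
x = x.squeeze()
with torch.no_grad():
    x_pred = autoencoder(x)

fig, axes = plt.subplots(2, 8, figsize=(12, 3))
for i in range(8):
    axes[0, i].imshow(x[i], cmap="gray")        # original digit
    axes[0, i].axis("off")
    axes[1, i].imshow(x_pred[i], cmap="gray")   # reconstruction
    axes[1, i].axis("off")
plt.show()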