Unsupervised Learning Autoencoder multiple output CNN for Feature Extraction on MNIST data

Question

230 Views Asked by Marie C. At 17 August 2025 at 12:13

I want to extract features from a dataset (here MNIST) and feed them into a clustering algorithm such as k-means. The feature generation should be unsupervised, so I chose an autoencoder. I want the autoencoder to be trained only on the loss of the second output layer, 'output_layer2', and after training I want to read the features from the first output layer, 'output_layer1'. To get two output layers I used the functional API in Keras. But I don't know how to get the features or how to train on only one loss. Furthermore, I get error messages like the ones below my code:

import numpy as np
import sys

import sklearn
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Input, Reshape, Conv2D, MaxPool2D, Conv2DTranspose
from tensorflow.keras.optimizers import SGD

assert sys.version_info >= (3, 5)
# Scikit-Learn ≥0.20 is required
assert sklearn.__version__ >= "0.20"
# TensorFlow ≥2.0-preview is required
assert tf.__version__ >= "2.0"


def get_MNISTdata(data_sz=10000):
    (X_train_full, y_train_full), (
        X_test,
        y_test,
    ) = keras.datasets.mnist.load_data()
    X_train_full = X_train_full.astype(np.float32) / 255
    X_test = X_test.astype(np.float32) / 255
    data_tr = X_train_full.shape[0] - data_sz
    X_train, X_valid = X_train_full[:-data_tr], X_train_full[-data_tr:]
    y_train, y_valid = y_train_full[:-data_tr], y_train_full[-data_tr:]
    return X_train, X_valid, y_train, y_valid


def rounded_accuracy(y_true, y_pred):
    return keras.metrics.binary_accuracy(tf.round(y_true), tf.round(y_pred))


def convoluntional_autoencoder(
    X_train,
    X_valid,
    shape1=28,
    shape2=28,
    channels=1,
    initial_features=16,
    kernel_sz=3,
    padding1="SAME",
    padding2="VALID",
    activation1="selu",
    activation2="sigmoid",
    pool_sz=2,
    strides=2,
    loss1="binary_crossentropy",
    loss2="softmax",
    lr=1.0,
    epochs=3,
):

    tf.random.set_seed(42)
    np.random.seed(42)

    # encoder
    input_layer = Input(shape=(shape1, shape2, channels))
    Layer1 = Conv2D(
        initial_features,
        kernel_size=kernel_sz,
        padding=padding1,
        activation=activation1,
    )(input_layer)
    Layer2 = MaxPool2D(pool_size=pool_sz)(Layer1)
    Layer3 = Conv2D(
        initial_features * 2,
        kernel_size=kernel_sz,
        padding=padding1,
        activation=activation1,
    )(Layer2)
    Layer4 = MaxPool2D(pool_size=pool_sz)(Layer3)
    Layer5 = Conv2D(
        initial_features * 4,
        kernel_size=kernel_sz,
        padding=padding1,
        activation=activation1,
    )(Layer4)
    output_layer1 = MaxPool2D(pool_size=pool_sz)(Layer5)

    n_poolingLayer = 3

    # decoder
    Layer7 = Conv2DTranspose(
        initial_features * 2,
        kernel_size=kernel_sz,
        strides=strides,
        padding=padding2,
        activation=activation1,
        input_shape=[
            int(shape1 / (2 ** n_poolingLayer)),
            int(shape2 / (2 ** n_poolingLayer)),
            initial_features * 2 ** (n_poolingLayer - 1),
        ],
    )(output_layer1)
    Layer8 = Conv2DTranspose(
        initial_features,
        kernel_size=kernel_sz,
        strides=strides,
        padding=padding1,
        activation=activation1,
    )(Layer7)
    output_layer2 = Conv2DTranspose(
        channels,
        kernel_size=kernel_sz,
        strides=strides,
        padding=padding1,
        activation=activation2,
    )(Layer8)

    conv_ae = Model(inputs=input_layer, outputs=[output_layer1, output_layer2])
    conv_ae.compile(
        loss={"output_layer1": loss1, "output_layer2": loss1},
        loss_weights=[0, 1],
        optimizer=SGD(lr=lr),
        metrics=[rounded_accuracy],
    )

    conv_ae.fit(
        X_train,
        X_train,
        epochs=epochs,
        validation_data=(X_valid, X_valid),
    )

    conv_ae.summary()

    # features = ?

    return # features

If I do this

X_train, X_valid, y_train, y_valid = get_MNISTdata(data_sz=10000)

and this

features = convoluntional_autoencoder(
    X_train,
    X_valid,
    shape1=28,
    shape2=28,
    channels=1,
    initial_features=16,
    kernel_sz=3,
    padding1="SAME",
    padding2="VALID",
    activation1="selu",
    activation2="sigmoid",
    pool_sz=2,
    strides=2,
    loss1="binary_crossentropy",
    loss2="softmax",
    lr=1.0,
    epochs=1,
)

I get this error at the line conv_ae.fit(...):

ValueError: Found unexpected losses or metrics that do not correspond to any Model 
output: dict_keys(['output_layer1', 'output_layer2']). Valid mode output names: ['max_pooling2d_5', 'conv2d_transpose_5']. 
Received struct is: {'output_layer1': 'binary_crossentropy', 'output_layer2': 'binary_crossentropy'}.
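
The names in this message are the auto-generated layer names, not the Python variable names: Keras only sees a layer's `name` argument. A minimal sketch, assuming the intent is for the loss dictionary keys to be 'output_layer1' and 'output_layer2' (the tiny two-layer model here is a stand-in, not the full autoencoder):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Stand-in model: one pooling layer as the "feature" output and one
# transposed convolution as the "reconstruction" output.
inputs = keras.Input(shape=(28, 28, 1))
code = layers.MaxPool2D(pool_size=2, name="output_layer1")(inputs)
recon = layers.Conv2DTranspose(
    1,
    kernel_size=3,
    strides=2,
    padding="same",
    activation="sigmoid",
    name="output_layer2",
)(code)

model = keras.Model(inputs=inputs, outputs=[code, recon])

# The output names now follow the explicit layer names, so a loss dict
# keyed by these strings can be matched to the model outputs.
print(model.output_names)
```

With explicit names, the keys of a `loss={...}` dictionary passed to `compile` refer to outputs that exist in the model.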

If I change the conv_ae.compile call to this:

conv_ae.compile(
        loss=loss1,
        loss_weights=[0, 1],
        optimizer=SGD(lr=lr),
        metrics=[rounded_accuracy],
    )

I get this error at the line conv_ae.fit(...):

ValueError: Dimensions must be equal, but are 28 and 3 for '{{node binary_crossentropy/mul}} 
= Mul[T=DT_FLOAT](IteratorGetNext:1, binary_crossentropy/Log)' with input shapes: [?,28,28], [?,3,3,64].
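
The shapes in this message suggest that the single loss is applied to both outputs, so the target batch of shape [?,28,28] is compared with the encoder output of shape [?,3,3,64]. One possible way around this, sketched here under assumptions and not taken from the original post, is to train a model that has only the reconstruction output and to build a second model over the same layers for the features. The function name `build_models` and the random stand-in data are hypothetical; the layer settings mirror the defaults in the code above.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


def build_models(initial_features=16):
    """Return (autoencoder, encoder); both models share the same layers."""
    inputs = keras.Input(shape=(28, 28, 1))
    x = layers.Conv2D(initial_features, 3, padding="same", activation="selu")(inputs)
    x = layers.MaxPool2D(2)(x)  # 28x28 -> 14x14
    x = layers.Conv2D(initial_features * 2, 3, padding="same", activation="selu")(x)
    x = layers.MaxPool2D(2)(x)  # 14x14 -> 7x7
    x = layers.Conv2D(initial_features * 4, 3, padding="same", activation="selu")(x)
    code = layers.MaxPool2D(2, name="output_layer1")(x)  # 7x7 -> 3x3, 64 channels

    x = layers.Conv2DTranspose(
        initial_features * 2, 3, strides=2, padding="valid", activation="selu"
    )(code)  # 3x3 -> 7x7
    x = layers.Conv2DTranspose(
        initial_features, 3, strides=2, padding="same", activation="selu"
    )(x)  # 7x7 -> 14x14
    recon = layers.Conv2DTranspose(
        1, 3, strides=2, padding="same", activation="sigmoid", name="output_layer2"
    )(x)  # 14x14 -> 28x28

    # Only the reconstruction is an output of the trained model, so a single
    # loss and a single target are enough.
    autoencoder = keras.Model(inputs, recon)
    autoencoder.compile(
        loss="binary_crossentropy",
        optimizer=keras.optimizers.SGD(learning_rate=1.0),
    )
    # The encoder reuses the trained layers and exposes the bottleneck.
    encoder = keras.Model(inputs, code)
    return autoencoder, encoder


# Random stand-in for MNIST images, with an explicit channel axis.
rng = np.random.default_rng(0)
X = rng.random((64, 28, 28, 1)).astype("float32")

autoencoder, encoder = build_models()
autoencoder.fit(X, X, epochs=1, batch_size=32, verbose=0)

# One flat feature vector per image (3 * 3 * 64 values), ready for k-means.
features = encoder.predict(X, verbose=0).reshape(len(X), -1)
print(features.shape)
```

Because the encoder is built from the same layer objects, it uses the weights learned during `autoencoder.fit` without any further training. With the real data, the images would need a channel axis first, for example `X_train[..., np.newaxis]`.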

There is 1 best solution below

**TF_Chinmay** · Answer 1

It is often observed that we tend to mix the ways we split the data in Scikit-learn and TensorFlow.

X_train, X_valid, y_train, y_valid is the output of scikit-learn's train_test_split function.

However, TensorFlow does it differently. If you still want to use this notation, then a combination of both must be used in the right way. A clear method has been given here

As for the errors, I don't have a clear idea of the cause, but clustering on the MNIST dataset can be carried out as follows:

1. Train a tf.keras model for the MNIST dataset from scratch (in a supervised way).
2. Evaluate the baseline model and save it for later use.
3. Fine-tune the model by applying the weight clustering API and check the accuracy (use unsupervised learning here).
4. Fine-tune the model and evaluate the accuracy against the baseline.

The gist is enclosed in this article.
