I want to extract features from a dataset (here MNIST) to feed them into a clustering algorithm like k-means. The feature generation should be unsupervised, so I chose an autoencoder for it. I want the autoencoder to be trained only on the loss of the second output layer 'output_layer2', and after training I want to get the features from the first output layer 'output_layer1'. To get two output layers I chose the functional API in Keras. But I don't know how to extract the features, nor how to train on only one loss. Furthermore, I get error messages like the ones below my code:
import numpy as np
import sys
import sklearn
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Input, Reshape, Conv2D, MaxPool2D, Conv2DTranspose
from tensorflow.keras.optimizers import SGD
assert sys.version_info >= (3, 5)
# Scikit-Learn ≥0.20 is required
assert sklearn.__version__ >= "0.20"
# TensorFlow ≥2.0-preview is required
assert tf.__version__ >= "2.0"
def get_MNISTdata(data_sz=10000):
    (X_train_full, y_train_full), (
        X_test,
        y_test,
    ) = keras.datasets.mnist.load_data()
    X_train_full = X_train_full.astype(np.float32) / 255
    X_test = X_test.astype(np.float32) / 255
    data_tr = X_train_full.shape[0] - data_sz
    X_train, X_valid = X_train_full[:-data_tr], X_train_full[-data_tr:]
    y_train, y_valid = y_train_full[:-data_tr], y_train_full[-data_tr:]
    return X_train, X_valid, y_train, y_valid
def rounded_accuracy(y_true, y_pred):
    return keras.metrics.binary_accuracy(tf.round(y_true), tf.round(y_pred))
def convoluntional_autoencoder(
    X_train,
    X_valid,
    shape1=28,
    shape2=28,
    channels=1,
    initial_features=16,
    kernel_sz=3,
    padding1="SAME",
    padding2="VALID",
    activation1="selu",
    activation2="sigmoid",
    pool_sz=2,
    strides=2,
    loss1="binary_crossentropy",
    loss2="softmax",
    lr=1.0,
    epochs=3,
):
    tf.random.set_seed(42)
    np.random.seed(42)

    # encoder
    input_layer = Input(shape=(shape1, shape2, channels))
    Layer1 = Conv2D(
        initial_features,
        kernel_size=kernel_sz,
        padding=padding1,
        activation=activation1,
    )(input_layer)
    Layer2 = MaxPool2D(pool_size=pool_sz)(Layer1)
    Layer3 = Conv2D(
        initial_features * 2,
        kernel_size=kernel_sz,
        padding=padding1,
        activation=activation1,
    )(Layer2)
    Layer4 = MaxPool2D(pool_size=pool_sz)(Layer3)
    Layer5 = Conv2D(
        initial_features * 4,
        kernel_size=kernel_sz,
        padding=padding1,
        activation=activation1,
    )(Layer4)
    # bottleneck: this is where I want to read out the features later
    output_layer1 = MaxPool2D(pool_size=pool_sz)(Layer5)

    n_poolingLayer = 3

    # decoder
    Layer7 = Conv2DTranspose(
        initial_features * 2,
        kernel_size=kernel_sz,
        strides=strides,
        padding=padding2,
        activation=activation1,
        input_shape=[
            int(shape1 / (2 ** n_poolingLayer)),
            int(shape2 / (2 ** n_poolingLayer)),
            initial_features * 2 ** (n_poolingLayer - 1),
        ],
    )(output_layer1)
    Layer8 = Conv2DTranspose(
        initial_features,
        kernel_size=kernel_sz,
        strides=strides,
        padding=padding1,
        activation=activation1,
    )(Layer7)
    # reconstruction: the only output whose loss should drive the training
    output_layer2 = Conv2DTranspose(
        channels,
        kernel_size=kernel_sz,
        strides=strides,
        padding=padding1,
        activation=activation2,
    )(Layer8)

    conv_ae = Model(inputs=input_layer, outputs=[output_layer1, output_layer2])

    conv_ae.compile(
        loss={"output_layer1": loss1, "output_layer2": loss1},
        loss_weights=[0, 1],
        optimizer=SGD(lr=lr),
        metrics=[rounded_accuracy],
    )

    conv_ae.fit(
        X_train,
        X_train,
        epochs=epochs,
        validation_data=(X_valid, X_valid),
    )

    conv_ae.summary()

    # features = ?
    return  # features
If I do this:
X_train, X_valid, y_train, y_valid = get_MNISTdata(data_sz=10000)
and this:
features = convoluntional_autoencoder(
    X_train,
    X_valid,
    shape1=28,
    shape2=28,
    channels=1,
    initial_features=16,
    kernel_sz=3,
    padding1="SAME",
    padding2="VALID",
    activation1="selu",
    activation2="sigmoid",
    pool_sz=2,
    strides=2,
    loss1="binary_crossentropy",
    loss2="softmax",
    lr=1.0,
    epochs=1,
)
then, at the line conv_ae.fit(...), I get the error:
ValueError: Found unexpected losses or metrics that do not correspond to any Model
output: dict_keys(['output_layer1', 'output_layer2']). Valid mode output names: ['max_pooling2d_5', 'conv2d_transpose_5'].
Received struct is: {'output_layer1': 'binary_crossentropy', 'output_layer2': 'binary_crossentropy'}.
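My guess is that this happens because I never gave the layers explicit names, so the keys in my loss dictionary cannot match the auto-generated names max_pooling2d_5 and conv2d_transpose_5. Presumably I would have to pass name= to the two output layers, roughly like this (untested):

# untested guess: name the output layers so the loss dict keys can refer to them
output_layer1 = MaxPool2D(pool_size=pool_sz, name="output_layer1")(Layer5)
# ... decoder layers unchanged ...
output_layer2 = Conv2DTranspose(
    channels,
    kernel_size=kernel_sz,
    strides=strides,
    padding=padding1,
    activation=activation2,
    name="output_layer2",
)(Layer8)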
If I change the conv_ae.compile call to this:
conv_ae.compile(
    loss=loss1,
    loss_weights=[0, 1],
    optimizer=SGD(lr=lr),
    metrics=[rounded_accuracy],
)
I get, again at the line conv_ae.fit(...), the error:
ValueError: Dimensions must be equal, but are 28 and 3 for '{{node binary_crossentropy/mul}}
= Mul[T=DT_FLOAT](IteratorGetNext:1, binary_crossentropy/Log)' with input shapes: [?,28,28], [?,3,3,64].
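What I think I need in the end, once the naming problem is solved, is roughly this: compile with a loss on 'output_layer2' only, and after fitting build a second Model that shares the trained encoder layers so I can read out the bottleneck features for k-means. An untested sketch of what I have in mind (it assumes the two output layers are named as above):

# untested sketch, assuming the layers are named "output_layer1" / "output_layer2"
conv_ae.compile(
    loss={"output_layer2": loss1},  # only the reconstruction loss is trained
    optimizer=SGD(lr=lr),
    metrics={"output_layer2": [rounded_accuracy]},
)
conv_ae.fit(
    X_train,
    {"output_layer2": X_train},  # a target only for the trained output
    epochs=epochs,
    validation_data=(X_valid, {"output_layer2": X_valid}),
)

# second model that shares the trained layers and ends at the bottleneck
encoder = Model(inputs=input_layer, outputs=output_layer1)
features = encoder.predict(X_valid)  # shape (n_samples, 3, 3, 64)
features = features.reshape(len(features), -1)  # flatten for k-means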
It is common to mix up the conventions for splitting data in scikit-learn and TensorFlow. The ordering
X_train, X_valid, y_train, y_valid
is the output convention of scikit-learn's train_test_split function. TensorFlow does it differently: keras.datasets.mnist.load_data() returns the tuples (X_train, y_train) and (X_test, y_test). If you still want to use the scikit-learn notation, the two conventions must be combined carefully. A clear method has been given here.
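For illustration, here is a minimal sketch of the scikit-learn convention, reusing the imports from the question and holding out 10000 samples (the split size is only an example, not the same proportions as get_MNISTdata):

from sklearn.model_selection import train_test_split

(X_train_full, y_train_full), (X_test, y_test) = keras.datasets.mnist.load_data()
X_train_full = X_train_full.astype(np.float32) / 255

# scikit-learn returns the X splits first, then the y splits
X_train, X_valid, y_train, y_valid = train_test_split(
    X_train_full, y_train_full, test_size=10000, random_state=42
)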
As far as the errors are concerned, I may not have a good idea about them, but clustering on the MNIST dataset can be carried out as follows:
1. Train a tf.keras model on the MNIST dataset from scratch (in a supervised way).
2. Evaluate the baseline model and save it for later use.
3. Fine-tune the model by applying the weight clustering API and check the accuracy (unsupervised learning comes in here; a sketch follows this list).
4. Fine-tune the clustered model and evaluate its accuracy against the baseline. The gist is enclosed in this article.
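A rough sketch of step 3, assuming the TensorFlow Model Optimization toolkit is installed (pip install tensorflow-model-optimization) and that model is the baseline Keras classifier from step 1; the clustering parameters are illustrative, not prescriptive:

import tensorflow_model_optimization as tfmot

cluster_weights = tfmot.clustering.keras.cluster_weights
CentroidInitialization = tfmot.clustering.keras.CentroidInitialization

clustering_params = {
    "number_of_clusters": 16,  # illustrative choice
    "cluster_centroids_init": CentroidInitialization.LINEAR,
}

# wrap the trained baseline model so that its weights are clustered
clustered_model = cluster_weights(model, **clustering_params)

clustered_model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# short fine-tuning pass, then compare the accuracy with the baseline
clustered_model.fit(X_train, y_train, epochs=1, validation_data=(X_valid, y_valid))

# remove the clustering wrappers before saving the final model
final_model = tfmot.clustering.keras.strip_clustering(clustered_model)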