I am trying to implement a fully connected model for classification on the MNIST dataset. Part of the code is the following:
import tensorflow as tf
from tensorflow.keras import layers

n = 5
act_func = 'relu'
classifier = tf.keras.models.Sequential()
classifier.add(layers.Flatten(input_shape=(28, 28, 1)))
for i in range(n):
    classifier.add(layers.Dense(32, activation=act_func))
classifier.add(layers.Dense(10, activation='softmax'))
opt = tf.keras.optimizers.SGD(learning_rate=0.01)
classifier.compile(optimizer=opt, loss="categorical_crossentropy", metrics=["accuracy"])
classifier.summary()
history = classifier.fit(x_train, y_train, batch_size=32, epochs=3, validation_data=(x_test,y_test))
Is there a way to print the maximum gradient for each layer for a given mini-batch?
You could start off with a custom training loop using tf.GradientTape:
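Here is a minimal sketch, not a drop-in solution: it assumes the classifier, x_train, and y_train from your question (with y_train one-hot encoded, to match categorical_crossentropy), and it reports one maximum per trainable variable rather than per layer:

import tensorflow as tf

loss_fn = tf.keras.losses.CategoricalCrossentropy()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

# mini-batches of 32, matching the batch_size passed to fit()
train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(32)

for step, (x_batch, y_batch) in enumerate(train_ds):
    with tf.GradientTape() as tape:
        preds = classifier(x_batch, training=True)
        loss = loss_fn(y_batch, preds)
    # gradients come back in the same order as trainable_variables
    grads = tape.gradient(loss, classifier.trainable_variables)
    for var, grad in zip(classifier.trainable_variables, grads):
        print(f"{var.name}: max |gradient| = {tf.reduce_max(tf.abs(grad)).numpy():.6f}")
    optimizer.apply_gradients(zip(grads, classifier.trainable_variables))
    break  # one mini-batch is enough for illustration

Note that each Dense layer contributes two trainable variables (its kernel and its bias), so you will see two lines per layer; if you want a single number per layer, group the printed values by the var.name prefix.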