Suppose I have a custom layer which computes the loss for me, using external trainable variables, in TF 2.4 (and yes, I know it's a silly example and loss; it is just for reproducibility, the actual loss is much more complex):
import numpy as np
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
from tensorflow.keras.layers import Dense, Layer, Input
from tensorflow.keras import Model
from tensorflow.keras.callbacks import EarlyStopping
import tensorflow as tf
n_col = 10
n_row = 1000
X = np.random.normal(size=(n_row, n_col))
beta = np.arange(10)
y = X @ beta
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
class MyLoss(Layer):
    def __init__(self, var1, var2):
        super(MyLoss, self).__init__()
        # external trainable variables, registered on the layer
        self.var1 = tf.Variable(var1)
        self.var2 = tf.Variable(var2)
    def get_vars(self):
        return self.var1, self.var2
    def custom_loss(self, y_true, y_pred):
        # toy loss: var1-weighted MSE plus a var2 penalty
        return self.var1 ** 2 * tf.math.reduce_mean(tf.math.square(y_true - y_pred)) + self.var2 ** 2
    def call(self, y_true, y_pred):
        # attach the loss to the model and pass the predictions through unchanged
        self.add_loss(self.custom_loss(y_true, y_pred))
        return y_pred
inputs = Input(shape=(X_train.shape[1],))
y_input = Input(shape=(1,))
hidden1 = Dense(10)(inputs)
output = Dense(1)(hidden1)
my_loss = MyLoss(0.5, 0.5)(y_input, output) # var1 and var2 can also be initialized here
model = Model(inputs=[inputs, y_input], outputs=my_loss)
model.compile(optimizer='adam')
Training this model is simple:
history = model.fit([X_train, y_train], None,
                    batch_size=32, epochs=100, validation_split=0.1, verbose=0,
                    callbacks=[EarlyStopping(monitor='val_loss', patience=5)])
And if we write a custom Callback (a sketch follows the plot code below) or train epoch by epoch, we can see how var1 and var2 converge to 0, as expected:
var1_list = []
var2_list = []
for i in range(100):
    if i % 10 == 0:
        print('step %d' % i)
    model.fit([X_train, y_train], None,
              batch_size=32, epochs=1, validation_split=0.1, verbose=0)
    var1, var2 = model.layers[-1].get_vars()
    var1_list.append(var1.numpy())
    var2_list.append(var2.numpy())
plt.plot(var1_list, label='var1')
plt.plot(var2_list, 'r', label='var2')
plt.legend()
plt.show()
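The same bookkeeping can be done with a custom Callback instead of the manual loop. A minimal sketch (the VarsHistory name is just illustrative):

class VarsHistory(tf.keras.callbacks.Callback):
    # record var1 and var2 at the end of every epoch
    def on_train_begin(self, logs=None):
        self.var1_list = []
        self.var2_list = []
    def on_epoch_end(self, epoch, logs=None):
        var1, var2 = self.model.layers[-1].get_vars()
        self.var1_list.append(var1.numpy())
        self.var2_list.append(var2.numpy())

Passing VarsHistory() in the callbacks list of a single model.fit call then yields the same two lists for plotting.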
Short question: how do I make the model stop early (EarlyStopping with some patience) according to the convergence of var1 and var2, i.e. the size (squared norm) of the vector they form, self.var1**2 + self.var2**2? (And again, assume the loss is much more complex, so you cannot just add this quantity to the loss.)
Longer question (if you have the time/patience):
- Is it possible to implement a custom Metric and make EarlyStopping track it?
- In that case, how would you make EarlyStopping focus on "convergence" when all it's got is mode "min" or "max"? (I wonder: could we extend EarlyStopping instead of extending Callback?)
- Can we do this without a metric, with a custom Callback?
- How would we combine this with the custom loss above, telling EarlyStopping to pay attention to both, i.e. "stop if you don't see improvement in the loss AND in convergence for patience=10"?
Well, at least for the "shorter question" this turned out quite simple, following this example from the TF docs: implementing EarlyStopping-style logic with the twist of focusing on the variables' norm.
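Here is a minimal sketch of such a callback, following the "early stopping at minimum loss" pattern from the linked docs; it monitors the norm of (var1, var2) and restores the best weights on stopping (the EarlyStoppingAtVarsConvergence name is just illustrative):

class EarlyStoppingAtVarsConvergence(tf.keras.callbacks.Callback):
    def __init__(self, patience=0):
        super(EarlyStoppingAtVarsConvergence, self).__init__()
        self.patience = patience
        self.best_weights = None
    def on_train_begin(self, logs=None):
        self.wait = 0
        self.stopped_epoch = 0
        self.best_norm = np.inf
    def on_epoch_end(self, epoch, logs=None):
        var1, var2 = self.model.layers[-1].get_vars()
        norm = np.sqrt(var1.numpy() ** 2 + var2.numpy() ** 2)
        if norm < self.best_norm:
            # the norm improved: reset patience and remember the weights
            self.best_norm = norm
            self.wait = 0
            self.best_weights = self.model.get_weights()
        else:
            # no improvement: run out the patience, then stop and restore
            self.wait += 1
            if self.wait >= self.patience:
                self.stopped_epoch = epoch
                self.model.stop_training = True
                self.model.set_weights(self.best_weights)
    def on_train_end(self, logs=None):
        if self.stopped_epoch > 0:
            print('Epoch %05d: early stopping' % (self.stopped_epoch + 1))

Then the model would be run with:

history = model.fit([X_train, y_train], None,
                    batch_size=32, epochs=100, verbose=0,
                    callbacks=[EarlyStoppingAtVarsConvergence(patience=10)])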