I want to compute the Hessian matrix of a Keras model w.r.t. its input in graph mode using `tf.hessians`.
Here is a minimal example:
```python
import tensorflow as tf
from tensorflow import keras

model = keras.Sequential([
    keras.Input((10,)),
    keras.layers.Dense(1)
])
model.summary()

@tf.function
def get_grads(inputs):
    loss = tf.reduce_sum(model(inputs))
    return tf.gradients(loss, inputs)

@tf.function
def get_hessian(inputs):
    loss = tf.reduce_sum(model(inputs))
    return tf.hessians(loss, inputs)

batch_size = 3
test_input = tf.random.uniform((batch_size, 10))

out = model(test_input)            # works fine
grads = get_grads(test_input)      # works fine
hessian = get_hessian(test_input)  # raises ValueError: None values not supported.
```
While the forward pass and the `get_grads` function work fine, the `get_hessian` function raises `ValueError: None values not supported`.
In the following example, by contrast, `tf.hessians` yields the expected result without error:

```python
@tf.function
def get_hessian_(inputs):
    loss = tf.reduce_sum(inputs**2)
    return tf.hessians(loss, inputs)

get_hessian_(tf.random.uniform((3,)))[0]
# <tf.Tensor: shape=(3, 3), dtype=float32, numpy=
# array([[2., 0., 0.],
#        [0., 2., 0.],
#        [0., 0., 2.]], dtype=float32)>
```
In your code example, you are trying to get the Hessian of f(x) (the model output) w.r.t. x (the inputs), and f is linear (the model is a single `Dense` layer with no activation). The Hessian of f(x) w.r.t. x should therefore be a zero tensor, but `tf.hessians` can't handle that properly: the second-order gradients come back as `None`, which causes the error. Adding an additional layer with a non-linear activation will eliminate the error. Code examples:
You can use either `tf.hessians` or `tf.GradientTape()` to get the Hessian. In case you want to get a zero tensor for the linear model instead of an error, you can pass `unconnected_gradients=tf.UnconnectedGradients.ZERO` to the tape's gradient/jacobian call.
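Using `tf.hessians` to get the Hessian: a minimal sketch, assuming a model with an added non-linear hidden layer (the `tanh` layer below is an illustrative choice, not from the question). With the input of shape `(3, 10)`, `tf.hessians` returns a tensor of shape `(3, 10, 3, 10)`.

```python
import tensorflow as tf
from tensorflow import keras

# A model with a non-linear activation, so the second-order gradients
# exist in the graph and tf.hessians no longer produces None.
model = keras.Sequential([
    keras.Input((10,)),
    keras.layers.Dense(8, activation="tanh"),  # assumed non-linear layer
    keras.layers.Dense(1),
])

@tf.function
def get_hessian(inputs):
    loss = tf.reduce_sum(model(inputs))
    return tf.hessians(loss, inputs)

batch_size = 3
test_input = tf.random.uniform((batch_size, 10))

# tf.hessians returns a list with one Hessian per input tensor;
# its shape is inputs.shape + inputs.shape.
hessian = get_hessian(test_input)[0]
print(hessian.shape)  # (3, 10, 3, 10)
```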
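Using `tf.GradientTape()` to get the Hessian: a sketch with nested tapes on the original linear model. Since the gradient of a linear model does not depend on the input, the outer tape sees an unconnected target; `unconnected_gradients=tf.UnconnectedGradients.ZERO` makes `jacobian` return the mathematically correct zero tensor instead of `None`.

```python
import tensorflow as tf
from tensorflow import keras

# The original linear model: its Hessian w.r.t. the input is zero.
model = keras.Sequential([
    keras.Input((10,)),
    keras.layers.Dense(1),
])

@tf.function
def get_hessian(inputs):
    with tf.GradientTape() as outer_tape:
        outer_tape.watch(inputs)
        with tf.GradientTape() as inner_tape:
            inner_tape.watch(inputs)
            loss = tf.reduce_sum(model(inputs))
        grads = inner_tape.gradient(loss, inputs)
    # The first-order gradients of a linear model are constant, so they
    # are unconnected from the inputs; ZERO yields zeros instead of None.
    return outer_tape.jacobian(
        grads, inputs, unconnected_gradients=tf.UnconnectedGradients.ZERO)

batch_size = 3
test_input = tf.random.uniform((batch_size, 10))

hessian = get_hessian(test_input)
print(hessian.shape)  # (3, 10, 3, 10) -- an all-zero tensor
```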