I'm trying to modify the DeepDream code from the TensorFlow docs here: https://www.tensorflow.org/tutorials/generative/deepdream

Specifically, I want to use a "guide image" to produce the dream features. This was originally shown in Caffe in this notebook (at the bottom): https://github.com/google/deepdream/blob/master/dream.ipynb
In their example, they used an image of flowers to produce flower-like features on top of an image of clouds. To do this, they provide an alternative objective function. From the Caffe notebook:

> Instead of maximizing the L2-norm of current image activations, we try to maximize the dot-products between activations of current image, and their best matching correspondences from the guide image.
In Caffe, it looks like this:
```python
end = 'inception_3b/output'
h, w = guide.shape[:2]
src, dst = net.blobs['data'], net.blobs[end]
src.reshape(1,3,h,w)
src.data[0] = preprocess(net, guide)
net.forward(end=end)
guide_features = dst.data[0].copy()

def objective_guide(dst):
    x = dst.data[0].copy()
    y = guide_features
    ch = x.shape[0]
    x = x.reshape(ch,-1)
    y = y.reshape(ch,-1)
    A = x.T.dot(y)  # compute the matrix of dot-products with guide features
    dst.diff[0].reshape(ch,-1)[:] = y[:,A.argmax(1)]  # select ones that match best
```
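To make sure I understand the matching step, here is how I read it in plain NumPy (the toy shapes are my own, just for illustration):

```python
import numpy as np

# Toy shapes: ch channels, Nx spatial positions in the current image's
# activations, Ny spatial positions in the guide's activations.
ch, Nx, Ny = 4, 6, 5
x = np.random.rand(ch, Nx)   # current-image activations, flattened per channel
y = np.random.rand(ch, Ny)   # guide activations, flattened per channel

A = x.T.dot(y)               # (Nx, Ny): dot-product of every position pair
best = y[:, A.argmax(1)]     # for each position in x, its best-matching guide vector

assert A.shape == (Nx, Ny)
assert best.shape == (ch, Nx)
```

So each spatial position of the current image gets paired with whichever guide position it already correlates with most, and that guide vector is used as the gradient for the layer.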
I translated this to TensorFlow like so:
```python
def get_activations(img, model):
    # Pass the image forward through the model to retrieve the activations.
    # Converts the image into a batch of size 1.
    img_batch = tf.expand_dims(img, axis=0)
    layer_activations = model(img_batch)
    if len(layer_activations) == 1:
        layer_activations = [layer_activations]
    return layer_activations

guide_activations = get_activations(img, model)  # img is the guide image here

def maximize_to_guide(img, model):
    layer_activations = get_activations(img, model)

    losses = []
    for guide_activation in guide_activations:
        for layer_activation in layer_activations:
            ch = layer_activation.shape[-1]
            layer_activation = tf.reshape(layer_activation, (ch, -1))
            guide_activation = tf.reshape(guide_activation, (ch, -1))
            dot = tf.matmul(tf.transpose(layer_activation), guide_activation)
            max_act_idx = tf.math.argmax(dot, axis=1)
            max_act = tf.gather(guide_activation, max_act_idx, axis=1)
            loss = tf.math.reduce_mean(max_act)
            losses.append(loss)

    return tf.reduce_sum(losses)
```
However, `tape.gradient(loss, img)` returns `None`. I thought that was because `argmax` is not differentiable. However, if I gather from `layer_activation` instead -- `tf.gather(layer_activation, max_act_idx, axis=1)` -- then it does produce a gradient (though not the desired image). So the tape is clearly able to step back from the returned loss value to the input image, but only in this second case. What's going on here?
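I can also reproduce the `None` gradient in isolation: if the loss only gathers from a tensor the tape isn't tracking (here a constant), even with indices derived from the watched variable, the gradient comes back as `None`. This is a minimal sketch with names of my own choosing:

```python
import tensorflow as tf

img = tf.Variable([1.0, 3.0, 2.0])       # watched by the tape automatically
guide = tf.constant([10.0, 20.0, 30.0])  # plays the role of guide_activation

with tf.GradientTape() as tape:
    idx = tf.math.argmax(img)                     # integer-valued op
    loss = tf.reduce_mean(tf.gather(guide, idx))  # value comes only from guide

print(tape.gradient(loss, img))  # None
```

This looks like the same shape as my `maximize_to_guide` loss, where `max_act` is gathered from `guide_activation` rather than from `layer_activation`.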