TensorFlow GradientTape 'unknown value for unconnected gradients'

I'm trying to understand why I'm getting an error when using GradientTape to take the derivative of a function. I want to take the derivative of Power with respect to T, defined as:

    import tensorflow as tf
    import numpy as np
    from scipy.fft import fft, fftfreq, fftn
    import tensorflow.python.ops.numpy_ops.np_config as np_config
    np_config.enable_numpy_behavior()

    #####Initialize Values######

    s1 = np.array([[0,1,0],
                   [1,0,1],
                   [0,1,0]])

    s2 = np.array([[0,-1j,0],
                   [1j,0,-1j],
                   [0,1j,0]])

    s3 = np.array([[1,0,0],
                  [0,0,0],
                  [0,0,-1]])

    spin1 = (1/np.sqrt(2))*s1
    spin2 = (1/np.sqrt(2))*s2
    spin3 = (1/np.sqrt(2))*s3

    spin1 = tf.constant(spin1)
    spin2 = tf.constant(spin2)
    spin3 = tf.constant(spin3)

    a = tf.constant(1.0)
    b = tf.constant(1.0)
    c = tf.constant(1.0)
    d = tf.constant(1.0)

    v = tf.constant(1.0)     # ~N(0,sigma_v)
    w = tf.constant(1.0)     # ~N(0,sigma_w)

    c0_0 = tf.complex(tf.constant(1.0), tf.constant(0.0))
    c1_0 = tf.complex(tf.constant(1.0), tf.constant(0.0))

    ###### Define Functions########

    def getDE(T):
        D = a*T+b+v
        E = c*T+d+w
        return D,E

    def H(D,E):
        return D*(spin3**2 - 2/3) + E*(spin1**2-spin2**2)

    def psi(t,eigenvalues,eigenvec1, eigenvec2):
        c_0 = np.array(np.exp(-1j*(eigenvalues[0])*t)*c0_0)
        c_0.shape = (N,1)
        c_1 = np.array(np.exp(-1j*(eigenvalues[1])*t)*c1_0)
        c_1.shape = (N,1)
        return c_0*(eigenvec1.T)+c_1*(eigenvec2.T)

    def forward(T):
        T = tf.Variable(T)
        with tf.GradientTape() as tape:
            D,E = getDE(T)
            H_tf = H(D,E)
            eigenvalues, eigenstates = tf.linalg.eig(H_tf)
            eigenvec1 = eigenstates[:,0]
            eigenvec2 = eigenstates[:,1]
            wave = psi(t,eigenvalues,eigenvec1, eigenvec2)
            a = np.abs(tf.signal.fft2d(wave))**2
            Power = np.full([100,1], None)
            for i in range(N):
                Power[i,:] = a[i,:].conj().T@a[i,:]
        
        return tape.gradient(Power,T)

Could someone tell me whether I'm doing this correctly, or whether there is a better way to do it? I am not very familiar with automatic differentiation in Python.

In the forward function, taking the derivative of wave with respect to T seems to work, but as soon as I do the FFT I get the following error:

    WARNING:tensorflow:The dtype of the target tensor must be floating (e.g. tf.float32) when calling GradientTape.gradient, got dtype('O')

    AttributeError                            Traceback (most recent call last)
    ~\AppData\Local\Temp/ipykernel_352/3452884380.py in <module>
    ----> 1 T_hat = forward(17.0)
          2 print(T_hat)

    ~\AppData\Local\Temp/ipykernel_352/2053063608.py in forward(T)
         13             Power[i,:] = a[i,:].conj().T@a[i,:]
         14 
    ---> 15     return tape.gradient(Power,T)

    ~\anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\backprop.py in gradient(self, target, sources, output_gradients, unconnected_gradients)
       1072                           for x in nest.flatten(output_gradients)]
       1073 
    -> 1074     flat_grad = imperative_grad.imperative_grad(
       1075         self._tape,
       1076         flat_targets,

    ~\anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\imperative_grad.py in imperative_grad(tape, target, sources, output_gradients, sources_raw, unconnected_gradients)
         69         "Unknown value for unconnected_gradients: %r" % unconnected_gradients)
         70 
    ---> 71   return pywrap_tfe.TFE_Py_TapeGradient(
         72       tape._tape,  # pylint: disable=protected-access
         73       target,

    AttributeError: 'numpy.ndarray' object has no attribute '_id'
1 Answer

I hope you have already found an answer to your question, but if you haven't, maybe this will shed some light.

The problem you are seeing is that TensorFlow can't calculate the gradient of the overall forward function: the tape only records TensorFlow operations, so once a value is converted to a NumPy array the connection back to T is lost. I would recommend not using NumPy methods inside the tape.
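As a minimal sketch of what goes wrong (the names here are just for illustration), notice how a value converted to NumPy is no longer tracked by the tape:

    import numpy as np
    import tensorflow as tf

    x = tf.Variable(2.0)
    with tf.GradientTape() as tape:
        y_tf = x * x              # a tf.Tensor: still recorded on the tape
        y_np = np.array(x * x)    # a plain numpy.ndarray: no longer on the tape

    print(tape.gradient(y_tf, x))  # tf.Tensor(4.0, shape=(), dtype=float32)
    # tape.gradient(y_np, x) would fail with an error much like the one above,
    # because a numpy.ndarray carries no connection back to the tape.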

As far as I can see, you can replace all of those NumPy methods with their TensorFlow equivalents.

For example:

To calculate the magnitude of a complex tensor:

    magnitude = tf.math.abs(complex_tensor)

To use the complex exponential:

    complex_tensor = tf.math.exp(tf.complex(0.0, -1.0)*tf.cast(phase, "complex64"))

To extract elements along a given dimension:

    elm1, elm2 = tf.unstack(x, num=2, axis=-1)

To calculate the conjugate of a complex tensor:

    a_conj = tf.math.conj(a)

To transpose or permute the tensor dimensions:

    x_T = tf.transpose(x, perm=[1, 0])
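
To take the conjugate transpose in one step (tf.transpose has a conjugate argument, which covers the .conj().T pattern in the question):

    a_H = tf.transpose(a, conjugate=True)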

To summarize, stop using NumPy methods inside the GradientTape and use the TensorFlow alternatives; that should solve your problem.
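
Putting those pieces together, here is a minimal sketch of what a fully-TensorFlow forward() could look like, so the tape stays connected all the way from T to Power. The values of N, t, and the renamed constants (a_, b_, c_, d_, v, w) are assumptions standing in for values the question doesn't show, and the gradient of tf.linalg.eig is only defined for distinct eigenvalues and may depend on your TensorFlow version, so treat this as a starting point rather than a drop-in replacement:

    import numpy as np
    import tensorflow as tf

    N = 100                                            # assumed number of time samples
    t = tf.cast(tf.linspace(0.0, 1.0, N), tf.float64)  # assumed time grid

    s1 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=np.complex128)
    s2 = np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=np.complex128)
    s3 = np.array([[1, 0, 0], [0, 0, 0], [0, 0, -1]], dtype=np.complex128)
    spin1 = tf.constant(s1 / np.sqrt(2))
    spin2 = tf.constant(s2 / np.sqrt(2))
    spin3 = tf.constant(s3 / np.sqrt(2))

    # renamed so they are not shadowed by locals inside forward()
    a_, b_, c_, d_ = [tf.constant(1.0, tf.float64)] * 4
    v = tf.constant(1.0, tf.float64)
    w = tf.constant(1.0, tf.float64)

    def forward(T):
        T = tf.Variable(T, dtype=tf.float64)
        with tf.GradientTape() as tape:
            D = a_ * T + b_ + v
            E = c_ * T + d_ + w
            Dc = tf.complex(D, tf.zeros_like(D))
            Ec = tf.complex(E, tf.zeros_like(E))
            H = (Dc * (spin3**2 - tf.cast(2.0 / 3.0, tf.complex128))
                 + Ec * (spin1**2 - spin2**2))

            # the gradient of tf.linalg.eig assumes distinct eigenvalues
            eigenvalues, eigenstates = tf.linalg.eig(H)
            vec1 = eigenstates[:, 0]
            vec2 = eigenstates[:, 1]

            # psi(t) built from TF ops: tf.exp handles complex tensors
            tc = tf.complex(t, tf.zeros_like(t))                     # (N,)
            minus_i = tf.complex(tf.constant(0.0, tf.float64),
                                 tf.constant(-1.0, tf.float64))
            c0 = tf.exp(minus_i * eigenvalues[0] * tc)[:, None]      # (N, 1)
            c1 = tf.exp(minus_i * eigenvalues[1] * tc)[:, None]
            wave = c0 * vec1[None, :] + c1 * vec2[None, :]           # (N, 3)

            spec = tf.math.abs(tf.signal.fft2d(wave)) ** 2           # real-valued, (N, 3)
            Power = tf.reduce_sum(spec * spec, axis=1)               # row-wise a[i].conj().T @ a[i]

        return tape.gradient(Power, T)

    print(forward(17.0))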