Implementing Aggregator Neural Networks with TensorFlow 2.0


I am trying to build a neural network model in TensorFlow with the following specifications:

The model accepts two inputs: X, a list of n 3-dimensional vectors, and Y, a list of n integer class labels in non-decreasing order, starting from 0. It produces an output Z consisting of m 3-dimensional vectors.

Y contains m unique numbers, each representing a class of 3-dimensional input vectors. The number of input vectors per class may vary.

The model's architecture consists of three layers. The first layer transforms each vector in X into a 2-dimensional vector and applies the 'gelu' activation function. The second layer performs a 'segment_sum' operation to condense the n 2-dimensional vectors into m 2-dimensional vectors, using Y as the guide. The third layer then transforms the m 2-dimensional vectors into the desired m 3-dimensional output vectors.
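To make the second layer concrete, here is a minimal pure-Python sketch of what a segment sum computes, assuming the segment IDs are sorted and start at 0 as described above. The helper name `segment_sum` and the sample rows in `H` are illustrative only; the function mimics the behaviour of `tf.math.segment_sum` but is not its implementation.

```python
def segment_sum(rows, segment_ids):
    """Sum the rows that share a segment id; ids must be sorted and start at 0."""
    if len(rows) != len(segment_ids):
        raise ValueError("rows and segment_ids must have the same length")
    num_segments = segment_ids[-1] + 1 if segment_ids else 0
    width = len(rows[0]) if rows else 0
    out = [[0.0] * width for _ in range(num_segments)]
    for row, seg in zip(rows, segment_ids):
        for j, value in enumerate(row):
            out[seg][j] += value
    return out


# n = 10 two-dimensional vectors (stand-ins for the first layer's output)
H = [[1.0, 2.0] for _ in range(10)]
Y = [0, 0, 1, 1, 2, 2, 2, 3, 3, 3]

S = segment_sum(H, Y)
print(S)  # [[2.0, 4.0], [2.0, 4.0], [3.0, 6.0], [3.0, 6.0]]
```

The n = 10 rows collapse into m = 4 rows, one per class, which is why the output of this layer no longer has the same leading dimension as X.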

I train the model with a cosine-similarity loss (Keras's CosineSimilarity) and the Adam optimizer.
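For reference, Keras's `CosineSimilarity` loss returns the negative cosine similarity, so minimizing it pulls each predicted vector toward the direction of its target. The sketch below shows that quantity for a single pair of vectors in plain Python; the function name `cosine_similarity_loss` is hypothetical, and the zero-vector handling is a simplifying assumption.

```python
import math


def cosine_similarity_loss(y_true, y_pred):
    """Negative cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(y_true, y_pred))
    norm = math.sqrt(sum(a * a for a in y_true)) * math.sqrt(sum(b * b for b in y_pred))
    if norm == 0.0:
        return 0.0  # assumption: treat a zero vector as having no similarity
    return -dot / norm


same_direction = cosine_similarity_loss([1.0, 0.0, 0.0], [2.0, 0.0, 0.0])
orthogonal = cosine_similarity_loss([1.0, 0.0, 0.0], [0.0, 3.0, 0.0])
opposite = cosine_similarity_loss([1.0, 0.0, 0.0], [-2.0, 0.0, 0.0])
```

With this sign convention the loss is lowest (-1) for vectors pointing the same way and highest (+1) for vectors pointing in opposite directions.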

Below is the code I've developed for this purpose:

import numpy as np
import tensorflow as tf
from tensorflow import keras

# Prepare the input and output data (example)
n = 10
m = 4
X = np.random.random((n, 3)).astype('float32')
Y = np.array([0, 0, 1, 1, 2, 2, 2, 3, 3, 3]).astype('int32')
Z = np.random.random((m, 3)).astype('float32')


class CustomModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(2, activation='gelu')
        self.dense2 = keras.layers.Dense(3)

    def call(self, inputs):
        X, Y = inputs
        X = self.dense1(X)
        X = tf.math.segment_sum(X, Y)
        Z = self.dense2(X)
        return Z


model = CustomModel()

model.compile(loss=tf.keras.losses.CosineSimilarity(axis=1), optimizer=tf.keras.optimizers.Adam())

model.fit([X, Y], Z, epochs=10)

Motivated by the Deep Sets paper, the model is designed to learn an aggregation function.

Let f be a permutation-invariant aggregation function over a set of vectors A. Then, the function can be written as: f(A) = \rho(\sum_{a \in A} \phi(a))

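So, learning \rho and \phi amounts to learning f, which is what the model above is meant to do: the first dense layer plays the role of \phi, the segment sum plays the role of the summation, and the last dense layer plays the role of \rho.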
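The decomposition can be checked on a toy example. In the sketch below, `phi` and `rho` are arbitrary hand-written functions chosen only for illustration (they are not learned and are not taken from the paper); the point is that f gives the same result for every ordering of the set. Integer-valued inputs are used so the comparison is exact; with general floating-point values the results would agree only up to rounding.

```python
import itertools


def phi(a):
    """Hypothetical per-element map from a 3-d vector to a 2-d vector."""
    x, y, z = a
    return (x + y, y * z)


def rho(s):
    """Hypothetical map from the pooled 2-d vector to a 3-d output."""
    u, v = s
    return (u, v, u - v)


def f(A):
    """f(A) = rho(sum of phi(a) over a in A)."""
    total = [0.0, 0.0]
    for a in A:
        p = phi(a)
        total[0] += p[0]
        total[1] += p[1]
    return rho(total)


A = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)]
results = {f(list(perm)) for perm in itertools.permutations(A)}
print(results)  # {(27.0, 108.0, -81.0)}
```

All six orderings of A produce the same output, because the sum in the middle discards the order of the elements.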

However, I get the following error:

Traceback (most recent call last):
  File "/home/nitesh/PycharmProjects1/pythonProject/research/reasoning_with_vectors/custom_model.py", line 31, in <module>
    model.fit([X, Y], Z, epochs=10)
  File "/home/nitesh/miniconda3/envs/relbert/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/nitesh/miniconda3/envs/relbert/lib/python3.10/site-packages/keras/engine/data_adapter.py", line 1852, in _check_data_cardinality
    raise ValueError(msg)
ValueError: Data cardinality is ambiguous:
  x sizes: 10, 10
  y sizes: 4
Make sure all arrays contain the same number of samples.
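The sizes in the message are the leading dimensions of the arrays passed to fit: X and Y each have n = 10 rows, while Z has m = 4. The sketch below reproduces that comparison in plain Python. It is a simplified stand-in written for illustration, not Keras's actual `_check_data_cardinality` code, and the name `check_sample_counts` is hypothetical.

```python
def check_sample_counts(x_arrays, y_arrays):
    """Raise if the arrays do not all share the same leading dimension."""
    x_sizes = [len(a) for a in x_arrays]
    y_sizes = [len(a) for a in y_arrays]
    if len(set(x_sizes + y_sizes)) > 1:
        raise ValueError(f"x sizes: {x_sizes}, y sizes: {y_sizes}")


n, m = 10, 4
X = [[0.0, 0.0, 0.0] for _ in range(n)]  # n input vectors
Y = [0, 0, 1, 1, 2, 2, 2, 3, 3, 3]       # n class labels
Z = [[0.0, 0.0, 0.0] for _ in range(m)]  # m target vectors

try:
    check_sample_counts([X, Y], [Z])
    message = None
except ValueError as err:
    message = str(err)

print(message)  # x sizes: [10, 10], y sizes: [4]
```

In other words, fit treats the first axis of every array as the sample axis, whereas in this model the inputs have n rows and the targets have m rows.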

I have tried many approaches but have not found a way to express this model in TensorFlow 2.0. I also asked GPT-4 and Bard, without getting a satisfactory answer.
