Understanding the structure of TensorFlow datasets


I am trying to experiment with tensorflow_datasets by importing data into TensorFlow 2.0 from Postgres. I have a table aml2 with around 50 records. The table has three columns: v1, v2, and class. I want the class column to be the label and v1 and v2 to be the features. v1 and v2 are already normalized float values and class is an integer (0 or 1).

import tensorflow as tf
import tensorflow_io as tfio
from tensorflow.keras import layers

## Initial connections are defined for Postgres ##
## I am able to get data. That isn't a problem ##
dataset = tfio.experimental.IODataset.from_sql(
    query="SELECT v1, v2, class FROM aml2;",
    endpoint=endpoint)

print(dataset.element_spec)

# Here I am trying to separate out features and labels
dataset = dataset.map(lambda item: ((item['v1'], item['v2']), item['class']))
dataset = dataset.prefetch(1)


model = tf.keras.models.Sequential([
    layers.Flatten(),
    layers.Dense(20, activation='relu'),
    layers.Dense(1, activation='sigmoid')

])
model.compile('adam','binary_crossentropy',['accuracy'])

model.fit(dataset, epochs=10)

ValueError: Layer sequential expects 1 inputs, but it received 2 input tensors.
Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=() dtype=float32>,
                  <tf.Tensor 'IteratorGetNext:1' shape=() dtype=float32>]

My question is: how do I resolve this? And more generally, how should I understand the shape of tensorflow_datasets or IODatasets? Don't I need separate training data and labels? TensorFlow datasets are a little confusing. Can somebody please explain?

There is 1 answer below


It would be easier not to separate your inputs and labels. Instead, you can name your Input layers and use the tf.keras functional API.

Changes:

  • Got rid of the map call that splits the FEATURES from the LABEL
  • Changed the code to use the functional API instead of the sequential API in tf.keras
  • Named all inputs and outputs
  • model.fit also works on a dictionary structure whose keys match the input and output names. See the documentation for the x and y arguments of tf.keras.Model.fit, and the sketch after this list.
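
For reference, here is roughly what the elements of the SQL dataset look like. Each element is a dictionary of scalar tensors keyed by column name, which is why naming the Input layers v1 and v2 and the output layer class lets Keras match things up by key. The dtypes shown below are illustrative; the real ones depend on the Postgres column types.

print(dataset.element_spec)
# Roughly:
# {'v1': TensorSpec(shape=(), dtype=tf.float32, name=None),
#  'v2': TensorSpec(shape=(), dtype=tf.float32, name=None),
#  'class': TensorSpec(shape=(), dtype=tf.int32, name=None)}
# After .batch(BATCH_SIZE), every entry gains a leading batch dimension.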

Here are the changes to make.

from tensorflow.keras import Model, layers

FEATURES = ['v1', 'v2']   # feature column names
LABEL = 'class'           # label column name
BATCH_SIZE = 32           # pick whatever batch size suits you

dataset = tfio.experimental.IODataset.from_sql(
    query="SELECT v1, v2, class FROM aml2;",
    endpoint=endpoint)

print(dataset.element_spec)

dataset = dataset.batch(BATCH_SIZE).prefetch(1)

# One named Input layer per feature column
input_layers = []
for col in FEATURES:
    input_ = layers.Input(shape=(1,), name=col)
    input_layers.append(input_)

x = layers.Concatenate()(input_layers)
x = layers.Flatten()(x)
x = layers.Dense(20, activation='relu')(x)
output = layers.Dense(1, activation='sigmoid', name=LABEL)(x)

model = Model(input_layers, output)
model.compile('adam', 'binary_crossentropy', ['accuracy'])

model.fit(dataset, epochs=10)
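
Because the inputs are named, inference can also be driven by a dictionary keyed on the same column names. A minimal usage sketch (the feature values here are made up):

import numpy as np

# Predict on a single made-up row; keys must match the Input layer names.
preds = model.predict({
    'v1': np.array([[0.37]], dtype='float32'),
    'v2': np.array([[0.85]], dtype='float32'),
})
print(preds)  # probability of class 1, shape (1, 1)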