I build my dataset (ds) using this function:
import tensorflow as tf

def build_dataset(path_list, index_variables_retenus, header=True, field_delim=','):
    def csv_loader(path):
        return tf.data.experimental.CsvDataset(
            path,
            record_defaults=record_defaults,
            header=header,
            field_delim=field_delim,
            select_cols=variables_a_extraire
        )

    # add the ID column so that samples mixing 2 catchments (bvs) can be filtered out
    variables_a_extraire = [0] + index_variables_retenus
    record_defaults = [tf.string] + [tf.float32] * (len(variables_a_extraire) - 1)
    # create a tensor list
    tf_list = tf.data.Dataset.list_files(path_list, shuffle=True)
    return tf_list.interleave(csv_loader, cycle_length=1)
"record_defaults" is defined with tf.string for the first column because I add an ID to my data for later processing. As a next step, I need to run this line:
ds = ds.map(lambda *items: tf.stack(items))
and I get this error:
TypeError: Tensors in list passed to 'values' of 'Pack' Op have types [string, float32, float32, float32, float32] that don't all match.
ValueError: values_1: Tensor conversion requested dtype string for Tensor with dtype float32: <tf.Tensor 'args_1:0' shape=() dtype=float32>
My understanding of the error is that the column defined as tf.string is the problem: if I remove it, everything works fine.
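To make this easier to reproduce without my CSV files, here is a minimal sketch, assuming TensorFlow 2.x. The in-memory values and the names `ds_demo`, `dtypes` and `floats_only` are made up for illustration and are not part of my real pipeline. It only shows that the elements have mixed dtypes and that stacking works once the string column is left out:

```python
import tensorflow as tf

# Hypothetical in-memory stand-in for the CSV columns: one string ID + two floats.
ids = tf.constant(["id_a", "id_b"])   # placeholder IDs, not my real values
x1 = tf.constant([1.0, 3.0])          # float32 by default
x2 = tf.constant([2.0, 4.0])

ds_demo = tf.data.Dataset.from_tensor_slices((ids, x1, x2))

# Each element is a tuple with mixed dtypes: (string, float32, float32).
dtypes = [spec.dtype for spec in ds_demo.element_spec]
print(dtypes)

# Leaving the string ID out and stacking only the float columns works.
floats_only = ds_demo.map(lambda id_, *feats: tf.stack(feats))
for row in floats_only:
    print(row.numpy())
```

Here the lambda simply drops the first column, which mirrors what I did when I removed the tf.string entry from `record_defaults`.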
I have been searching the web but can't find why it doesn't work. The values of this column in the original CSV look like camelsaus_102101A...
Thanks to all for your help.