I am currently working on an NLP project that involves word embeddings. I am using a pre-trained BERT model from TensorFlow Hub, but when I feed my whole dataset into the BERT layer at once, the kernel dies. So I decided to feed the data in small chunks, store the 'pooled_output' of each chunk in a CSV file, and reuse the stored vectors later.
pooled_output, sequence_output = bert_layer([input_ids, input_masks, segment_ids])
instances[i]['vector'] = pooled_output.numpy()
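For context, here is a rough sketch of the chunked loop I use to build and save the vectors (the batch size and variable names are just illustrative; my real loop stores the batch outputs into instances as shown above):

import numpy as np
import pandas as pd

batch_size = 32
vectors = []
for start in range(0, len(input_ids), batch_size):
    end = start + batch_size
    # Feed only a small slice of the tokenised data at a time
    pooled_output, sequence_output = bert_layer(
        [input_ids[start:end], input_masks[start:end], segment_ids[start:end]])
    vectors.append(pooled_output.numpy())          # each chunk has shape (batch, 768)

all_vectors = np.concatenate(vectors, axis=0)      # shape (num_samples, 768)
pd.DataFrame({'vector': list(all_vectors)}).to_csv('bert_data.csv', index=False)

Then I read the file back like this: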
array_df = pd.read_csv('bert_data.csv')
The output when I read that CSV back looks like this:

                                              vector
0  [array([[-7.61249065e-01, -2.19767481e-01, -2....
1  [array([[-0.72692966, -0.49777552, -0.89455396...
2  [array([[-0.7832544 , -0.46121675, -0.85719764...
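If I check one of those cells after pd.read_csv, it is a plain string rather than a NumPy array (I am fairly sure this is because the arrays were written to the CSV as text):

print(type(array_df['vector'][0]))   # <class 'str'> -- the saved array comes back as text, not numbers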
The problem is: how do I use these vectors from the CSV file as input to my model? I have put my model below. Thank you.
import tensorflow as tf
from tensorflow.keras import regularizers

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(768, activation='relu',
                                kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4),
                                bias_regularizer=regularizers.l2(1e-4),
                                activity_regularizer=regularizers.l2(1e-5)))

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(X, y_train, epochs=30)