I am trying to build a network with this architecture:
Convolution Layer
Max Pooling Layer
Convolution Layer
Max Pooling Layer
Convolution Layer
Max Pooling Layer
BLSTM Layer 1
BLSTM Layer 2
Fully Connected Layer
CTC Layer
My code structure is like this:
import sys
import tensorflow as tf

thisgraph = tf.Graph()
with thisgraph.as_default():
    x = tf.placeholder(tf.float32, [None, 1, None, nb_features])
    y = tf.sparse_placeholder(tf.int32)
    seq_len = tf.placeholder(tf.int32, [None])
    # 1st CNN layer followed by max-pooling layer
    # 2nd CNN layer followed by max-pooling layer
    # 3rd CNN layer followed by max-pooling layer
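    # --- Illustrative sketch only: the actual conv/pool layers are omitted above. ---
    # One such block could look like the following; the filter size, channel count,
    # and scope name "conv_block_example" are assumptions, not my real values.
    # The omitted code ends by reshaping the last pooling output into `conv_reshape`
    # (shape [batch, time, features]) before it feeds the BLSTM below.
    with tf.variable_scope("conv_block_example"):
        w_c = tf.get_variable("w", [1, 3, nb_features, 64])   # 1x3 filters along the time axis
        b_c = tf.get_variable("b", [64])
        conv = tf.nn.relu(tf.nn.conv2d(x, w_c, strides=[1, 1, 1, 1], padding="SAME") + b_c)
        pool = tf.nn.max_pool(conv, ksize=[1, 1, 2, 1], strides=[1, 1, 2, 1], padding="SAME")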
    # Now two bidirectional LSTM layers
with tf.variable_scope("cell_def_1"):
f_cell=tf.nn.rnn_cell.LSTMCell(nb_hidden,state_is_tuple=True)
b_cell=tf.nn.rnn_cell.LSTMCell(nb_hidden,state_is_tuple=True)
with tf.variable_scope("cell_op_1") as opscope:
outputs,_=tf.nn.bidirectional_dynamic_rnn(f_cell,b_cell,conv_reshape,sequence_length=seq_len,dtype=tf.float32)
merge=tf.add(outputs[0],outputs[1])
with tf.variable_scope("cell_def_2"):
f1_cell=tf.nn.rnn_cell.LSTMCell(nb_hidden,state_is_tuple=True)
b1_cell=tf.nn.rnn_cell.LSTMCell(nb_hidden,state_is_tuple=True)
with tf.variable_scope("cell_op_2") as opscope:
outputs2,_=tf.nn.bidirectional_dynamic_rnn(f1_cell,b1_cell,merge,sequence_length=seq_len,dtype=tf.float32)
    # Finally a dense layer
    # A time-distributed dense layer
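    # --- Illustrative sketch only: my real dense layers are omitted above. ---
    # This shows one plausible way to produce the `logits_reshape` tensor used by
    # the CTC loss below: merge the forward/backward outputs, apply a shared dense
    # projection at every time step, and transpose to time-major. `nb_classes`
    # (label-set size plus the CTC blank) and the variable names are assumptions.
    merge2 = tf.add(outputs2[0], outputs2[1])
    w_fc = tf.get_variable("fc_w", [nb_hidden, nb_classes])
    b_fc = tf.get_variable("fc_b", [nb_classes])
    flat = tf.reshape(merge2, [-1, nb_hidden])                          # [batch*time, nb_hidden]
    logits = tf.matmul(flat, w_fc) + b_fc                               # [batch*time, nb_classes]
    logits = tf.reshape(logits, [tf.shape(merge2)[0], -1, nb_classes])  # back to [batch, time, nb_classes]
    logits_reshape = tf.transpose(logits, [1, 0, 2])                    # time-major for ctc_loss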
    loss = tf.nn.ctc_loss(y, logits_reshape, seq_len, time_major=True)
    cost = tf.reduce_mean(loss)
    optimizer = tf.train.RMSPropOptimizer(lr).minimize(cost)
    # Now I create the network saver
    new_saver = tf.train.Saver(tf.global_variables())
with tf.Session(graph=thisgraph) as session:
    if sys.argv[1] == "load":
        new_loader = tf.train.Saver(tf.global_variables())
        new_loader.restore(session, "Weights/model_last")
        print("Previous weights loaded")
    else:
        init_op = tf.global_variables_initializer()
        session.run(init_op)
        print("New Weights Initialized")
    # Now the training part
    for e in range(nbepochs):
        totalloss = 0
        for b in range(nbbatches):
            # Loading batch data into feed
            batchloss, _ = session.run([cost, optimizer], feed)
            totalloss = totalloss + batchloss
        avgloss = totalloss / nbbatches
        new_saver.save(session, "Weights/model_last")
Now, this model is tested with online handwriting data. The network itself works fine, but saving and restoring is not working properly. My CNN layer weight and bias variables have distinct names. I run the training for 100 epochs (pass 1), then load the saved model again for pass 2 training. I have checked that the weights and biases loaded in pass 2 are exactly the ones saved at the end of pass 1. But at the start of pass 2 the training loss is higher than the loss at the end of pass 1. What am I doing wrong? Is it due to the optimizer configuration? Any help will be highly appreciated.
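For reference, the check mentioned above (comparing what was restored in pass 2 against what was saved at the end of pass 1) can be done directly on the checkpoint file. Below is a minimal sketch, assuming the "Weights/model_last" prefix from the code above; it lists every variable stored in the checkpoint, which also shows whether the RMSProp slot variables are saved alongside the weights and biases.

import tensorflow as tf

# List every variable in the checkpoint with its shape and mean value,
# so the restored tensors can be compared against the saved ones.
reader = tf.train.NewCheckpointReader("Weights/model_last")
for name, shape in sorted(reader.get_variable_to_shape_map().items()):
    print(name, shape, reader.get_tensor(name).mean())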