TensorFlow Network Save and Restore


I am trying to build a network with this architecture:

Convolution Layer

Max Pooling Layer

Convolution Layer

Max Pooling Layer

Convolution Layer

Max Pooling Layer

BLSTM Layer 1

BLSTM Layer 2

Fully Connected Layer

CTC Layer
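
For context, the three max-pooling layers shrink the time dimension before it reaches the BLSTMs. A small helper (a sketch only, assuming stride-2 pooling with 'SAME' padding; my actual layer parameters may differ) shows the arithmetic:

```python
def pooled_length(length, n_pools=3, stride=2):
    """Time-axis length after n_pools stride-`stride` max-pool layers
    with 'SAME' padding (output size rounds up)."""
    for _ in range(n_pools):
        length = -(-length // stride)  # ceiling division
    return length

print(pooled_length(100))  # 100 -> 50 -> 25 -> 13
```

This is the length that would need to be passed as `seq_len` to the recurrent layers if pooling is applied along the time axis.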

My code structure is like this:

import sys
import tensorflow as tf

thisgraph=tf.Graph()

with thisgraph.as_default():
    x=tf.placeholder(tf.float32,[None,1,None,nb_features])
    y=tf.sparse_placeholder(tf.int32)
    seq_len=tf.placeholder(tf.int32,[None])

    # 1st CNN Layer followed by MP layer  

    # 2nd CNN Layer followed by MP layer  

    # 3rd CNN Layer followed by MP layer  

    #Now two Bidirectional LSTM

    with tf.variable_scope("cell_def_1"):
        f_cell=tf.nn.rnn_cell.LSTMCell(nb_hidden,state_is_tuple=True)
        b_cell=tf.nn.rnn_cell.LSTMCell(nb_hidden,state_is_tuple=True)

    with tf.variable_scope("cell_op_1") as opscope:
        outputs,_=tf.nn.bidirectional_dynamic_rnn(f_cell,b_cell,conv_reshape,sequence_length=seq_len,dtype=tf.float32)

    merge=tf.add(outputs[0],outputs[1])

    with tf.variable_scope("cell_def_2"):
        f1_cell=tf.nn.rnn_cell.LSTMCell(nb_hidden,state_is_tuple=True)
        b1_cell=tf.nn.rnn_cell.LSTMCell(nb_hidden,state_is_tuple=True)

    with tf.variable_scope("cell_op_2") as opscope:
        outputs2,_=tf.nn.bidirectional_dynamic_rnn(f1_cell,b1_cell,merge,sequence_length=seq_len,dtype=tf.float32) 

    # Finally A dense Layer 

    # A Time Distributed dense layer

    loss =tf.nn.ctc_loss(logits_reshape, y, seq_len,time_major=True)
    cost = tf.reduce_mean(loss)

    optimizer = tf.train.RMSPropOptimizer(lr).minimize(cost)

    # Now I Created Network Saver

    new_saver = tf.train.Saver(tf.global_variables())

with tf.Session(graph=thisgraph) as session:    

    if sys.argv[1]=="load":
        # Reuse the saver built inside the graph; calling tf.global_variables()
        # here would query the default graph, not thisgraph
        new_saver.restore(session,"Weights/model_last")
        print("Previous weights loaded")
    else:
        init_op = tf.global_variables_initializer()
        session.run(init_op)
        print("New Weights Initialized")

    # Now the training part

    for e in range(nbepochs):
        totalloss=0
        for b in range(nbbatches):
            # Load batch data into feed
            batchloss,_=session.run([cost,optimizer], feed)
            totalloss=totalloss+batchloss
        avgloss=totalloss/nbbatches
        new_saver.save(session, "Weights/model_last")

I tested this model on online handwriting data and the network trains fine. The problem is that saving and restoring are not working properly. My CNN layer weight and bias variables have distinct names. I run the training for 100 epochs (pass 1), then load the checkpoint again for pass 2. I have checked that the weights and biases at the start of pass 2 are exactly the ones saved at the end of pass 1. Yet at the start of pass 2 the training loss is higher than it was at the end of pass 1. What am I doing wrong? Is it due to the optimizer configuration? Any help will be highly appreciated.
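One thing I thought of to narrow this down (a sketch only, not verified; it uses the TF 1.x `tf.train.NewCheckpointReader` API) is listing what is actually stored in the checkpoint, to see whether the RMSProp accumulator ("slot") variables were saved alongside the weights:

```python
def checkpoint_variable_names(ckpt_prefix):
    """List the variable names stored in a TF 1.x checkpoint."""
    import tensorflow as tf  # local import: the helper below works without TF installed
    reader = tf.train.NewCheckpointReader(ckpt_prefix)
    return sorted(reader.get_variable_to_shape_map().keys())

def has_rmsprop_slots(names):
    """True if any saved variable looks like an RMSProp slot variable."""
    return any("RMSProp" in name for name in names)

# Usage against the checkpoint from the code above:
# names = checkpoint_variable_names("Weights/model_last")
# print(has_rmsprop_slots(names))
```

If the slot variables are missing from the checkpoint, RMSProp would restart with empty accumulators in pass 2, which could explain the jump in training loss even though the weights themselves restore correctly.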

