TensorFlow: How to embed float sequences to fixed size vectors?


I am looking for methods to embed variable-length sequences of float values into fixed-size vectors. The input format is as follows:

[f1,f2,f3,f4] -> [f1,f2,f3,f4] -> [f1,f2,f3,f4] -> ... -> [f1,f2,f3,f4]
[f1,f2,f3,f4] -> [f1,f2,f3,f4] -> [f1,f2,f3,f4] -> [f1,f2,f3,f4] -> ... -> [f1,f2,f3,f4]
...
[f1,f2,f3,f4] -> ... -> [f1,f2,f3,f4]

Each line is a variable-length sequence, with a maximum length of 60. Each unit in a sequence is a tuple of 4 float values. I have already padded all sequences with zeros to the same length.
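
For concreteness, the padding step looks roughly like this in numpy (a sketch with made-up sequence lengths):

import numpy as np

max_length = 60
feature_num = 4
# hypothetical raw sequences of lengths 2, 5, and 3
sequences = [np.random.rand(2, feature_num),
             np.random.rand(5, feature_num),
             np.random.rand(3, feature_num)]

# zero-pad every sequence to max_length
padded = np.zeros((len(sequences), max_length, feature_num), dtype='float32')
for i, seq in enumerate(sequences):
    padded[i, :len(seq), :] = seq
seq_lengths = np.asarray([len(s) for s in sequences], dtype='int64')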

The following architecture seems to solve my problem if I train it to reproduce the input as its output; I need the thought vector in the center as the embedding for the sequences.

[encoder-decoder architecture diagram]

In TensorFlow, I have found two candidate methods: tf.contrib.legacy_seq2seq.basic_rnn_seq2seq and tf.contrib.legacy_seq2seq.embedding_rnn_seq2seq.

However, these two methods seem to be intended for NLP problems, where the inputs are discrete values (word IDs).

So, are there other functions that solve my problem?


3 Answers

BEST ANSWER

I have found a solution to my problem, using the following architecture:

[architecture diagram: encoder LSTM layers below, decoder LSTM layers above]

The LSTM layers below encode the series x1, x2, ..., xn. The last output (the green one) is duplicated as many times as there are input steps and fed to the decoding LSTM layers above. The TensorFlow code is as follows:

import tensorflow as tf

# conf is a configuration object holding max_series, series_feature_num,
# rnn_hidden_num and rnn_layer_num
series_input = tf.placeholder(tf.float32, [None, conf.max_series, conf.series_feature_num])
print("Encode input Shape", series_input.get_shape())

# encoding layer
encode_cell = tf.contrib.rnn.MultiRNNCell(
  [tf.contrib.rnn.BasicLSTMCell(conf.rnn_hidden_num, reuse=False) for _ in range(conf.rnn_layer_num)]
)
encode_output, _ = tf.nn.dynamic_rnn(encode_cell, series_input, dtype=tf.float32, scope='encode')
print("Encode output Shape", encode_output.get_shape())

# last output: transpose to time-major and take the final time step
encode_output = tf.transpose(encode_output, [1, 0, 2])  # [time, batch, hidden]
last = tf.gather(encode_output, int(encode_output.get_shape()[0]) - 1)

# duplicate the last output of the encoding layer
decoder_input = tf.stack([last for _ in range(conf.max_series)], axis=1)
print("Decoder input shape", decoder_input.get_shape())

# decoding layer
decode_cell = tf.contrib.rnn.MultiRNNCell(
  [tf.contrib.rnn.BasicLSTMCell(conf.series_feature_num, reuse=False) for _ in range(conf.rnn_layer_num)]
)
decode_output, _ = tf.nn.dynamic_rnn(decode_cell, decoder_input, dtype=tf.float32, scope='decode')
print("Decode output", decode_output.get_shape())

# Loss Function
loss = tf.losses.mean_squared_error(labels=series_input, predictions=decode_output)
print("Loss", loss)
ANSWER

All you need is an RNN, not the seq2seq model, since seq2seq comes with an additional decoder, which is unnecessary in your case.

Example code:

import numpy as np
import tensorflow as tf
from tensorflow.contrib import rnn

input_size = 4
max_length = 60
hidden_size = 64
output_size = 4

x = tf.placeholder(tf.float32, shape=[None, max_length, input_size], name='x')
seqlen = tf.placeholder(tf.int64, shape=[None], name='seqlen')

lstm_cell = rnn.BasicLSTMCell(hidden_size, forget_bias=1.0)

outputs, states = tf.nn.dynamic_rnn(cell=lstm_cell, inputs=x, sequence_length=seqlen, dtype=tf.float32)


# states is an LSTMStateTuple (c, h); states[-1] picks the hidden state h
encoded_states = states[-1]

W = tf.get_variable(
        name='W',
        shape=[hidden_size, output_size],
        dtype=tf.float32, 
        initializer=tf.random_normal_initializer())
b = tf.get_variable(
        name='b',
        shape=[output_size],
        dtype=tf.float32, 
        initializer=tf.random_normal_initializer())

z = tf.matmul(encoded_states, W) + b
results = tf.sigmoid(z)

###########################
## cost computation and training components go here
# e.g. 
# targets = tf.placeholder(tf.float32, shape=[None, input_size], name='targets')
# cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=targets, logits=z))
# optimizer = tf.train.AdamOptimizer(learning_rate=0.1).minimize(cost)
###############################

init = tf.global_variables_initializer()



batch_size = 4
data_in = np.zeros((batch_size, max_length, input_size), dtype='float32')
data_in[0, :4, :] = np.random.rand(4, input_size)
data_in[1, :6, :] = np.random.rand(6, input_size)
data_in[2, :20, :] = np.random.rand(20, input_size)
data_in[3, :, :] = np.random.rand(60, input_size)
data_len = np.asarray([4, 6, 20, 60], dtype='int64')



with tf.Session() as sess:
    sess.run(init)
    #########################
    # training process goes here
    #########################
    res = sess.run(results, 
            feed_dict={
                x: data_in, 
                seqlen: data_len})

print(res)
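
A side note: because sequence_length is passed to dynamic_rnn, `states` already holds the state from each sequence's last valid step. If you instead want to pick the last valid output per sequence from `outputs` yourself, a sketch:

batch_range = tf.cast(tf.range(tf.shape(outputs)[0]), tf.int64)  # [0, 1, ..., batch-1]
indices = tf.stack([batch_range, seqlen - 1], axis=1)            # (batch, time) index pairs
last_outputs = tf.gather_nd(outputs, indices)                    # shape [batch, hidden_size]
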
ANSWER

To encode a sequence into a fixed-length vector you typically use recurrent neural networks (RNNs) or convolutional neural networks (CNNs).

If you use a recurrent neural network, you can use the output at the last time step (the last element in your sequence). This corresponds to the thought vector in your question. Have a look at tf.nn.dynamic_rnn. dynamic_rnn requires you to specify the type of RNN cell you want to use; tf.contrib.rnn.LSTMCell and tf.contrib.rnn.GRUCell are the most common.
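
A minimal sketch of this approach (the placeholder shapes are chosen to match your data; the cell size is an arbitrary choice):

import tensorflow as tf

# padded sequences: [batch, time, features]; seqlen holds the true lengths
x = tf.placeholder(tf.float32, [None, 60, 4])
seqlen = tf.placeholder(tf.int32, [None])

cell = tf.contrib.rnn.GRUCell(num_units=128)
outputs, final_state = tf.nn.dynamic_rnn(cell, x, sequence_length=seqlen,
                                         dtype=tf.float32)
# final_state has shape [batch, 128]: one fixed-size vector per sequence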

If you want to use CNNs, you need 1-dimensional convolutions. To build CNNs you need tf.layers.conv1d and tf.layers.max_pooling1d.
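
A minimal sketch (the filter count and kernel size are arbitrary choices):

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 60, 4])  # [batch, time, features]

# convolve over the time axis, then pool over the whole axis so the
# result no longer depends on the sequence length
conv = tf.layers.conv1d(x, filters=64, kernel_size=3, padding='same',
                        activation=tf.nn.relu)
pooled = tf.layers.max_pooling1d(conv, pool_size=60, strides=60)  # [batch, 1, 64]
embedding = tf.squeeze(pooled, axis=1)  # fixed-size vector, shape [batch, 64]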