I'm trying to train a deep learning model that predicts, for every time sample of a two-component time series, the probability that it belongs to each class. The target tensor (Y) therefore assigns a probability value to each time sample, and I build it from Gaussian probability distributions. Its shape is (N_examples, timesteps, N_classes). I need to distinguish three categories, so N_classes = 3, i.e. one probability mask per class of interest.

The input tensor (X) is formed by the two-component signals, with shape (N_examples, timesteps, N_features). For these time series, N_features is the number of components, so N_features = 2.

For this kind of data the temporal dependence matters because of its physical meaning (it is seismological data).

The next figure shows an example of my data:

Signal example

Let me describe the target tensor structure in more detail. The blue bell-shaped curve is the Gaussian probability distribution giving the probability that each time sample belongs to the first class of interest (the P-wave arrival time). The red curve gives the probability that a sample belongs to the second class (the S-wave arrival time). Finally, the black curve is the probability of not being a wave arrival (the noise class), which is why it is the complement of the sum of the blue and red curves, so the three curves sum to 1 at every time sample.
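For completeness, this is roughly how I build each target mask (a minimal sketch; p_idx, s_idx and sigma are placeholders for my actual arrival picks and bell width):

import numpy as np

def make_masks(timesteps, p_idx, s_idx, sigma):
    """Build the (timesteps, 3) target: P bell, S bell, and noise = 1 - (P + S)."""
    t = np.arange(timesteps)
    p_mask = np.exp(-0.5 * ((t - p_idx) / sigma) ** 2)   # blue curve (P-wave arrival)
    s_mask = np.exp(-0.5 * ((t - s_idx) / sigma) ** 2)   # red curve (S-wave arrival)
    noise  = np.clip(1.0 - (p_mask + s_mask), 0.0, 1.0)  # black curve (everything else)
    return np.stack([p_mask, s_mask, noise], axis=-1)    # shape: (timesteps, 3)

# y_masks stacks one such array per example -> (N_examples, timesteps, 3)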

I'm using the tf.data batching approach to split the training and testing datasets:

import tensorflow as tf

dataset = tf.data.Dataset.from_tensor_slices((X_signals, y_masks))

# Batching parameters
BATCH_SIZE = 8
BUFFER_SIZE = 200
NUM_EPOCHS = 100

# Sequential split: first Ntrain examples for training, the remaining Ntest for testing
train_dataset = dataset.take(Ntrain)                 # 75%
test_dataset  = dataset.skip(Ntrain).take(Ntest)     # 25%

train_batches = train_dataset.cache().shuffle(BUFFER_SIZE).batch(BATCH_SIZE).repeat()
test_batches  = test_dataset.batch(BATCH_SIZE)
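To double-check the pipeline, I inspect one batch; with my data (timesteps = 2000) the shapes come out as (BATCH_SIZE, timesteps, N_features) for the signals and (BATCH_SIZE, timesteps, N_classes) for the masks:

x_batch, y_batch = next(iter(train_batches))
print(x_batch.shape)   # (8, 2000, 2)
print(y_batch.shape)   # (8, 2000, 3)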

I've trained an LSTM recurrent neural network for this purpose, but the predictions are incorrect. I set return_sequences=True so that the LSTM layer outputs a prediction for every time sample.

from tensorflow.keras import layers
from tensorflow.keras.layers import LSTM, Dense

# Hyperparameter definition
n_units = 128
lr = 0.0001
n_class = y_masks.shape[-1]                             # n_class = 3

input_shape = tf.TensorShape(X_signals[0, ...].shape)   # (timesteps = 2000, N_features = 2)

inputs = layers.Input(shape=input_shape)                 # per-example shape; Keras adds the batch dimension
x = LSTM(units=n_units, dropout=0.2, recurrent_dropout=0.1, return_sequences=True)(inputs)
output = Dense(units=n_class, activation='softmax')(x)   # per-timestep softmax over the 3 classes

model_name = "Simple_LSTM_Net_1D"
model_lstm = tf.keras.Model(inputs, output, name=model_name)

model_lstm.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                   loss='categorical_crossentropy',
                   metrics=["accuracy"])

# Fitting the model
NUM_EPOCHS = 50
STEPS_PER_EPOCH = Ninput // BATCH_SIZE                  # Ninput = total number of examples
VAL_SUBSPLITS = 10
VALIDATION_STEPS = Ntest // BATCH_SIZE // VAL_SUBSPLITS

# Stop training when the loss or the accuracy stops improving
stop_train_loss = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=5)
stop_train_accu = tf.keras.callbacks.EarlyStopping(monitor='accuracy', patience=5)

model_history = model_lstm.fit(train_batches,
                               epochs=NUM_EPOCHS,
                               steps_per_epoch=STEPS_PER_EPOCH,
                               validation_steps=VALIDATION_STEPS,
                               validation_data=test_batches,
                               callbacks=[stop_train_loss, stop_train_accu])
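The prediction plots below are generated roughly like this (simplified sketch; i is just an example index):

import matplotlib.pyplot as plt

x_batch, y_batch = next(iter(test_batches))
y_pred = model_lstm.predict(x_batch)        # (BATCH_SIZE, timesteps, 3); each row sums to 1

i = 0
plt.plot(y_pred[i, :, 0], 'b', label='P-wave')
plt.plot(y_pred[i, :, 1], 'r', label='S-wave')
plt.plot(y_pred[i, :, 2], 'k', label='noise')
plt.legend()
plt.show()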

The expected output for each example is the predicted probability per class at every time sample:

Expected output from neural network

Instead, I got this

Prediction

I don't know whether the architecture is not complex enough for this problem, or whether I'm setting the wrong hyperparameters. I'm also not sure if the Dense layer configuration is right for the expected predictions.

Any help or comment will be greatly appreciated!
