Image sequence detection with Keras, Convolutional and Stateful Neural Network

Question

Image sequence detection with Keras, Convolutional and Stateful Neural Network

203 Views Asked by Finest Whiskey At 18 April 2025 at 10:03

I am trying to write a pretty complicated neural network (at least for me) in keras that needs to combine both a common CNN structure and an LSTM/GRU layer.

Basically, I have a dataset of climatological maps of the Mediterranean sea, each map details the wind, pressure and other parameters. I am studying Medicanes (Mediterranean hurricanes) and my goal is to create a neural network that can classify each map with a label zero if there is no trace of such hurricanes or one if the map contains one.

In order to achieve that I need a network with two parts:

feature extractor (normal CNN).
temporal layer (LSTM/GRU).

The main cause of this is that each map is correlated with the previous one because the formation and life cycle of a Medicane can take several days to complete.

Important note: the dataset is too big to be uploaded all at once so I have to work one batch at a time.

I am working with Keras and I found it pretty challenging to adapt its standard framework to my needs so I have come up with some peculiar flow to feed my data into the network.

In particular, I found it hard to pass both my batch size and my time-step parameter to the GRU layer using a more standard alternative.

This is what I tried:

I am positively sure I have overcomplicated the task, but, as I said I am not very proficient with Keras and TensorFlow.

The main problem was that I could not find a way to import the data both in a batch (for RAM reasons) and in a sequence of 10-15 pictures (to be used as the time steps in the GRU layer).

I solved this problem by importing batches of 120 maps in order (no shuffle) and I created a way to turn these batches into the sequence of images I needed then I proceeded to re-batch the sequences and feed them to the model manually.

Data Import

batch_size=120

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "./Figures_1/Train",
    validation_split=None,
    subset=None,
    labels="inferred",
    label_mode="binary",
    color_mode="rgb",
    interpolation='bilinear',
    batch_size=batch_size,
    image_size=(600, 600),
    shuffle=False,
    seed=123
)

Get a sequence of Images

Here, I break down the 120 map batches into sequences of 60 observations, and I return each sequence one at a time.

sequence_lengh=60

def sequence_x(train_dataset):
    
    x_numpy = np.asarray(list(map(lambda x: x[0], tfds.as_numpy(train_dataset))),dtype=object)
    
    for element in range(0,x_numpy.shape[0]):
        for i in range(0, x_numpy.shape[0],sequence_lengh):
            x_seq = x_numpy[element][i:i+sequence_lengh]
            yield x_seq
        
def sequence_y(train_dataset):
    
    y_numpy = np.asarray(list(map(lambda x: x[1], tfds.as_numpy(train_dataset))),dtype=object)
    
    for element in range(0,y_numpy.shape[0]):
        for i in range(0, y_numpy.shape[0],sequence_lengh):
            y_seq = y_numpy[element][i:i+sequence_lengh]
            yield y_seq

CNN Model

I build the CNN model based on a pre-trained DenseNet

from keras.layers import TimeDistributed, GRU

def build_convnet(shape=(600, 600, 3)):
    
    inputs = keras.Input(shape = shape)
    x = inputs

    # preprocessing
    x = keras.applications.densenet.preprocess_input(x)

    #Convbase
    x = convBase(x)
    x = layers.Flatten()(x)

    # Fine tuning
    x = keras.layers.Dense(1024, activation='relu')(x)
    x = layers.Dropout(0.2)(x)
    x = keras.layers.Dense(512, activation='relu')(x)
    x = keras.layers.GlobalMaxPool2D()
    
    return x

GRU Model

I build the time part of the network with a GRU layer

def action_model(shape=(15, 600, 600, 3), nbout=15):
    # Create our convnet with (112, 112, 3) input shape
    convnet = build_convnet(shape[1:]) #[1:]
    
    # then create our final model
    model = keras.Sequential()
    # add the convnet with (5, 112, 112, 3) shape
    model.add(TimeDistributed(convnet, input_shape=shape))
    # here, you can also use GRU or LSTM
    model.add(GRU(64))
    # and finally, we make a decision network
    model.add(Dense(1024, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(15, activation='softmax'))
    return model

Transfer Learning

I retrain a part of the GRU

convBase = DenseNet121(include_top=False, weights=None, input_shape=(600,600,3), pooling="avg")

for layer in convBase.layers: 
  if 'conv5' in layer.name:
    layer.trainable = True
for layer in convBase.layers: 
  if 'conv4' in layer.name:
    layer.trainable = True

Model Compile

Model compilation ( image size= 600x600x3)

INSHAPE=(15, 600, 600, 3) # (5, 112, 112, 3)
model = action_model(INSHAPE, 1)
optimizer = keras.optimizers.Adam(0.001)

model.compile(
    optimizer,
    'categorical_crossentropy',
    metrics='accuracy'
)

Model Fit

Here I manually batch my data. I turn an array (60, 600, 600, 3) into a (4,15,600,600) array. Meaning 4 batches each one containing a 15-map long sequence.


epochs = 10

for value in range(0, epochs):
    
    train_x, train_y = sequence_x(train_ds), sequence_y(train_ds)
    val_x, val_y = sequence_x(validation_ds), sequence_y(validation_ds)
    
    for i in range(0,278): #
        
        x = next(train_x, "none")
        y = next(train_y, "none")
        
        if (x!="none" or y!="none"):

            if (np.any(x) and np.any(y)):

                x_stack = np.stack((x[:15], x[15:30], x[30:45], x[45:]))
                y_stack = np.stack((y[:15], y[15:30], y[30:45], y[45:]))
                y_stack=y_stack.reshape(4,15)

                model.fit(x=x_stack, y=y_stack, 
                            validation_data=None, 
                            batch_size=None,
                            shuffle=False
                            )

            else:
                continue
        else:
            continue

The idea is to get a model that, when presented with a sequence of images, can categorize each one of them with a 0 or a 1 if they have a Medicane or not.

The model does compile without any errors but the results it provides are horrible:

.

What am I doing incorrectly? Is there a more effective way to write all of this?

Original Q&A

Image sequence detection with Keras, Convolutional and Stateful Neural Network

There are 0 best solutions below

Related Questions in PYTHON

Related Questions in TENSORFLOW

Related Questions in KERAS

Related Questions in CONV-NEURAL-NETWORK

Related Questions in LSTM-STATEFUL

Trending Questions

Popular # Hahtags

Popular Questions