I have a TensorFlow Keras model that looks something like this. Note that this model is too big to run on a microcontroller, which is my end goal, but it is a starting point: the aim is to get something small enough to fit on a memory-constrained microcontroller with no floating point unit. I'm using TensorFlow 2.11.0.
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, Dense, Dropout, Flatten
from tensorflow.keras.optimizers import Adam

new_model = tf.keras.Sequential()
# trainingData is my prepared training set (preparation not shown here).
new_model.add(Conv2D(filters=128, kernel_size=[2, 2], activation='relu',
                     input_shape=trainingData[0].shape))
new_model.add(Dropout(0.1))
new_model.add(Conv2D(filters=128, kernel_size=[2, 2], activation='relu'))
new_model.add(Dropout(0.2))
new_model.add(Flatten())
new_model.add(Dense(128, activation='relu'))
new_model.add(Dropout(0.5))
new_model.add(Dense(3, activation='softmax'))
new_model.compile(optimizer=Adam(learning_rate=0.001),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
I'm not including the training code here since it's working and I'm able to run predictions on test data successfully; if it's useful I'll add it. The model is saved as an .h5 file and loaded back in when doing the conversion to TensorFlow Lite.
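For reference, the save at the end of training is just the standard Keras call (the filename matches what the conversion script loads below):

new_model.save('my_model.h5')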
The input data is 25 X, Y, and Z axis values, scaled to the range -128 to 127. They are loaded from a .csv file with:
import numpy as np
import pandas as pd

data = pd.read_csv('../../TrainingData/training_raw.csv',
                   dtype={'X': np.int8, 'Z': np.int8, 'Y': np.int8, 'class': np.int8},
                   nrows=1000)
And then converted to frames with code like this:

from scipy import stats

def convert_samples_to_frames(sensor_data, frame_size, hop_size):
    # Slide a window of frame_size samples over the data, advancing by
    # hop_size, and label each frame with its most common class.
    frames = []
    labels = []
    data_length = len(sensor_data)
    for i in range(0, data_length - frame_size, hop_size):
        x = sensor_data['X'][i: i + frame_size]
        y = sensor_data['Y'][i: i + frame_size]
        z = sensor_data['Z'][i: i + frame_size]
        label_statistics = stats.mode(sensor_data['class'][i: i + frame_size])
        label = label_statistics[0][0]
        frames.append([x, y, z])
        labels.append(label)
    frames = np.asarray(frames)
    labels = np.asarray(labels)
    return frames, labels

frames, labels = convert_samples_to_frames(data, 25, 10)
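For reference, each frame stacks three length-25 slices, so the arrays that come back look like this (counts assume all 1000 rows are read):

# Sanity check of the resulting shapes:
# range(0, 1000 - 25, 10) gives 98 windows of 3 axes x 25 samples each.
print(frames.shape)  # (98, 3, 25)
print(labels.shape)  # (98,)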
The conversion seems to work with the code below, but the result is not fully integer-quantized:
from tensorflow.keras.models import load_model

model = load_model('my_model.h5')
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS]
tflite_model = converter.convert()
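I then write the result out with the usual pattern (the filename is just my choice):

# Persist the converted flatbuffer for inspection and deployment.
with open('my_model.tflite', 'wb') as f:
    f.write(tflite_model)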
The problem comes in when I try to add a representative dataset for full 8-bit quantization. I've tried a few things. The first attempt was simply to give it the frames from the sample data, like this (note that this happens before I call converter.convert(), of course):
def representative_dataset_gen():
    for i in range(100):
        data = frames[i]
        yield [data]

converter.representative_dataset = representative_dataset_gen
With this code I get this error:
RuntimeError: tensorflow/lite/kernels/conv.cc:343 input->dims->size != 4 (2 != 4)Node number 0 (CONV_2D) failed to prepare.
I was guessing that maybe it had something to do with the first Conv2D layer's kernel_size=[2, 2] needing more than one frame, although in retrospect I'm not sure that makes sense. Anyhow, I tried something along these lines:
def representative_dataset_gen():
    for i in range(100):
        data = frames[i:i+2]
        yield [data.reshape(1, 2, 3, 25)]
This failed with a different error:
Attempting to resize dimension 1 of tensor 0 with value 3 to 2. ResizeInputTensorStrict only allows mutating unknown dimensions identified by -1.
I also tried this:
def representative_dataset_gen():
    for i in range(100):
        data = frames[i]
        data = tf.expand_dims(data, 0)
        yield [data]
This failed with:
RuntimeError: tensorflow/lite/kernels/conv.cc:343 input->dims->size != 4 (3 != 4)Node number 0 (CONV_2D) failed to prepare.
I also tried:
def representative_dataset_gen():
    for i in range(100):
        data = np.array(frames[i:i+2], dtype=np.float32).reshape(2, 1, 3, 25)
        yield [data]
This gives the following error:
RuntimeError: tensorflow/lite/core/subgraph.cc BytesRequired number of elements overflowed. Node number 1 (CONV_2D) failed to prepare.
Note that eventually I think I need to add something along the lines of
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
but for the sake of figuring this out I'm taking baby steps and leaving this out.
If anyone has any suggestions on how to properly set up my representative dataset, or any other tips for getting inference like this to work on a microcontroller with no floating point unit, I would greatly appreciate the help.
I see there was an issue around this in 2.4, but it looks like it was resolved, and since I'm using 2.11 I don't think it pertains to me.
Update: I somehow got it to work by adding an Input layer and a Reshape layer to my model.
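Something like this, with the exact shapes following from the (3, 25) frames above:

from tensorflow.keras.layers import Input, Reshape

new_model = tf.keras.Sequential()
new_model.add(Input(shape=(3, 25)))
# Add the channel dimension so the first Conv2D sees a 4-D
# (batch, height, width, channels) input.
new_model.add(Reshape((3, 25, 1)))
new_model.add(Conv2D(filters=128, kernel_size=[2, 2], activation='relu'))
# ... the rest of the model is unchanged ...

With the input shape pinned down like this, the representative dataset generator only needs a float32 cast and a batch dimension:

def representative_dataset_gen():
    for i in range(100):
        yield [np.expand_dims(frames[i].astype(np.float32), axis=0)]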