Float16 mixed precision being slower than regular float32, keras, tensorflow 2.0


I am using TensorFlow 2.10 on Windows with an NVIDIA RTX 2060 SUPER (which has Tensor Cores) for deep learning. But when I enable float16 mixed precision, the time per epoch actually becomes slower, not faster.

Code:

import tensorflow as tf
import ssl

ssl._create_default_https_context = ssl._create_unverified_context

(train_x, train_y), (test_x, test_y) = tf.keras.datasets.cifar100.load_data()

tf.keras.mixed_precision.set_global_policy("mixed_float16")

model = tf.keras.Sequential([
    
    tf.keras.layers.Lambda(lambda x : x / 255, input_shape=(32,32,3)),
    tf.keras.layers.Conv2D(filters=64, kernel_size=(4,4)),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Conv2D(filters=32, kernel_size=(2,2)),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4096, activation="relu"),
    tf.keras.layers.Dense(4096, activation="relu"),
    tf.keras.layers.Dense(4096, activation="relu"),
    tf.keras.layers.Dense(4096, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(100),
    tf.keras.layers.Activation("softmax", dtype="float32")
    ])

model.compile(optimizer="adam", loss=tf.keras.losses.SparseCategoricalCrossentropy(), metrics=["accuracy"])

print("compute dtype of first layer: ", model.layers[0].compute_dtype)

model.fit(train_x, train_y, epochs=100, batch_size=1020)

model.evaluate(test_x, test_y)

I attached some images of the problem: here's a screenshot of training without mixed precision, and here's one using mixed precision, which is slower.

Running the code in Google Colab, which uses a more modern version of TensorFlow (2.15), works fine, and training is faster with mixed precision than without it (as it should be). Here's the link to the notebook: Google Colab

I'm not an expert with TensorFlow and I have been trying to fix this for weeks; any help would be appreciated. Thanks!

Other Information:

I'm using cuDNN 8.1.1 and CUDA 11.2, which are the versions listed as compatible with TensorFlow 2.10.
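For reference, here is a small check snippet (separate from the training script) that prints what TensorFlow reports about the visible GPU and the CUDA/cuDNN versions it was built against:

import tensorflow as tf

# GPUs visible to TensorFlow.
print("GPUs:", tf.config.list_physical_devices("GPU"))

# CUDA/cuDNN versions this TensorFlow build was compiled against.
build = tf.sysconfig.get_build_info()
print("CUDA:", build.get("cuda_version"))
print("cuDNN:", build.get("cudnn_version"))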


2 Answers

BEST ANSWER

The solution I found was to switch to Ubuntu (Linux) and update to the newer TensorFlow 2.15.

With this version, mixed precision (float16) is about twice as fast as the classic float32.

I also upgraded from CUDA 11.2 to 12.2 and from cuDNN 8.1.1 to 8.9.
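After upgrading, a quick sanity check (just a sketch) to confirm the policy is active and that the GPU reports a Tensor-Core-capable compute capability (7.0 or higher; the RTX 2060 SUPER is 7.5):

import tensorflow as tf

# Enable mixed precision and confirm the global policy took effect:
# computations in float16, variables kept in float32.
tf.keras.mixed_precision.set_global_policy("mixed_float16")
policy = tf.keras.mixed_precision.global_policy()
print("compute dtype:", policy.compute_dtype)    # float16
print("variable dtype:", policy.variable_dtype)  # float32

# Tensor Cores are only used on GPUs with compute capability 7.0+.
gpus = tf.config.list_physical_devices("GPU")
if gpus:
    details = tf.config.experimental.get_device_details(gpus[0])
    print("compute capability:", details.get("compute_capability"))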

Second answer

I am not exactly sure what the essence of the problem is here. By design, mixed precision accelerates training by using both 16-bit and 32-bit floating-point types. Matrix multiplications and convolutions are quite computationally expensive in 32-bit; performing the same calculations in 16-bit greatly reduces the memory bandwidth they require, so more data can be processed per unit of time and each epoch takes less time.

Theoretically, you should be able to reach the same accuracy when training in mixed precision, because the final loss is still computed in 32-bit, which preserves the model's ability to learn effectively.
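As a rough sketch of what the mixed_float16 policy does in Keras (model.compile() handles all of this automatically, so none of it needs to be written by hand):

import tensorflow as tf

tf.keras.mixed_precision.set_global_policy("mixed_float16")

# Under the policy, layer computations run in float16 while the weights
# themselves are kept in float32.
dense = tf.keras.layers.Dense(10)
dense.build((None, 8))
print(dense.compute_dtype)   # float16
print(dense.kernel.dtype)    # float32

# In a custom training loop, loss scaling protects the small float16
# gradients from underflowing; compile() wraps the optimizer like this
# automatically when the mixed_float16 policy is active.
optimizer = tf.keras.mixed_precision.LossScaleOptimizer(tf.keras.optimizers.Adam())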