I use the code from https://www.kaggle.com/code/ritvik1909/masked-autoencoder-vision-transformer to train a masked autoencoder vision transformer. If I run it under TensorFlow 2.10, I obtain much better results than under 2.12. The code is unchanged, the data are the same, the pipeline is identical, and a large number of repeated training runs shows consistent behavior under both 2.10 and 2.12.
This example image shows the training and validation curves for 2.10 (blue and red, respectively) and for 2.12 (the blue and orange curves at the top). I don't see what could produce such different results from the same code. I would appreciate it if someone had a method to track down the issue.
EDITS
- I saw that one big difference is the change of the default optimizer implementation between 2.10 and later versions. It is still possible to use the legacy version of Adam, but it did not change the results (see the sketch after this list).
- I tried 2.11, 2.12, and 2.13 using the Docker images provided by the TensorFlow team, all on the same computer, with the same architecture and the same GPU, and the results are still significantly worse with every version newer than 2.10.
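A minimal sketch of the kind of switch I mean (the learning rate and the compile call are placeholders, not the notebook's exact settings):

```python
import tensorflow as tf

# In TF >= 2.11 tf.keras.optimizers.Adam is the new implementation;
# the pre-2.11 behaviour is kept under tf.keras.optimizers.legacy.Adam.
optimizer = tf.keras.optimizers.legacy.Adam(learning_rate=1e-3)

# model.compile(optimizer=optimizer, loss="mse")  # placeholder compile call
```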
A given TensorFlow version usually binds to a specific CUDA version. Didn't you, for instance, switch from CUDA 10 to CUDA 11?
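One way to check this (a rough sketch, assuming a standard pip or Docker install) is to print, in each environment, the CUDA/cuDNN versions the TensorFlow build was compiled against and the GPUs it actually sees:

```python
import tensorflow as tf

# Build-time CUDA / cuDNN versions (keys include 'cuda_version', 'cudnn_version')
print(tf.__version__)
print(tf.sysconfig.get_build_info())

# GPUs visible at runtime
print(tf.config.list_physical_devices("GPU"))
```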
TensorFlow is very dynamic in selecting, e.g., convolution implementations. It might also matter whether you only evaluate a model that was trained on the other version, or whether you retrain from scratch on the new version.
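To narrow down whether algorithm selection or random seeding is responsible, one option (a sketch; op determinism will slow training, and the seed value is arbitrary) is to make the runs on both versions as deterministic as possible before comparing them:

```python
import tensorflow as tf

# Seed the Python, NumPy and TensorFlow RNGs in one call.
tf.keras.utils.set_random_seed(42)

# Disable non-deterministic op/algorithm selection (e.g. cuDNN autotuning),
# so the 2.10 and 2.12 runs are easier to compare like for like.
tf.config.experimental.enable_op_determinism()
```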