Before I knew about automatic mixed precision, I manually converted the model and the data to half precision with `half()` for training. But the training results were not good at all.
Then I used automatic mixed precision (AMP) to train a network, which gives decent results. But when I save a checkpoint, the parameters in it are still in FP32. I would like to save the checkpoint in FP16, so I want to ask whether and how I can do that. This also makes me wonder: when conv2d runs under autocast, are the parameters of conv2d also cast to half precision, or is it only the data?
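For reference, this is roughly my training and saving code (a minimal sketch with a placeholder model, random data, and a made-up file name; it needs a CUDA device):

```python
import torch
from torch import nn

model = nn.Conv2d(3, 16, kernel_size=3).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()

data = torch.randn(8, 3, 32, 32, device="cuda")
target = torch.randn(8, 16, 30, 30, device="cuda")

for _ in range(10):
    optimizer.zero_grad()
    # Forward pass under autocast; backward pass with gradient scaling
    with torch.cuda.amp.autocast():
        out = model(data)
        loss = nn.functional.mse_loss(out, target)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()

torch.save(model.state_dict(), "checkpoint.pth")
# Every tensor in the saved state_dict still reports torch.float32
print({k: v.dtype for k, v in model.state_dict().items()})
```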
It does not apply `half()` to all parameters. Autocast decides per op: some ops run in FP16 and others stay in FP32.
From the documentation here.
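A quick way to see the second part of your question (a minimal sketch; needs a CUDA device): the stored conv parameters remain in FP32, and autocast casts them on the fly for the op, so only the inputs and output are FP16:

```python
import torch
from torch import nn

conv = nn.Conv2d(3, 16, kernel_size=3).cuda()
x = torch.randn(1, 3, 32, 32, device="cuda")

with torch.cuda.amp.autocast():
    y = conv(x)

print(conv.weight.dtype)  # torch.float32 -- the parameter itself is not halved
print(y.dtype)            # torch.float16 -- conv2d runs in FP16 under autocast
```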
About the checkpoints: a master copy of the weights is maintained in FP32 to be used by the optimizer, as said here.
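If you still want an FP16 checkpoint (e.g. to halve its size on disk), one option is to cast a copy of the state_dict before saving. A minimal sketch, assuming `model` is your trained module and the file name is a placeholder; note the optimizer keeps its FP32 master weights regardless, so cast back to FP32 if you resume training:

```python
import torch
from torch import nn

model = nn.Conv2d(3, 16, kernel_size=3)  # stand-in for your trained model

# Cast floating-point tensors to FP16; keep non-float buffers
# (e.g. BatchNorm's num_batches_tracked) untouched.
fp16_state = {k: v.half() if v.is_floating_point() else v
              for k, v in model.state_dict().items()}
torch.save(fp16_state, "checkpoint_fp16.pth")

# To resume FP32/AMP training later, cast back on load:
state = torch.load("checkpoint_fp16.pth")
model.load_state_dict({k: v.float() if v.is_floating_point() else v
                       for k, v in state.items()})
```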