What is the compute type of torch.nn.functional.linear when the input is float16 or bfloat16?

In the following code:

import torch
from torch.nn.functional import linear
a=torch.ones(2,3).type(torch.float16)
b=torch.ones(2,3).type(torch.float16)
linear(a,b)

what is the compute type of linear: fp32, fp16, or something else?

Thanks

I tried to look into the repo and into torch.nn.functional.linear itself, but it is too hard to follow.


1 Answer

The computation will be performed in fp32 (float32) by default (https://www.exxactcorp.com/blog/hpc/what-is-fp64-fp32-fp16), even though your inputs are fp16 (float16). This is PyTorch's default behavior, chosen for numerical stability.

Why fp32 by default:

  • Reduced precision (fp16) can lead to numerical instability (overflow/underflow); see the short sketch below

  • Many operations in PyTorch accumulate in fp32 internally even with fp16 inputs
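
A quick illustration of those fp16 range limits (a minimal sketch; 65504 and ~6e-8 are standard IEEE half-precision bounds, not anything specific to linear):

import torch

# fp16 has a narrow range: the largest finite value is 65504, and
# magnitudes below ~6e-8 flush to zero.
x = torch.tensor(60000.0, dtype=torch.float16)
print(x * 2)                                    # tensor(inf, dtype=torch.float16)
print(torch.tensor(1e-8, dtype=torch.float16))  # tensor(0., dtype=torch.float16)

# Doing the arithmetic in fp32 and casting back at the end avoids the overflow:
print(x.float() * 2)                            # tensor(120000.)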

You can verify the output dtype yourself:

import torch
from torch.nn.functional import linear
a = torch.ones(2, 3, dtype=torch.float16)  # input: (batch=2, in_features=3)
b = torch.ones(4, 3, dtype=torch.float16)  # weight: (out_features=4, in_features=3)
output = linear(a, b)                      # computes a @ b.T -> shape (2, 4)
print(output.dtype)

Expected output:

  • On CPU (newer PyTorch versions) → torch.float16 (older builds raise an error for fp16 matmul on CPU)

  • On GPU (any architecture, with or without autocast) → torch.float16

Note that print(output.dtype) only shows the output dtype, which always matches the fp16 inputs here; the fp32 discussed above is the internal accumulation type, which is not visible from the output dtype.

When both inputs (a and b) are torch.float16, PyTorch upcasts the internal computation to torch.float32 by default for numerical stability, especially on CPU and on some GPUs (e.g., older architectures). One rough way to probe this empirically is sketched below.
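
A sketch of that probe (assumptions: a PyTorch build where fp16 matmul works on CPU, and reading a small error as evidence of fp32 accumulation is a heuristic, not a guarantee):

import torch
from torch.nn.functional import linear

torch.manual_seed(0)
# A large inner dimension so accumulation precision actually matters.
a = torch.randn(8, 4096).half()
w = torch.randn(16, 4096).half()

out_fp16 = linear(a, w)                  # fp16 in, fp16 out
ref_fp32 = linear(a.float(), w.float())  # fp32 reference

# A tiny gap is consistent with fp32 (or wider) internal accumulation;
# a large gap would point to a pure-fp16 reduction.
print((out_fp16.float() - ref_fp32).abs().max())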

If running on Ampere (or newer) GPUs with Tensor Cores enabled, the computation might stay in fp16 for efficiency; a sketch of that path follows.
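
A minimal sketch of the GPU path (assumptions: a CUDA device is available, and torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction is the switch PyTorch exposes for permitting reduced-precision fp16 GEMM reductions):

import torch
from torch.nn.functional import linear

if torch.cuda.is_available():
    a = torch.ones(2, 3, device="cuda", dtype=torch.float16)
    w = torch.ones(4, 3, device="cuda", dtype=torch.float16)

    # Even on Tensor Cores, PyTorch requests fp32 accumulation from cuBLAS by
    # default; this flag (default True) only *permits* a faster
    # reduced-precision reduction, it does not force one.
    torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = True

    with torch.autocast(device_type="cuda", dtype=torch.float16):
        out = linear(a, w)

    print(out.dtype)  # torch.float16 -- the output dtype, not the accumulation dtype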
