
I am having a weird issue with PyTorch's autograd functionality when implementing a custom loss for a second-order differential equation. In the code below, the neural network's predictions are checked against a second-order differential equation. This works fine. However, when I try to compute the gradient of the loss with respect to the predictions, I get an error indicating that there is no connection between loss and u in the computational graph:

RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.

This doesn't make sense, because the loss is computed directly from the derivatives that originate from u. Differentiating the loss with respect to u_xx and u_t works; differentiating with respect to u_x does NOT. We verified that .requires_grad is True for all variables (X, u, u_d, u_x, u_t, u_xx).

Why does this happen, and how to fix this?

Main code:

# Ensure X requires gradients
X.requires_grad_(True)

# Get model predictions
u = self.pinn(X)

# Compute first-order gradients (∂u/∂x and ∂u/∂t)
u_d = torch.autograd.grad(
    u,
    X,
    grad_outputs=torch.ones_like(u),
    retain_graph=True,
    create_graph=True,  # Allow higher-order differentiation
)[0]

# Extract derivatives
u_x, u_t = u_d[:, 0], u_d[:, 1]  # ∂u/∂x and ∂u/∂t

# Compute second-order derivative ∂²u/∂x²
u_xx = torch.autograd.grad(
    u_x,
    X,
    grad_outputs=torch.ones_like(u_x),
    retain_graph=True,
    create_graph=True,
)[0][:, 0]

# Diffusion equation (∂u/∂t = κ * ∂²u/∂x²)
loss = nn.functional.mse_loss(u_t, self.kappa * u_xx)

## THIS FAILS
# Compute ∂loss/∂u
loss_u = torch.autograd.grad(
    loss,
    u,
    grad_outputs=torch.ones_like(loss),
    retain_graph=True,
    create_graph=True,
)[0]

# Return error on diffusion equation
return loss
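For reference, the snippet above can be turned into a minimal self-contained reproduction. The small stand-in network and the kappa value below are placeholders for self.pinn and self.kappa:

```python
import torch
import torch.nn as nn

# Stand-ins for self.pinn and self.kappa (hypothetical values)
pinn = nn.Sequential(nn.Linear(2, 50), nn.Tanh(), nn.Linear(50, 1))
kappa = 0.1

X = torch.rand(8, 2, requires_grad=True)  # columns: [x, t]
u = pinn(X)

# First-order derivatives (du/dx and du/dt)
u_d = torch.autograd.grad(u, X, grad_outputs=torch.ones_like(u),
                          create_graph=True)[0]
u_x, u_t = u_d[:, 0], u_d[:, 1]

# Second-order derivative d2u/dx2
u_xx = torch.autograd.grad(u_x, X, grad_outputs=torch.ones_like(u_x),
                           create_graph=True)[0][:, 0]

loss = nn.functional.mse_loss(u_t, kappa * u_xx)

# This raises the RuntimeError described above
try:
    torch.autograd.grad(loss, u, retain_graph=True)
except RuntimeError as e:
    print(e)
```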

Model:

==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
Sequential                               [1, 1]                    --
├─Linear: 1-1                            [1, 50]                   150
├─Tanh: 1-2                              [1, 50]                   --
├─Linear: 1-3                            [1, 50]                   2,550
├─Tanh: 1-4                              [1, 50]                   --
├─Linear: 1-5                            [1, 50]                   2,550
├─Tanh: 1-6                              [1, 50]                   --
├─Linear: 1-7                            [1, 50]                   2,550
├─Tanh: 1-8                              [1, 50]                   --
├─Linear: 1-9                            [1, 1]                    51
==========================================================================================
Total params: 7,851
Trainable params: 7,851
Non-trainable params: 0
Total mult-adds (M): 0.01
==========================================================================================
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.03
Estimated Total Size (MB): 0.03
==========================================================================================

What we have already tried:

Reverted to an older PyTorch version (tested on 2.5.0 and 1.13.1). Same issue.

Putting .requires_grad_(True) after every variable assignment. This did not help.

We also tried replacing the tensor slicing with multiplication by zero/one vectors, without success. We thought the slicing might disturb the computational graph and break the connection to u.

# Extract derivatives
u_x, u_t = u_d[:, 0], u_d[:, 1]  # ∂u/∂x and ∂u/∂t

# Extract derivatives alternative
u_x = torch.sum(
    torch.reshape(torch.tensor([1, 0], device=u_d.device), [1, -1]) * u_d,
    dim=1,
    keepdim=True,
)
u_t = ...

Thanks for your help!


asked Feb 12 at 13:05 by Wout Rombouts; edited Feb 12 at 14:06 by simon

1 Answer


I think the issue is how you extract the derivatives. Whether you use the slicing version or the alternative you mentioned, the same problem persists: the resulting u_x, and the u_xx derived from it, are not used directly in the loss function beyond u_xx, so torch cannot find a path from u to u_xx through u_x.
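One way to see the mechanism in isolation: the tensor returned by torch.autograd.grad is connected back to the inputs you differentiated with respect to, but the original output does not have to appear anywhere in that new graph. A toy example:

```python
import torch

x = torch.tensor([2.0], requires_grad=True)
y = x ** 2
dy_dx = torch.autograd.grad(y, x, create_graph=True)[0]  # dy_dx = 2*x

# Differentiating dy_dx w.r.t. x works: the second derivative of x**2 is 2
d2y_dx2 = torch.autograd.grad(dy_dx, x, retain_graph=True)[0]
print(d2y_dx2)  # tensor([2.])

# Differentiating dy_dx w.r.t. y fails: y is not a node in dy_dx's graph
# (dy_dx is built from x alone), so this raises the same
# "appears to not have been used in the graph" RuntimeError
try:
    torch.autograd.grad(dy_dx, y)
except RuntimeError as e:
    print(e)
```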

This leaves you with two options:

  1. If it is acceptable that part of u's gradient is unused (i.e. zero), you can set the allow_unused flag to True when computing the gradient of the loss with respect to u:

loss_u = torch.autograd.grad(
    loss,
    u,
    grad_outputs=torch.ones_like(loss),
    allow_unused=True,
    retain_graph=True,
    create_graph=True,
)[0]
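Note that with allow_unused=True, the gradient returned for an unused input is None rather than a zero tensor, so it may need to be materialized before downstream use. A minimal sketch:

```python
import torch

x = torch.tensor([3.0], requires_grad=True)
z = torch.tensor([1.0], requires_grad=True)
y = x * 2  # depends on x only; z is unused in y's graph

# Without allow_unused=True this would raise; with it, the unused slot is None
gx, gz = torch.autograd.grad(y, (x, z), allow_unused=True)
print(gx, gz)  # tensor([2.]) None

# Materialize the missing gradient as zeros if later code expects a tensor
gz = gz if gz is not None else torch.zeros_like(z)
```

Recent PyTorch releases also accept a materialize_grads=True argument to torch.autograd.grad that returns zeros instead of None automatically; check whether it is available in your version.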

  2. If you want the gradient to fully propagate through u, including u_x, you can add an extra loss term that enforces the dependency:

# Auxiliary term: its value is identically zero, but it keeps u_x in the loss graph
u_x_loss = alpha * mse_loss(u_x, u_x.detach())
loss = mse_loss(u_t, self.kappa * u_xx) + u_x_loss

Note: the extra u_x_loss term does not need to be an MSE loss specifically; any loss function that enforces the dependency works (e.g. MAE loss, a gradient penalty, etc.).

You can set alpha very small if you don't want the term to have a large effect, but be careful: if it is too small you can run into numerical issues.

TLDR

It doesn't matter whether you extract the derivatives with slicing or with the scalar-multiplication alternative; the graph is broken either way. Either allow unused tensors in the gradient calculation, or add a u_x loss term to keep the graph intact.
