python - PyTorch vs MXNet: Significant Performance Discrepancy in Graph Neural Network Implementation - Stack Overflow

Updated: 2025-01-08

I am converting a Graph Neural Network (GNN) model from MXNet to PyTorch and seeing a large performance gap during training. The same architecture and data yield much worse results in PyTorch than in the MXNet implementation. GitHub implementation at:

The model is a Spatio-Temporal Graph Convolutional Network (ST-GNN) used for traffic forecasting. The input data has the shape (num_samples, time_steps, num_nodes, num_features). The adjacency matrix (adj_mx) is constructed using a combination of connectivity and dynamic time warping (DTW) matrices.

Training Results:

MXNet Results (First 3 Epochs):

training: Epoch: 1, RMSE: 129.24, MAE: 96.25, time: 128.01 s
validation: Epoch: 1, loss: 40.85, time: 144.13 s
test: Epoch: 1, MAE: 39.81, MAPE: 29.27, RMSE: 58.48, time: 165.47s

training: Epoch: 2, RMSE: 50.62, MAE: 34.54, time: 124.97 s
validation: Epoch: 2, loss: 32.37, time: 140.61 s
test: Epoch: 2, MAE: 31.57, MAPE: 23.59, RMSE: 45.88, time: 161.53s

training: Epoch: 3, RMSE: 43.97, MAE: 29.62, time: 125.43 s
validation: Epoch: 3, loss: 31.18, time: 141.05 s
test: Epoch: 3, MAE: 30.52, MAPE: 24.56, RMSE: 44.58, time: 161.73s

PyTorch Results (First 3 Epochs):

Training: Epoch: 1, Loss: 225979.7226, Time: 41.94s
Validation: Epoch: 1, Loss: 222.6407, Time: 46.58s
Test: Epoch: 1, MAE: 219.72, MAPE: 101.28, RMSE: 270.32, Time: 53.61s

Training: Epoch: 2, Loss: 62675.5868, Time: 42.25s
Validation: Epoch: 2, Loss: 155.2467, Time: 47.07s
Test: Epoch: 2, MAE: 154.06, MAPE: 91.94, RMSE: 206.60, Time: 55.12s

Training: Epoch: 3, Loss: 30421.8932, Time: 42.49s
Validation: Epoch: 3, Loss: 130.6325, Time: 47.29s
Test: Epoch: 3, MAE: 127.57, MAPE: 147.89, RMSE: 166.99, Time: 55.14s
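
Note that the two training lines are not directly comparable: the MXNet log reports RMSE and MAE for training, while the PyTorch log reports only the raw loss. A minimal sketch of a framework-neutral metric helper follows, assuming predictions and targets from either framework are first converted to NumPy arrays in the original (de-normalized) units. The function name `regression_metrics`, the `eps` threshold, and the choice to skip near-zero targets in MAPE are assumptions for illustration, not taken from either implementation.

```python
import numpy as np

def regression_metrics(pred, target, eps=1e-5):
    """Compute MAE, RMSE and MAPE (in percent) on plain arrays.

    Assumption: MAPE ignores targets whose magnitude is below `eps`
    to avoid division by zero. The real code may mask differently.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    err = pred - target
    mae = np.abs(err).mean()
    rmse = np.sqrt((err ** 2).mean())
    mask = np.abs(target) > eps  # keep only targets safely away from zero
    mape = (np.abs(err[mask]) / np.abs(target[mask])).mean() * 100.0
    return {"MAE": float(mae), "RMSE": float(rmse), "MAPE": float(mape)}

# Tiny made-up example, not real traffic data.
target = np.array([[100.0, 200.0], [0.0, 50.0]])
pred = np.array([[110.0, 180.0], [5.0, 50.0]])
m = regression_metrics(pred, target)
print(m)
```

Running the same helper on the outputs of both frameworks for the same batch would show whether the gap is in the predictions themselves or only in how each script aggregates and prints its numbers.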

Config:

{
    "module_type": "individual",
    "act_type": "GLU",
    "temporal_emb": true,
    "spatial_emb": true,
    "use_mask": true,
    "first_layer_embedding_size": 64,
    "filters": [
        [64, 64, 64],
        [64, 64, 64],
        [64, 64, 64]
    ],
    "batch_size": 32,
    "optimizer": "adam",
    "learning_rate": 1e-3,
    "epochs": 200,
    "max_update_factor": 1,
    "ctx": 0,
    "adj_filename": "./data/PEMS04/PEMS04.csv",
    "id_filename": null,
    "graph_signal_matrix_filename": "./data/PEMS04/PEMS04.npz",
    "num_of_vertices": 307,
    "points_per_hour": 12,
    "num_for_predict": 12,
    "num_of_features": 1,
    "adj_dtw_filename": "./data/adj_PEMS04_001.csv"
}

What I have checked so far: I verified that the adjacency matrix (adj_mx) is consistent in shape and values between MXNet and PyTorch. I ensured that the model architecture matches, including layers, activations (GLU), and the number of parameters. I used the same loss function (Huber loss) and optimizer (Adam with lr=0.001) in both implementations. I checked the input data normalization to ensure it is consistent between frameworks. I confirmed that weight initialization in PyTorch uses Xavier initialization, as in MXNet.
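
"Same Huber loss" and "same Xavier initialization" can still hide parameter differences. The sketch below writes out, in NumPy, the per-element formulas as I understand them from the documentation of `torch.nn.HuberLoss` (parameter `delta`) and `mxnet.gluon.loss.HuberLoss` (parameter `rho`), and the uniform Xavier bounds of `torch.nn.init.xavier_uniform_` (parameter `gain`) and `mx.init.Xavier` with `rnd_type="uniform"`, `factor_type="avg"` (parameter `magnitude`). The helper names are hypothetical, and whether the actual scripts use these exact classes or defaults is an assumption to verify.

```python
import math
import numpy as np

def huber_pytorch_form(diff, delta=1.0):
    """Per-element form documented for torch.nn.HuberLoss."""
    a = np.abs(diff)
    return np.where(a < delta, 0.5 * a ** 2, delta * (a - 0.5 * delta))

def huber_mxnet_form(diff, rho=1.0):
    """Per-element form documented for mxnet.gluon.loss.HuberLoss."""
    a = np.abs(diff)
    return np.where(a < rho, 0.5 / rho * a ** 2, a - 0.5 * rho)

def xavier_bound_pytorch(fan_in, fan_out, gain=1.0):
    """Uniform bound used by torch.nn.init.xavier_uniform_."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))

def xavier_bound_mxnet(fan_in, fan_out, magnitude=3.0):
    """Uniform bound of mx.init.Xavier with factor_type='avg'."""
    return math.sqrt(magnitude / ((fan_in + fan_out) / 2.0))

diff = np.linspace(-5.0, 5.0, 101)
# With threshold 1 the two Huber forms coincide; with any other value they do not.
same_at_one = np.allclose(huber_pytorch_form(diff, 1.0), huber_mxnet_form(diff, 1.0))
same_at_two = np.allclose(huber_pytorch_form(diff, 2.0), huber_mxnet_form(diff, 2.0))
print("Huber forms equal at 1.0:", same_at_one, "| at 2.0:", same_at_two)

# Default Xavier settings give the same bound for a 64x64 weight ...
print(xavier_bound_pytorch(64, 64), xavier_bound_mxnet(64, 64))
# ... but a non-default magnitude (1.0 here is a made-up value) does not.
print(xavier_bound_mxnet(64, 64, magnitude=1.0))
```

If the MXNet script passes a non-default `rho` or `magnitude`, the PyTorch port would need a matching `delta` or `gain` (or a custom initializer) rather than the defaults.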

What could cause such a significant performance gap between the PyTorch and MXNet implementations of the same model? Are there specific areas in PyTorch (e.g., matrix operations, batching, or gradient computation) that require attention when porting from MXNet? Any suggestions for debugging or aligning the two implementations more closely?
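
One way to narrow this down is to load identical weights into both models, feed the same batch, dump intermediate activations to NumPy, and find the first layer where they diverge. The sketch below assumes such dumps already exist as ordered dictionaries of arrays. The function name `first_mismatch`, the layer names, and the synthetic data are hypothetical and only illustrate the comparison step.

```python
import numpy as np

def first_mismatch(ref, other, rtol=1e-4, atol=1e-5):
    """Return (name, reason) for the first differing entry, or None.

    `ref` and `other` map layer names to NumPy arrays, in forward order.
    """
    for name, ref_arr in ref.items():
        if name not in other:
            return name, "missing in second dump"
        arr = other[name]
        if ref_arr.shape != arr.shape:
            return name, f"shape {ref_arr.shape} vs {arr.shape}"
        if not np.allclose(ref_arr, arr, rtol=rtol, atol=atol):
            max_abs = float(np.max(np.abs(ref_arr - arr)))
            return name, f"max abs diff {max_abs:.3g}"
    return None

# Synthetic stand-ins for real dumps: (batch, time_steps, num_nodes, num_features).
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 12, 307, 1)).astype(np.float32)
mxnet_dump = {"input": x, "layer_0": x * 2.0, "output": x.sum(axis=1)}
torch_dump = {"input": x.copy(), "layer_0": x * 2.0 + 0.5, "output": x.sum(axis=1)}

result = first_mismatch(mxnet_dump, torch_dump)
print(result)  # the first differing entry here is "layer_0"
```

A shape mismatch reported by this check would point at an axis-order or broadcasting difference, while a value mismatch with matching shapes would point at the layer's arithmetic or its parameters.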
