How to GPU-accelerate PDE solvers in Python

I have been working on a small Python package to solve a class of PDEs using scipy.integrate.solve_ivp. As discretizations are made finer, runtime becomes a bottleneck—especially when I need to solve the PDE for a large number of different initial conditions.

I would like to make use of GPU acceleration to speed things up, but I am unsure of how to integrate GPU-based computations into my current implementation. Here is an example of my implementation on Google Colab. In the notebook, I also tried using CuPy to transfer data to the GPU, perform a forward step, and then transfer back to the CPU, but the transfer overhead was too large.

Would I have to rewrite the solvers in something like CuPy/JAX to make use of GPUs?

asked Feb 22 at 18:31 by user572780, edited Feb 24 at 17:44
  • I don't think you can do that with the built-in solve_ivp. It has a vectorized argument, but that doesn't seem to be used for anything but numerical differentiation. You might have to write your own solver. – Nick ODell Commented Feb 22 at 20:21
  • have you considered using numba with numbakit-ode or numbalsoda instead? – Nin17 Commented Feb 22 at 21:28
  • @Nin17 From what I can tell, Numba doesn't work with many NumPy functions like FFT, which makes it impractical for PDE solving. – user572780 Commented Feb 24 at 19:02
  • It does if you install rocket-fft; you can also use unsupported functions with objmode (though maybe not in a cfunc). – Nin17 Commented Feb 24 at 20:58
  • Alternatively, you could use jax and diffrax, which would allow GPU computing (see the sketch after these comments). – Nin17 Commented Feb 24 at 22:41
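To illustrate the jax + diffrax suggestion from the comments, here is a minimal sketch under assumed details: a 1-D heat equation with a periodic finite-difference Laplacian. The grid size, diffusivity, tolerances, and solver choice are placeholders, not anything from the question.

```python
import jax.numpy as jnp
import diffrax

def rhs(t, u, args):
    # Periodic second-difference Laplacian: du/dt = nu * u_xx.
    dx, nu = args
    return nu * (jnp.roll(u, -1) - 2 * u + jnp.roll(u, 1)) / dx**2

n = 256
x = jnp.linspace(0.0, 1.0, n, endpoint=False)
u0 = jnp.sin(2 * jnp.pi * x)

sol = diffrax.diffeqsolve(
    diffrax.ODETerm(rhs),
    diffrax.Tsit5(),
    t0=0.0, t1=0.1, dt0=1e-5,
    y0=u0,
    args=(1.0 / n, 0.01),  # (dx, nu) -- illustrative values
    saveat=diffrax.SaveAt(ts=jnp.linspace(0.0, 0.1, 11)),
    stepsize_controller=diffrax.PIDController(rtol=1e-6, atol=1e-8),
)
```

Because diffrax is built on JAX, the same code runs on a GPU when JAX is installed with CUDA support, and jax.vmap can batch the solve over many initial conditions, which speaks to the "large number of different initial conditions" part of the question.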

1 Answer


Consider adding some example code and focusing your questions to avoid this being closed. In the meantime, there are things I can say.

Question 1:

I've gotten a ~60x speedup using a CuPy callable with solve_ivp (Tesla P100 GPU on Colab in 2020), despite the overhead of back-and-forth data transfer.

As you described, at the beginning of the callable I transferred the state to a CuPy array on the GPU, and after CuPy evaluated the time derivative, I transferred the result back to a NumPy array on the CPU. This is documented in section 4.2.2 of this notebook.
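A minimal sketch of that pattern, assuming a simple three-point stencil as the right-hand side (the stencil, sizes, and coefficient are placeholders, not the code from the linked notebook):

```python
import numpy as np
import cupy as cp
from scipy.integrate import solve_ivp

def rhs(t, y):
    y_gpu = cp.asarray(y)  # host -> device transfer
    # Evaluate the time derivative on the GPU (illustrative stencil;
    # boundary values held fixed).
    dydt = cp.zeros_like(y_gpu)
    dydt[1:-1] = 0.01 * (y_gpu[2:] - 2 * y_gpu[1:-1] + y_gpu[:-2])
    return cp.asnumpy(dydt)  # device -> host transfer

y0 = np.sin(np.linspace(0, np.pi, 100_000))
sol = solve_ivp(rhs, (0.0, 0.1), y0, method="RK45")
```

The two transfers happen on every right-hand-side evaluation, so this only pays off when the per-step GPU work dominates the transfer cost.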

Of course, it depends on your application. I was simulating all pairwise interactions between hundreds of "robots". There's not enough information here to say whether the same approach will work well in your case.

Question 2:

This also depends on the method you're using and the dynamics. If you're using an implicit method, it sounds like you could end up with a huge but very sparse system of equations. If you can provide the Jacobian and jac_sparsity, there may be hope. If you're using an explicit method and the dynamics are not stiff, it might also be OK.
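As a hedged sketch of the implicit route, assuming a 1-D three-point stencil so the Jacobian is tridiagonal (the sizes and coefficient are made up for illustration):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.sparse import diags

n = 1000

def rhs(t, y):
    # Interior second differences; boundary values held fixed.
    dydt = np.zeros_like(y)
    dydt[1:-1] = 0.01 * (y[2:] - 2 * y[1:-1] + y[:-2])
    return dydt

# Tridiagonal sparsity pattern: each dy_i/dt depends only on
# y_{i-1}, y_i, y_{i+1}.
sparsity = diags([1, 1, 1], [-1, 0, 1], shape=(n, n))

y0 = np.sin(np.linspace(0, np.pi, n))
sol = solve_ivp(rhs, (0.0, 1.0), y0, method="BDF", jac_sparsity=sparsity)
```

Given the sparsity pattern, BDF can approximate the Jacobian with far fewer right-hand-side evaluations and solve its linear systems with sparse factorizations, which is often what makes fine discretizations feasible.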
