I am working on a Python project that involves implementing a dynamic programming (DP) algorithm, but the state dependencies are not straightforward. Here's a simplified version of my problem:

I need to calculate the minimum cost to traverse a 2D grid where each cell has a cost, but the movement rules are unusual:

You can move down, right, or diagonally down-right. Moving diagonally has an extra penalty depending on the sum of the costs of the starting and ending cells. Additionally, the cost to move into a cell may depend on whether the previous move was horizontal, vertical, or diagonal. For example, if grid[i][j] is the cost of cell (i, j), then the cost to reach (i, j) from (i-1, j-1) (diagonal) would be:

dp[i][j] = dp[i-1][j-1] + grid[i][j] + penalty_function(grid[i-1][j-1], grid[i][j])

But from (i-1, j) (vertical), it would simply be:

dp[i][j] = dp[i-1][j] + grid[i][j]

I attempted the following approach:

def min_cost(grid):
    rows, cols = len(grid), len(grid[0])
    dp = [[float('inf')] * cols for _ in range(rows)]
    dp[0][0] = grid[0][0]  # Starting point

    # First row and first column: only one kind of move is possible there
    for j in range(1, cols):
        dp[0][j] = dp[0][j-1] + grid[0][j]      # horizontal moves only
    for i in range(1, rows):
        dp[i][0] = dp[i-1][0] + grid[i][0]      # vertical moves only

    for i in range(1, rows):
        for j in range(1, cols):
            vertical = dp[i-1][j] + grid[i][j]
            horizontal = dp[i][j-1] + grid[i][j]
            diagonal = dp[i-1][j-1] + grid[i][j] + penalty_function(grid[i-1][j-1], grid[i][j])
            dp[i][j] = min(vertical, horizontal, diagonal)

    return dp[-1][-1]

However, this becomes inefficient for larger grids because the penalty function itself can be computationally expensive, and the solution doesn't scale well when the grid size exceeds 1000x1000.

Is there a way to optimize this DP approach, possibly by memoizing or precomputing parts of the penalty function? Would switching to libraries like NumPy or using parallel processing help in this scenario? Are there Python-specific tricks (e.g., @functools.lru_cache, generators) that I could use to improve performance while keeping the code clean and readable?

asked Nov 20, 2024 at 12:46 by Plamen Nikolov
  • We need to see the code of the penalty_function to help. – MrSmith42 Commented Nov 20, 2024 at 13:12
  • Pure-Python code is very slow because it manipulates dynamically-allocated reference-counted objects, and the default CPython implementation is a slow interpreter which optimizes nearly nothing. Lists of lists are slow because you need two fetches of dynamic objects, type checking, dynamic function calls, etc. Please don't do such computations in CPython (except maybe for small data or prototyping). – Jérôme Richard Commented Nov 20, 2024 at 18:00
  • Vectorizing this code with Numpy is complicated, but you can use Numpy+Cython (with views) or Numpy+Numba. The resulting compiled/JITed code should be at least 10 times faster (probably much more in practice, like 100 times). Not to mention the computation can be parallelized (not easy though, unless you write a native C/C++ module using OpenMP). All of this assumes penalty_function can be JITed/compiled too (otherwise the function call overhead will be the main bottleneck anyway). – Jérôme Richard Commented Nov 20, 2024 at 18:03
  • @JérômeRichard Writing PyPy code looks just like using a restricted subset of Python, and it is plenty fast. – btilly Commented Nov 20, 2024 at 18:15
  • @btilly PyPy helps a lot because it is a JIT, as opposed to CPython. But it is not as good as Numba, mainly because PyPy is less restricted. If you are not convinced, try to write matrix multiplication code with PyPy and lists. I did it. PyPy does not really vectorize the code (i.e. use SIMD instructions), mainly because this is expensive (and PyPy is a tracing JIT, not a method JIT) and also because of a lot of guards (used to check types and unusual things). Because of that, the optimized PyPy code was an order of magnitude slower than optimized native code (including Numba, which is as fast as C). – Jérôme Richard Commented Nov 20, 2024 at 19:20
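
To make the Numba suggestion from the comments concrete, here is a minimal sketch, assuming grid is a NumPy float array and that the real penalty can be expressed as JIT-compilable arithmetic (the penalty below is only a placeholder):

import numpy as np
from numba import njit

@njit(cache=True)
def penalty(a, b):
    # Placeholder: stand-in for the real, possibly expensive penalty_function
    return 0.1 * (a + b)

@njit(cache=True)
def min_cost_numba(grid):
    rows, cols = grid.shape
    dp = np.full((rows, cols), np.inf)
    dp[0, 0] = grid[0, 0]
    for j in range(1, cols):   # first row: only horizontal moves possible
        dp[0, j] = dp[0, j-1] + grid[0, j]
    for i in range(1, rows):   # first column: only vertical moves possible
        dp[i, 0] = dp[i-1, 0] + grid[i, 0]
    for i in range(1, rows):
        for j in range(1, cols):
            best = dp[i-1, j] + grid[i, j]                      # vertical
            best = min(best, dp[i, j-1] + grid[i, j])           # horizontal
            best = min(best, dp[i-1, j-1] + grid[i, j]
                       + penalty(grid[i-1, j-1], grid[i, j]))   # diagonal
            dp[i, j] = best
    return dp[rows-1, cols-1]

# e.g. min_cost_numba(np.asarray(grid, dtype=np.float64))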

2 Answers

DP can proceed in two ways.

The first is bottom up. That's harder to write, but often more memory-efficient.

The second is top down. Just write a recursive function, then memoize it.

If you're struggling with bottom up, just try top down.
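
For illustration, a minimal top-down sketch of that idea, assuming penalty_function is passed in as an argument (the recursion limit has to be raised, and on very large grids deep recursion can still overflow the interpreter's C stack):

import sys
from functools import lru_cache

def min_cost_topdown(grid, penalty_function):
    rows, cols = len(grid), len(grid[0])
    # each cell recurses at most rows + cols levels deep
    sys.setrecursionlimit(max(sys.getrecursionlimit(), 4 * (rows + cols)))

    @lru_cache(maxsize=None)
    def best(i, j):
        if i == 0 and j == 0:
            return grid[0][0]
        candidates = []
        if i > 0:               # vertical move from (i-1, j)
            candidates.append(best(i-1, j) + grid[i][j])
        if j > 0:               # horizontal move from (i, j-1)
            candidates.append(best(i, j-1) + grid[i][j])
        if i > 0 and j > 0:     # diagonal move from (i-1, j-1), with penalty
            candidates.append(best(i-1, j-1) + grid[i][j]
                              + penalty_function(grid[i-1][j-1], grid[i][j]))
        return min(candidates)

    return best(rows - 1, cols - 1)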

On a 1000x1000 grid, though, this means memoizing a million values that are accessed in awkward patterns, so bottom up makes more sense here. Instead of the other answer's suggestion of keeping rows, I would process by anti-diagonals: at its widest a diagonal is the same size as a row, but it is usually smaller, which improves cache usage.
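
A minimal sketch of that anti-diagonal order, assuming penalty_function is passed in; a cell (i, j) lies on diagonal d = i + j and depends only on diagonals d-1 and d-2:

def min_cost_antidiagonal(grid, penalty_function):
    rows, cols = len(grid), len(grid[0])
    INF = float('inf')
    # Diagonals are indexed by row i for simplicity (length rows),
    # even though a diagonal can be shorter than a full row.
    diag_prev2 = [INF] * rows   # diagonal d-2
    diag_prev1 = [INF] * rows   # diagonal d-1
    for d in range(rows + cols - 1):
        diag_curr = [INF] * rows
        for i in range(max(0, d - cols + 1), min(rows - 1, d) + 1):
            j = d - i
            if i == 0 and j == 0:
                diag_curr[i] = grid[0][0]
                continue
            best = INF
            if i > 0:               # vertical move from (i-1, j) on diagonal d-1
                best = min(best, diag_prev1[i-1] + grid[i][j])
            if j > 0:               # horizontal move from (i, j-1) on diagonal d-1
                best = min(best, diag_prev1[i] + grid[i][j])
            if i > 0 and j > 0:     # diagonal move from (i-1, j-1) on diagonal d-2
                best = min(best, diag_prev2[i-1] + grid[i][j]
                           + penalty_function(grid[i-1][j-1], grid[i][j]))
            diag_curr[i] = best
        diag_prev2, diag_prev1 = diag_prev1, diag_curr
    return diag_prev1[rows - 1]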

You don't need to store the whole 2D array of dp values: your recurrence only reads the current and previous rows. Using only two rows doesn't change the computational complexity, but it does reduce the space complexity, so in practice the smaller memory footprint will probably give a performance boost.
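
A minimal sketch of that two-row version, assuming penalty_function is passed in:

def min_cost_two_rows(grid, penalty_function):
    rows, cols = len(grid), len(grid[0])
    # previous row of dp values; the first row allows only horizontal moves
    prev = [0.0] * cols
    prev[0] = grid[0][0]
    for j in range(1, cols):
        prev[j] = prev[j-1] + grid[0][j]
    for i in range(1, rows):
        curr = [0.0] * cols
        curr[0] = prev[0] + grid[i][0]   # first column: only vertical moves
        for j in range(1, cols):
            vertical = prev[j] + grid[i][j]
            horizontal = curr[j-1] + grid[i][j]
            diagonal = (prev[j-1] + grid[i][j]
                        + penalty_function(grid[i-1][j-1], grid[i][j]))
            curr[j] = min(vertical, horizontal, diagonal)
        prev = curr
    return prev[-1]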
