admin管理员组

文章数量:1296889

I'm implementing an A* search algorithm for a maze solver in Python. Initially, I maintained the open set as a plain set and selected the node with the lowest f-score using:

current = min(open_set, key=lambda cell: f_score[cell])

This version of the algorithm tended to explore in a directed fashion toward the goal (almost like greedy best-first search), expanding relatively few nodes.

However, when I switched to using a heapq (i.e., a priority queue) for the open set, the behavior of the algorithm changed noticeably. Instead of quickly homing in on the goal, it now expands nodes in a broad, cone-like pattern that resembles breadth-first search, and it ends up exploring many more nodes.

My questions are:

  • What are the common pitfalls or differences between using a set with min() and a heap-based priority queue for the A* open set?
  • How might issues like duplicate/outdated entries and tie-breaking in the heapq affect the search order?
  • Is there a recommended strategy to preserve the directed search behavior of A* when switching to a heapq in Python?

Any insights into why the switch to a heapq might cause these drastic behavioral changes and how to mitigate them would be greatly appreciated.

I updated the code to push tuples like (f_score[node], count, node) into the heap and used a counter as a tie-breaker. I also maintain a closed set to filter out outdated entries when popping from the heap. Despite that, the exploration order seems significantly different.

I'm implementing an A* search algorithm for a maze solver in Python. Initially, I maintained the open set as a plain set and selected the node with the lowest f-score using:

current = min(open_set, key=lambda cell: f_score[cell])

This version of the algorithm tended to explore in a directed fashion toward the goal (almost like greedy best-first search), expanding relatively few nodes.

However, when I switched to using a heapq (i.e., a priority queue) for the open set, the behavior of the algorithm changed noticeably. Instead of quickly homing in on the goal, it now expands nodes in a broad, cone-like pattern that resembles breadth-first search, and it ends up exploring many more nodes.

My questions are:

  • What are the common pitfalls or differences between using a set with min() and a heap-based priority queue for the A* open set?
  • How might issues like duplicate/outdated entries and tie-breaking in the heapq affect the search order?
  • Is there a recommended strategy to preserve the directed search behavior of A* when switching to a heapq in Python?

Any insights into why the switch to a heapq might cause these drastic behavioral changes and how to mitigate them would be greatly appreciated.

I updated the code to push tuples like (f_score[node], count, node) into the heap and used a counter as a tie-breaker. I also maintain a closed set to filter out outdated entries when popping from the heap. Despite that, the exploration order seems significantly different.

Share Improve this question edited Feb 11 at 18:32 Tao asked Feb 11 at 17:58 TaoTao 111 bronze badge 3
  • Duplicates maybe? Any tiebreakers? Sidenote: min(open_set) is O(N), I don't think you want this. Also orders are not guaranteed in sets, just in case. – aaa Commented Feb 11 at 18:08
  • Try using a descending counter for tiebreaking with heapq. – user2357112 Commented Feb 11 at 18:10
  • Thanks for the feedback! I’m currently using a closed set to filter out outdated entries (duplicates) when popping from the heap. For tie-breaking, I’m currently using an increasing counter in my tuple (f_score, count, node). As suggested I just tried the descending counter by changing (f_score, count, node) into (f_score, -count, node) but that didn't fix the issue (starting from the top-left it went straight to the right wall...) – Tao Commented Feb 11 at 18:25
Add a comment  | 

2 Answers 2

Reset to default 1

Your heapq-based implementation uses an ascending counter for tiebreaking. This means that when there's a tie, the entry added to the heap earliest wins. When there are a lot of equally-promising candidates, this tends to explore all of them "together".

Your set-based implementation's tiebreaking strategy is kind of just "pray". Whatever tied entry it sees first is the one that gets picked, but the order in which it sees entries is just whatever order they happened to land in the hash table.

This produces a bias that depends on the trailing bits of hashes, how big the hash table is, and what order the entries get added to the hash table. But the important thing is that if an entry lands in the back of the hash table, it'll probably stay in the back of the hash table until it gets removed. All tied entries that land further toward the front will get picked first. Hash table rebuilds can change this, if a collision or a resize causes the entry to end up in a new bucket post-rebuild, but entries will usually stay in place.

Effectively, this produces something kind of similar to prioritizing recent entries in case of a tie. If a tied entry has been in the hash table for a while without getting picked, it's probably toward the back of the table, so new entries that tie with it will probably land somewhere earlier in the table. It's not entirely the same as prioritizing recent entries, but there's a bias in that direction.


It sounds like in the test you ran, the algorithm happened to start off in a good direction by blind luck, and the recency bias effect kept it mostly exploring in that direction. But you got lucky. There's no guarantee the bias will push the search toward a good tied option.

When you tried an actual "most recent first" tiebreaking strategy, a search starting in the top left of the map ran straight toward the right wall. Apparently you didn't like that. But if there was a clear path straight to the right wall, and the goal was in the bottom right, then running straight to the right wall is entirely consistent with trying to beeline straight for the goal. It just didn't get lucky.

The A* algorithm requires a way to update a node's score (when you find a better path to it).

In your implementation using the min function this isn't an issue, because your key function looks at the updated score.

While you haven't shared your code, it's likely that the heap-based implementation doesn't update the priority of elements in the heap and thus retrieves the wrong element (It looks at the score at the time of insertion instead of the current one). Python's heapq doesn't have a method of updating an element's priority, because there's no good way to search for an existing element's position.

Normally, updating priorities in a heap requires either a hash table to map elements to their position in the heap, or using a specialized data structure such as a Fibonacci heap.

本文标签: