You are deep inside a dark mine tunnel, working hard to extract some gold. Then one of your pickaxe blows breaks open a pocket of carbon monoxide gas. Your gas detectors start blaring, and you realize you have very little time to escape before the gas fills the tunnel and leaves you without air. How do you find the fastest way out before it is too late?
Click the game and walk to the exit using the arrow keys. When you escape the game will tell you whether you actually found the shortest route.
This note is part of the Pathfinding in the Gold Mine series.
Setup From the Previous Notes
Every visual on this page builds on the same code we wrote in
Representing a Gold Mine: the MINE_MAP, the START / END markers, and
the neighbors() generator. The block below shows it in full, and you can edit
it freely. Replacing the map, swapping the order of MOVES, or rewriting
neighbors() is reflected immediately in the algorithms further down and in the
game above.
Two Strategies
There are many ways to explore a maze, but two of them show up everywhere:
- Depth-first search (DFS): pick a direction, follow it as deep as you can, and only backtrack when you hit a dead end. DFS is what happens naturally when you write a recursive function that calls itself on each neighbor.
- Breadth-first search (BFS): explore the mine in concentric rings. First all cells one step away from the start, then all cells two steps away, then three, and so on, until you find the exit.
DFS is one curious explorer that never hesitates; BFS is a wave of water spreading out from the start in every direction at the same speed.
Which one do you think will find the shortest path out of the mine?
DFS: The First Path We Find
Let’s start with DFS because it is the most natural to write. We walk into a cell, mark it as visited, recurse on each unvisited neighbor, and stop the moment we reach the exit.
This dfs function returns as soon as it finds any path from start to exit.
It does not care whether that path is short, it just commits to the first one it
stumbles into.
Press Play to watch the explorer follow the corridors. The yellow/orange line
is the current path it is holding on its path list; the highlighted cells are
everything it has visited along the way.
Look closely at the final orange path. Is it the shortest? Probably not. DFS happily follows long, winding corridors before considering shorter alternatives. It only knows “go deeper, backtrack on dead ends”.
Experiment TipsChange the order of
MOVESin theneighbors()function. How does that affect the path DFS finds? In general, it will bias the search toward certain directions, and that can lead to much longer or shorter paths depending on the layout of the mine. Can you find the order ofMOVESthat finds the shortest path?
DFS Without Recursion
The recursive version is the most natural one to write, but recursion is not what makes DFS work. The real engine is a stack.
Every recursive call pushes a new frame onto Python’s call stack, and every
return pops one back off. So in a very literal sense, recursive DFS is already a
stack-based algorithm. An iterative DFS just makes that stack explicit and
manages it with a while loop.
Below is the same “first path found” strategy written iteratively. Instead of asking Python to remember where to come back to, we keep our own stack of frames.
We keep a parent dictionary that remembers, for each cell, which neighbor
discovered it. Once we hit the exit we walk that dictionary backwards to
reconstruct the route.
Removing recursion does not change the behavior, but it is usually more efficient because we avoid the overhead of function calls. It also removes the risk of hitting Python’s recursion limit on large maps.
We are calling the
reconstructfunction on every iteration of the loop because it helps the replay show the path being explored at each step. In a more optimized implementation, we would only reconstruct the path once we find the exit, and that would save some time.
Experiment Tips
- This iterative version finds a different path than the recursive one. Can you figure out why? Hint: it has to do with the order in which neighbors are added to the stack vs the order in which they are explored. Can you modfiy the code above to make it find the same path as the recursive version?
- The recursive version marks a cell as visited when it enters that cell. The iterative version marks a cell as visited before pushing it onto the stack. This prevents the same cell from being pushed several times through different paths. If we do not mind having duplicate cells on the stack, we could instead mark a cell as visited only when we pop it from the stack. That would make the iterative version behave more like the recursive one: during the replay,
visitedwould contain only the cells that have actually been expanded, meaning the cells whose neighbors have been inspected.- We can also construct the path incrementally as we go, without the need of a
reconstructfunction and theparentdictionary, but that would require more bookkeeping and make the code less clear. Feel free to try it if you want!
BFS: The Shortest Path, Guaranteed
To find the shortest path we need to explore cells in order of distance from the start, not in order of how recently we discovered them.
That is exactly what swapping the stack for a queue achieves. Each iteration of BFS pops the oldest unexplored cell and expands its neighbors. As a consequence, every cell is reached for the first time via a shortest path from the start.
Similarly to iterative DFS, we keep a parent dictionary that remembers, for
each cell, which neighbor discovered it. Once we hit the exit we walk that
dictionary backwards to reconstruct the route.
Things to notice in the replay:
- The visited region grows as a roughly circular wave around the start. That is the BFS “expanding ring” in action.
- The current cell jumps around a lot, because the queue keeps mixing cells from different corridors. DFS, in contrast, marches in straight lines and only jumps when it backtracks.
- The final orange path is the shortest possible route between
SandE. No other path on this map can have fewer steps. However, depending on the map, there may be multiple shortest paths with the same length.
We are calling the
reconstructfunction on every iteration of the loop because it helps the replay show the path being explored at each step. In a more optimized implementation, we would only reconstruct the path once we find the exit, and that would save some time.
Why BFS Wins Here
Both algorithms eventually visit the same cells in the worst case, so they have the same big-O complexity: where and are the number of rows and columns of the grid. The difference is in what each algorithm guarantees when it stops:
- DFS guarantees a path, if one exists.
- BFS guarantees the shortest path, if one exists.
If you only need some path (for example, just to check whether the exit is reachable), DFS is great and, in general, uses very little memory. If you need the shortest, you almost always want BFS.
There is also a more subtle point. BFS keeps a queue of “frontier” cells, which can grow large in open areas. DFS keeps only the current path on its recursion stack, which is at most deep but is usually much shorter. So BFS trades a bit of memory for the shortest-path guarantee.
Code Similarity
Just for fun, look at the code for the iterative implementation of DFS and BFS above. They are almost identical. The only difference is:
- The queue vs stack data structure. Swap the stack for a queue and DFS becomes
BFS.
- Stack: last-in, first-out. The most recently discovered cell is explored next.
- Queue: first-in, first-out. The oldest discovered cell is explored next.
What Comes Next
In Escaping the Monster in the Mine we add a moving threat that turns the shortest-path problem into something quite different: now the route also has to avoid being intercepted. After that, in Escaping with the Most Gold, we flip the question on its head and look for the longest path the miner can take before escaping.
Bytes