Overview
I’ve recently learned about the most popular pathfinding algorithms for a course project and I thought it will be a good idea to transfer my notes onto an article to share the findings I gathered after playing around with the implementations of those.
While there are lots of articles on search algorithms that explain how they work in detail, I’ll try to encapsulate their basic characteristics and present how they compare to each other and identify their relative differences. I’ll also try to keep it simple and describe the concepts in my own words.
Pathfinding Algorithms
Pathfinding algorithms, as the name states, are used to find a way from point A to point B on a weighted graph. The way they operate is relatively similar to each other – they’re based on a recursive pattern of nodes (or vertices) where the first node is a starting point A, the ability to identify their adjacent nodes and a frontier (or a queue, fringe), which stores the nodes to be processed next. The main difference between the algorithms is the order in which the nodes are taken off the frontier to get processed, but it has a significant consequence on how the algorithm behaves.
Depending on the algorithm used, they can find any path or the shortest path, but a bit more on that later. The algorithms are also divided into informed and uninformed, which is based on the volume of information available to the agent (which is just a name for the entity searching – ex. a computer) before searching for the path.
Uninformed
Uninformed algorithms don’t know anything about the destination coordinates or distance, nor have other domain knowledge other than how to traverse the graph. You can say they’re blind and only know the starting point outside of the rules of moving around, which makes it harder for the agent to find the optimised path. They can only operate using brute force way. This type of algorithm is often used, for example, to find the way out of the maze.
Depth-First Search
A type of uninformed algorithm, which explores the deepest possible node and if doesn’t find the solution (I.e. a dead-end, not the destination point B), it “pulls back” up the node tree to the nearest decision point and explores other possibilities. It’s hard to predict how efficient it is for all the problems, as it can get “lucky” and find the shortest path the fastest.
The problem with depth-first search is that it will not always find the shortest path to the destination. I will settle on the first correct path it finds, which is not always optimal. If the path it finds relatively fast is the shortest one – great, but it could possibly be one of the last possible paths to explore on the graph, in which case all the nodes had to be processed to find it.
The difference in implementation between depth-first (DF) search and breadth-first (BF) search is trivial, but it has a great effect on how the agent finds the route. DF search uses a Stack (last in, first out) to pull the nodes from the frontier, whereas BF search uses a Queue (first in, first out).
Characteristics
- Not always finds the most optimal (the shortest) path to the destination
- Explores the deepest possible node, makes “random” decisions in decision points.
- Uses Stack data type to remove nodes from the frontier. In other words, takes the last added node.
- Well performing if manages to find the path quickly, but it’s pure luck if it does.
- Big O notation: O(V+E)
- Brute force algorithm (also called a blind search)
- Uses frontier + nodes implementation
Breadth-First Search
Breadth-first search is another brute force algorithm. It explores every possible node adjacent to the current node and does so recursively. This way, it will always make sure to find the most optimal (the shortest) path to the destination, unlike depth-first search.
Exploring every possible node until the agent finds the shortest path could be problematic from the performance point of view, but this is the consequence of a blind search.
Characteristics
- Always finds the most optimal (the shortest) path to the destination
- Explores all possible nodes concurrently. Makes no decision on decision points (since it chooses all paths at the same time)
- Uses Queue data type to remove nodes from the frontier. In other words, takes the first added node.
- Can cause performance issues if there are many possible routes to take.
- Big O notation: O(V+E)
- Brute force algorithm (also called a blind search)
- Uses frontier + nodes implementation
Informed
Informed algorithms, also known as Heuristic search have some domain knowledge (or information of the goal state). For example, the destination coordinates, which means you can calculate the distance between point A and point B in a straight line, or follow the known rules but ignoring the obstacles (otherwise known as Manhattan distance). Those are superior to uninformed algorithms because they allow making some calculated decisions when choosing paths to explore.
Greedy Best-First Search
It’s a type of informed search algorithm, which uses something called a heuristic function h(n) to determine which path to take when at the decision point. This function determines which path will be relatively closer to destination B – the estimated cost from the current location to the goal. The performance of the greedy best-first depends on how good the heuristic function is. For example, we can determine which path is better based on the Manhattan distance between the possible nodes and the end of the maze.
While this algorithm differs greatly from uninformed algorithms in terms of premises available to the agent, the implementation is strikingly similar and, again, the only implementation difference is at the decision points – which will use the heuristic function to determine where to go, so which node to process first, or in other words, which node to remove from the frontier.
Characteristics
- Not always finds the most optimal (the shortest) path to the destination
- Explores the node which is likely the closest to the destination, based on the heuristic function.
- Uses heuristic function h(n) to determine which node to process next.
- Well performing if manages to find the path quickly, but the performance heavily depends on the implementation of the heuristic function.
- Big O notation: O(b^m) where b is the amount of nodes and m is the maximum depth of the search space.
- Brute force algorithm (also called a blind search)
- Uses frontier + nodes implementation
A* (A-Star) Search
A-Star is one of the most well-known pathfinding algorithms. It’s based on the Greedy best-first approach, but in addition to the heuristic function, it also uses a function g(n), which tracks the cost of the current computation. This way, it can “back off” to an earlier node when there’s a possibility of a cheaper/shorter path to be explored. It then guarantees the shortest possible path, given that both the heuristic function and the cost function are optimal.
Characteristics
- Always finds the most optimal (the shortest) path to the destination
- Explores the node which is likely the closest to the destination, based on the heuristic function.
- Uses heuristic function h(n) to determine which node to process next.
- Well performing if manages to find the path quickly, but the performance heavily depends on the implementation of the heuristic function.
- Big O notation: O(b^d)
- Brute force algorithm (also called a blind search)
- Uses frontier + nodes implementation
Conclusion
In this article, we explored pathfinding algorithms and their characteristics. Hopefully, it will help you find the right one for your use case, or at least broaden your horizons, so you can appreciate the inner workings of tools like Google Maps.
There are many more available algorithms out there, so I highly encourage you to explore them!