Graph searching: Breadth-first vs. depth-first

69

44

When searching graphs, there are two easy algorithms: breadth-first and depth-first (usually implemented by adding all adjacent graph nodes to a queue (breadth-first) or a stack (depth-first)).

Now, are there any advantages of one over another?

The ones I could think of:

  • If you expect your data to be pretty far down inside the graph, depth-first might find it earlier, as you are going down into the deeper parts of the graph very fast.
  • Conversely, if you expect your data to be pretty far up in the graph, breadth-first might give the result earlier.

Is there anything I have missed or does it mostly come down to personal preference?

malexmave

Posted 2012-03-13T10:05:58.093

Reputation: 460

Answers

38

I'd like to quote an answer from Stack Overflow by hstoerr which covers the problem nicely:

That heavily depends on the structure of the search tree and the number and location of solutions.
  • If you know a solution is not far from the root of the tree, a breadth-first search (BFS) might be better.
  • If the tree is very deep and solutions are rare, depth-first search (DFS) might rootle around forever, but BFS could be faster.
  • If the tree is very wide, a BFS might need too much memory, so it might be completely impractical.
  • If solutions are frequent but located deep in the tree, BFS could be impractical.
  • If the search tree is very deep, you will need to restrict the search depth for DFS anyway (for example with iterative deepening).

But these are just rules of thumb; you'll probably need to experiment.

Rafał Dowgird also remarks:

Some algorithms depend on particular properties of DFS (or BFS) to work. For example the Hopcroft and Tarjan algorithm for finding 2-connected components takes advantage of the fact that each already visited node encountered by DFS is on the path from root to the currently explored node.

Gigili

Posted 2012-03-13T10:05:58.093

Reputation: 1 333

I cannot understand why this answer has 27 upvotes and it is exactly the merging of 2 other answers, which by the way are simply general thoughts about... – nbro – 2015-05-31T15:06:33.863

33

One point that's important in our multicore world: BFS is much easier to parallelize. This is intuitively reasonable (send off threads for each child) and can be proven to be so as well. So if you have a scenario where you can make use of parallelism, then BFS is the way to go.
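To make the "send off threads for each child" idea concrete, here is a minimal Python sketch of a level-by-level BFS in which every vertex of the current frontier is expanded by a thread pool. The adjacency-dict representation and the name `parallel_bfs_levels` are assumptions for illustration only. In CPython, threads will not speed up plain dictionary lookups, so this only shows where the independent work sits, not a tuned implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_bfs_levels(graph, source, workers=4):
    """Return the BFS levels of `graph` (dict: vertex -> list of neighbours)."""
    visited = {source}
    frontier = [source]
    levels = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier:
            levels.append(frontier)
            # Expanding one frontier vertex does not depend on the others,
            # so the whole level can be handed to the pool at once.
            neighbour_lists = pool.map(lambda v: graph[v], frontier)
            next_frontier = []
            # The merge is done sequentially so `visited` needs no locking.
            for neighbours in neighbour_lists:
                for w in neighbours:
                    if w not in visited:
                        visited.add(w)
                        next_frontier.append(w)
            frontier = next_frontier
    return levels

example = {"a": ["b", "c"], "b": ["d"], "c": ["d", "e"], "d": [], "e": []}
print(parallel_bfs_levels(example, "a"))
```

DFS has no such natural barrier: the next vertex to visit depends on the outcome of the previous descent, which is why it is harder to split up.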

Suresh

Posted 2012-03-13T10:05:58.093

Reputation: 4 051

If DFS is otherwise advantageous in the given setting, you can apply BFS until you have spawned enough threads and continue with DFS. More specifically, you can do DFS and whenever you descend and there are not enough threads, spawn one for the next sibling. – Raphael – 2012-03-13T18:16:03.377

This answer does not deserve 20 upvotes. The question is about the general use of the 2 algorithms and not about a particular use. – nbro – 2015-05-31T15:10:30.650

25

(I made this a community wiki. Please feel free to edit.)

If

  • $b$ is the branching factor
  • $d$ is the depth where the solution is
  • $h$ is the height of the tree (so, $d\le h$)

Then

  • DFS takes $O(b^h)$ time and $O(h)$ space
  • BFS takes $O(b^d)$ time and $O(b^d)$ space
  • IDDFS takes $O(b^d)$ time and $O(d)$ space

Reasons to choose

  • DFS
    • must see whole tree anyway
    • you know $d$, the level of the answer
    • don't care if the answer is closest to root
  • BFS
    • answer is close to the root
    • you want the answer that is closest to the root
    • have multiple cores/processors
  • IDDFS
    • you want BFS but don't have enough memory, and a somewhat slower search is acceptable

IDDFS = iterative deepening depth-first search
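For readers who have not seen IDDFS, the following is a minimal Python sketch for trees. The callbacks `is_goal` and `children` and the cap `max_depth` are hypothetical names chosen for this illustration. It reruns a depth-limited DFS with limits 0, 1, 2, ..., so it only ever stores the current path (like DFS) but still returns a shallowest goal (like BFS).

```python
def depth_limited(node, is_goal, children, limit, path=()):
    """DFS from `node` that never descends more than `limit` edges."""
    path = path + (node,)
    if is_goal(node):
        return path
    if limit == 0:
        return None
    for child in children(node):
        found = depth_limited(child, is_goal, children, limit - 1, path)
        if found is not None:
            return found
    return None

def iddfs(root, is_goal, children, max_depth):
    """Try depth limits 0..max_depth; return the path to a shallowest goal."""
    for limit in range(max_depth + 1):
        found = depth_limited(root, is_goal, children, limit)
        if found is not None:
            return found
    return None

# Small example tree given as a dict from node to list of children.
TREE = {"a": ["b", "c"], "b": ["d", "e"], "c": ["f"],
        "d": [], "e": [], "f": ["g"], "g": []}
print(iddfs("a", lambda n: n == "f", lambda n: TREE[n], max_depth=5))
```

The shallow levels are revisited on every iteration, which is the "somewhat slower" part; with branching factor $b > 1$ the deepest level dominates, so the total stays $O(b^d)$.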

rgrig

Posted 2012-03-13T10:05:58.093

Reputation: 1 006

This is an excellent answer. I notice though that while the question asks about graphs, this answer refers to trees. A tree is a graph of course, and it may be the word can be replaced...but how about h, the "height of the tree". Does that translate directly to the "height of the graph"? – user2023370 – 2016-04-26T09:26:15.293

Another reason to use IDDFS is that, depending on how you want to use it, after each iteration you can have a possible answer (if you're searching for, say, a maximum or a minimum). This means that you can quit the algorithm early if your answer is "good enough" or you can quit on user input (like, a chess engine using IDDFS to find an optimal solution but being interrupted by a player moving a piece). – jedd.ahyoung – 2017-11-11T19:08:58.770

One other point that can be added is that DFS uses a stack whereas BFS uses a queue. – Karthik Balaguru – 2017-11-14T21:16:44.360

16

One scenario (other than finding the shortest path, which has already been mentioned) where you may have to choose one over the other to get a correct program would be infinite graphs:

If we consider for example a tree where each node has a finite number of children, but the height of the tree is infinite, DFS might never find the node you're looking for - it will just keep visiting the first child of every node it sees, so if the one you're looking for isn't the first child of its parent, it will never get there. BFS however is guaranteed to find it in finite time.

Similarly if we consider a tree where each node has an infinite number of children, but the tree has a finite height, BFS might not terminate. It will only visit the children of the root node and if the node you're looking for is the child of some other node, it won't be reached. In this case DFS is guaranteed to find it in finite time.
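The first scenario (finite branching, infinite height) can be tried out with a lazily generated tree. The sketch below is in Python and assumes a hypothetical infinite binary tree in which node `n` has the children `2n` and `2n + 1`; the helper `found_within` simply cuts the traversal off after a fixed number of visited nodes, since the DFS would otherwise never stop.

```python
from collections import deque
from itertools import islice

def children(n):
    # Hypothetical infinite binary tree: node n has children 2n and 2n + 1.
    return (2 * n, 2 * n + 1)

def bfs_order(root):
    """Yield nodes in breadth-first order (never terminates on its own)."""
    queue = deque([root])
    while queue:
        node = queue.popleft()
        yield node
        queue.extend(children(node))

def dfs_order(root):
    """Yield nodes in depth-first order (never terminates on its own)."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        # Push in reverse so that the first child is visited first.
        stack.extend(reversed(children(node)))

def found_within(order, target, budget):
    """True if `target` is among the first `budget` nodes of `order`."""
    return target in islice(order, budget)

print(found_within(bfs_order(1), 5, 10))    # BFS reaches 5 as its fifth node
print(found_within(dfs_order(1), 5, 1000))  # DFS only ever sees 1, 2, 4, 8, ...
```

The second scenario (infinite branching, finite height) is not covered by this sketch.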

sepp2k

Posted 2012-03-13T10:05:58.093

Reputation: 1 448

5

It is noteworthy that both yield only semi-decision algorithms for infinite graphs; you cannot decide in finite time that an element is not in the tree (obviously). As for practical applications, note that (conceptually) infinite data structures can be defined (see paragraph 3.4)!

– Raphael – 2012-03-13T13:42:22.690

14

Breadth-first and depth-first certainly have the same worst-case behaviour (the desired node is the last one found). I suspect this is also true for the average case if you don't have information about your graphs.

One nice bonus of breadth-first search is that it finds shortest paths (in the sense of fewest edges) which may or may not be of interest.
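A minimal Python sketch of this property, assuming an unweighted graph stored as a dict from vertex to neighbour list (the function name `bfs_shortest_path` is made up for this example): because BFS discovers vertices in order of increasing distance, following the recorded parent pointers back from the target yields a path with the fewest edges.

```python
from collections import deque

def bfs_shortest_path(graph, source, target):
    """Return a fewest-edges path from source to target, or None."""
    parent = {source: None}          # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent pointers back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None

example = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
print(bfs_shortest_path(example, "a", "e"))
```

DFS would also find some path from `a` to `e`, but gives no guarantee that it is a shortest one.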

If your average node rank (number of neighbours) is high relative to the number of nodes (i.e. the graph is dense), breadth-first will have huge queues while depth-first will have small stacks. In sparse graphs, the situation is reversed. Therefore, if memory is a limiting factor the shape of the graph at hand may have to inform your choice of search strategy.

Raphael

Posted 2012-03-13T10:05:58.093

Reputation: 54 413

The length of the queue in BFS and the height of the stack in DFS depend very much on the implementation. If, in the case of DFS, you always push the whole neighbourhood onto the stack, then it grows a lot, especially when the graph is dense. Pushing only a reference that tells where to continue when DFS returns from the recursion saves a lot of space. – uli – 2012-03-13T16:36:30.583
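The difference described in this comment can be measured with a small Python sketch. Both function names and the adjacency-dict representation are assumptions for illustration. `dfs_push_all` pushes every neighbour onto the stack, while `dfs_iterators` keeps one neighbour iterator per vertex on the current path, which plays the role of the "reference that tells where to continue".

```python
def dfs_push_all(graph, source):
    """DFS that pushes all neighbours; returns (visit order, peak stack size)."""
    visited, order, stack, peak = set(), [], [source], 1
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        stack.extend(reversed(graph[node]))
        peak = max(peak, len(stack))
    return order, peak

def dfs_iterators(graph, source):
    """DFS whose stack holds one neighbour iterator per vertex on the path."""
    visited, order = {source}, [source]
    stack, peak = [iter(graph[source])], 1
    while stack:
        for neighbour in stack[-1]:      # resumes where this iterator stopped
            if neighbour not in visited:
                visited.add(neighbour)
                order.append(neighbour)
                stack.append(iter(graph[neighbour]))
                peak = max(peak, len(stack))
                break
        else:
            stack.pop()                  # all neighbours of the top vertex done
    return order, peak

# Dense example: the complete graph on 6 vertices.
complete = {v: [u for u in range(6) if u != v] for v in range(6)}
print(dfs_push_all(complete, 0))
print(dfs_iterators(complete, 0))
```

Both variants visit the vertices in the same order; on the dense example the iterator version never holds more entries than there are vertices, whereas the push-all version holds more.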

3

All of the above is correct, but it's noteworthy that BFS and DFS each create their own tree, based on the order in which they traverse the graph. Each of those trees has its own properties, which can be useful for certain kinds of problems.

For example, in an undirected graph, all the edges of the original graph which are not in the BFS tree are cross edges: edges between two branches of the BFS tree. All the edges of the original graph which are not in the DFS tree are back edges: edges which connect two vertices in the same branch of the DFS tree. Such properties can be useful for problems such as special colorings, etc.

MMS

Posted 2012-03-13T10:05:58.093

Reputation: 347

1

DFS and BFS trees both have their own unique properties that can give you more useful information about the graph. For example, with a single DFS you can do the following:

  • Find bridges and articulation points (for undirected graphs)
  • Cycle detection
  • Find strongly connected components (Tarjan's algorithm)

With BFS, you can find the shortest paths (in number of edges) between the source node and every other node in the graph.

The Graph Algorithms chapter in CLRS sums up these properties of DFS and BFS very nicely.
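As one example from the list above, here is a minimal Python sketch of cycle detection with a single DFS. It assumes a directed graph stored as a dict in which every vertex appears as a key; the function name `has_cycle` is chosen for this illustration. A cycle exists exactly when the DFS meets an edge to a vertex that is still on the current path (a "grey" vertex in the CLRS colouring).

```python
WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current DFS path, finished

def has_cycle(graph):
    """True if the directed graph (dict: vertex -> successors) has a cycle."""
    colour = {v: WHITE for v in graph}

    def visit(v):
        colour[v] = GREY
        for w in graph[v]:
            if colour[w] == GREY:          # edge back into the current path
                return True
            if colour[w] == WHITE and visit(w):
                return True
        colour[v] = BLACK
        return False

    return any(colour[v] == WHITE and visit(v) for v in graph)

print(has_cycle({"a": ["b"], "b": ["c"], "c": ["a"]}))       # True
print(has_cycle({"a": ["b", "c"], "b": ["c"], "c": []}))     # False
```

The recursion is kept for brevity; very deep graphs would need an explicit stack to stay within Python's recursion limit.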

Pinch

Posted 2012-03-13T10:05:58.093

Reputation: 109

-2

I think it would be interesting to write both of them in such a way that switching only a few lines of code gives you one algorithm or the other, so that you will see that your dilemma is not as strong as it seems to be at first.
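A minimal Python sketch of that idea, assuming the graph is a dict from vertex to neighbour list (the function `search` and its `depth_first` flag are made up for this illustration): the only difference between the two traversals is the end of the container from which the next vertex is taken.

```python
from collections import deque

def search(graph, source, depth_first=False):
    """Return the visit order; a queue gives BFS, a stack gives DFS."""
    frontier = deque([source])
    visited = set()
    order = []
    while frontier:
        # The single line that distinguishes the two algorithms:
        node = frontier.pop() if depth_first else frontier.popleft()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        frontier.extend(n for n in graph[node] if n not in visited)
    return order

example = {"a": ["b", "c"], "b": ["d"], "c": ["e"], "d": [], "e": []}
print(search(example, "a"))                    # ['a', 'b', 'c', 'd', 'e']
print(search(example, "a", depth_first=True))  # ['a', 'c', 'e', 'b', 'd']
```

Note that the stack variant explores the last-listed neighbour first; it is still a depth-first order, just not the one a recursive implementation would produce.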

I personally like the interpretation of BFS as flooding a landscape: the low-altitude areas are flooded first, and only then do the high-altitude areas follow. If you imagine the landscape altitudes as isolines, as we see in geography books, it's easy to see that BFS fills all the area under the same isoline at the same time, just as it would happen physically. Thus, interpreting altitudes as distance or scaled cost gives a pretty intuitive idea of the algorithm.

With this in mind, you can adapt the idea behind breadth-first search to find minimum spanning trees and shortest paths, and to build many other minimization algorithms.

I haven't seen any intuitive interpretation of DFS yet (only the standard one about the maze, but it isn't as powerful as the flooding interpretation of BFS), so to me BFS seems to correlate better with physical phenomena as described above, while DFS correlates better with decision dilemmas in rational systems (i.e. people or computers deciding which move to make in a chess game, or finding the way out of a maze).

So, for me the difference between them lies in which natural phenomenon best matches their propagation model (traversal) in real life.

user9589

Posted 2012-03-13T10:05:58.093

Reputation: 97

Welcome to the site! However, I don't really see how this answers the question. It seems to be your general feelings and intuitions about BFS and DFS but the question isn't asking about feelings and intuitions: it's asking about advantages and disadvantages. Your answer doesn't seem to address that at all. – David Richerby – 2016-08-03T00:11:20.377

The part most linked to the question is about adapting the algorithm to find minimum spanning trees, shortest paths and so on, and the point that some natural phenomena are reproducible by BFS, while decision trees are reproducible by DFS – user9589 – 2016-08-03T07:45:04.480

The question isn't asking what's related to BFS and DFS. It's not asking about finding spanning trees or shortest paths or how to "reproduce natural phenomena". – David Richerby – 2016-08-03T08:05:58.750

It's asking for advantages. If one can model a physical phenomenon while the other cannot, it's an advantage if you need to model this phenomenon. I think you are sticking to the standard concepts of algorithms textbooks when interpreting the word 'advantage', while I am not – user9589 – 2016-08-03T08:47:14.820