One of the most important algorithms of the modern era is
graph distribution. At some level, the reason companies like UPS and Amazon are
successful boil down to how well they can solve distribution problems. Sure,
the problem and algorithm we are about to discuss is a much simpler formulation
than companies work with, but the basic principles are the same.

The graph distribution problem is simple. Given a graph (in our case, a tree), which node should you place your distribution center? And how long would it take to distribute something to the entire network? (Note: our algorithm can extend easily to graphs if you first find a minimum spanning tree).

Well for our purposes, let’s assume each node in our tree can do one of two things each timestep. It can either receive a package from one of its neighbors, or it can send packages to a group of its neighbors. Imagine each node is a factory. Each day it can either receive a large shipment of goods, or it can send out a fleet of trucks.

The following animation shows this process with a randomly selected node as the distribution center. Notice how it takes seven timestamps to reach the whole tree.

So how would we go about finding the best node to pick for our distribution center? From first glance, we probably want to pick one of those three nodes in the center, but we need to do so algorithmically. Maybe we could use network metrics to help us, and they would on average do quite well, but we can do better with the help of a pretty simple data structure.

It is no accident that we spent the last couple of sections reviewing priority queues. We are going to store our nodes in one of these in order to properly walk through the tree, starting with the leaves, and making our way to the distribution center. By effectively using our data structure, we can make this algorithm linear!

Pseudocode:

The following animation shows this process in detail.

And that’s it! We can walk through our tree once, and the last node marked is our distribution center!