Min Cost Flows: Kruskal, MooreBellmann, Floyd, Dijkstra Algorithms

JavaTools provides several network flow optimization functions from the realm of combinatorial optimization. At a glance:

◆ JDijkstra[] to compute the set of all shortest paths from a source node to all other nodes (non-negative arc costs)

◆ JFloyd[] to compute the optimal all-pairs shortest path solution (arbitrary arc costs, including negative)

◆ JMooreBellmann[] to compute all shortest paths from a source node to all other nodes (arbitrary arc costs, including negative)

◆ JKruskal[] to compute the minimum spanning tree of a connected undirected network

All of these methods far outperform their Mathematica equivalents; they are so much faster that they allow for the solution of problem sizes that the built-in functions cannot cope with.

JDijkstra[]

The function JDijkstra[] computes the set of all shortest paths from a source node s to all other nodes in a directed network with non-negative arc costs. Internally, JDijkstra[] uses the Java collections framework as well as the Google Collections (now part of Google Guava), because sets of nodes and arcs and their dependencies can be represented naturally and efficiently with these data structures.

Although the Dijkstra algorithm is not the fastest known algorithm for determining all shortest paths from a source node to all other nodes in a network, the JavaTools implementation with the above-mentioned collections is MUCH faster than the Dijkstra implementation in the Mathematica Combinatorica package: the latter is written entirely in symbolic Mathematica code and does not execute in a compiled environment like the Java virtual machine. Particularly on large problems with hundreds of nodes and arcs the performance difference is substantial. In fact, the large examples at the end of this page could not be solved with Mathematica's Dijkstra implementation at all.

The inputs to the function JDijkstra[] are a sparse array defining the arc costs for every node pair connected by an arc, and an integer indicating which node is the source node. Sparse arrays are particularly memory-efficient on large graphs in which most node pairs are not connected by an arc.

The following example uses the Dijkstra algorithm on a very simple problem with 6 nodes and 6 arcs:

This computes all shortest paths from source node 1:

This is to be interpreted as follows: the first list contains the predecessor nodes, the second list the shortest-path distances from the source node to every other node. The predecessor list tells us that in the optimal solution the predecessor of node 1 (the source node) is Null (meaning it has none), the predecessor of node 2 is node 1 (the source node), the predecessor of node 3 is node 1, the predecessor of node 4 is node 2, and so forth. The second list tells us that in the optimal solution the shortest-path distance for node 1 is 0, for node 2 it is 1, for node 3 it is 2, for node 4 it is 4, and so forth.
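The predecessor/distance output described above can be sketched in plain Python. This is an illustrative sketch of the classic Dijkstra algorithm, not the JavaTools/Java implementation, and the 6-node network below is hypothetical, not the data set used in this example:

```python
import heapq

def dijkstra(n, arcs, source):
    """Shortest paths from source to all nodes; non-negative arc costs.
    arcs: dict mapping (i, j) -> cost, nodes numbered 1..n.
    Returns (predecessors, distances); the predecessor of the source is None."""
    adj = {i: [] for i in range(1, n + 1)}
    for (i, j), c in arcs.items():
        adj[i].append((j, c))
    dist = {i: float("inf") for i in range(1, n + 1)}
    pred = {i: None for i in range(1, n + 1)}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry, node already settled with a shorter path
        for v, c in adj[u]:
            if d + c < dist[v]:
                dist[v] = d + c
                pred[v] = u
                heapq.heappush(heap, (dist[v], v))
    return pred, dist

# Hypothetical 6-node, 8-arc network (NOT the data set used on this page):
arcs = {(1, 2): 1.0, (1, 3): 2.0, (2, 4): 3.0, (3, 4): 2.5,
        (3, 5): 4.0, (4, 6): 3.5, (5, 6): 1.5, (2, 5): 6.0}
pred, dist = dijkstra(6, arcs, 1)
```

As in the JavaTools output, the result is a pair of lists: predecessors (None for the source) and shortest-path distances.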

Note how fast the optimal solution is computed with JDijkstra[] from the JavaTools:

The arc costs matrix, in which the cost for the arc (i,j) is shown in the element in row i and column j, looks as follows (0 means the nodes have no connecting arc):

We can see that the arc between nodes 1 and 2 is included in the shortest paths to nodes 2 and 4. Therefore, if we increase its value (= cost), that is reflected in the costs of the optimal solution; we can also see that due to its high cost, arc (1,2) is NOT included in the optimal solution for node 6. While the cost for the latter was previously 7.4, going "over" arc (1,2), it is now 9.3, going "over" arcs (1,3), (3,5), and (5,6).

Here is a slightly more complicated example:

JDijkstra[] is very fast. It finishes in less than 2 milliseconds.

The following example reads a file from the MatrixMarket example data, applies Abs to ensure that all arc costs are positive, and subtracts the main diagonal to ensure that no node has an arc to itself. Then we compute all shortest paths with node 1 as the source node. JDijkstra[] finishes in 6 seconds. Dijkstra[] in the Mathematica Combinatorica package cannot solve this problem at all due to its size.

The runtime complexity of the Dijkstra algorithm is O(n²) in a straightforward implementation (O(m log n) with a heap-based priority queue).

For sparse networks with only non-negative arc costs the D'Esopo-Pape variant of the Moore-Bellmann algorithm is more efficient (O(min(nmC, m·2^n)), thus possibly linear in practice) than the Dijkstra algorithm; see JMooreBellmann[] below. While JDijkstra[] computed the above result in 6 seconds, JMooreBellmann[] computes it in 78 milliseconds, due to the extremely efficient double-ended queue (deque) used internally.

JFloyd[]

The JavaTools function JFloyd[] computes the optimal all-pairs shortest path solution, permitting negative arc costs as well. "All-pairs" means that, unlike the Dijkstra algorithm, the Floyd algorithm computes the shortest paths to all nodes for ALL nodes as source nodes. The output is the so-called "shortest paths" matrix, which contains the optimal predecessor nodes for every row (where row index n represents node n as source node) and in which Infinity indicates that there is no path from that source node to the target node (column index), as well as the matrix with the corresponding distances. Using the example data from above we get:

JFloyd[] finishes in less than 2 milliseconds:

Note that the last example for JDijkstra[] is just a special case of the output of the Floyd algorithm (first row).
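The core of the Floyd algorithm is a triple loop over an n × n distance matrix, updating every pair (i, j) with every intermediate node k. The following is a minimal Python sketch (again, not the JavaTools implementation), reusing a hypothetical network:

```python
import math

def floyd(n, arcs):
    """All-pairs shortest paths (Floyd-Warshall). Negative arc costs are
    allowed as long as the network contains no negative cycle.
    arcs: dict mapping (i, j) -> cost, nodes numbered 1..n.
    Returns (pred, dist) as (n+1) x (n+1) matrices (row/column 0 unused);
    math.inf in dist means there is no directed path."""
    INF = math.inf
    dist = [[0.0 if i == j else INF for j in range(n + 1)] for i in range(n + 1)]
    pred = [[None] * (n + 1) for _ in range(n + 1)]
    for (i, j), c in arcs.items():
        dist[i][j] = c
        pred[i][j] = i
    for k in range(1, n + 1):          # allow node k as an intermediate node
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    pred[i][j] = pred[k][j]
    return pred, dist

# Hypothetical 6-node network (NOT the data set used on this page):
arcs = {(1, 2): 1.0, (1, 3): 2.0, (2, 4): 3.0, (3, 4): 2.5,
        (3, 5): 4.0, (4, 6): 3.5, (5, 6): 1.5, (2, 5): 6.0}
pred, dist = floyd(6, arcs)
```

Row 1 of dist then corresponds to the single-source Dijkstra result for source node 1, and rows with no outgoing path retain Infinity.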

Using the MatrixMarket example from above, JFloyd[] computes the all-pairs optimal solution, and we can compare the first row with the output of the Dijkstra algorithm, whose values must be identical:

The values attained by the two solutions have to be the same (though not necessarily the solutions themselves). Three cases contain Infinity (meaning that no directed path exists between the two nodes), and wherever one solution has Infinity, so does the other. In all other cases the difference between the two values is 0:
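The consistency check described here can be sketched in plain Python (a hypothetical helper, not part of JavaTools): two distance vectors agree if they have Infinity in the same places and identical finite values everywhere else.

```python
import math

def same_values(d1, d2):
    """Check two shortest-path distance vectors for identical values,
    treating math.inf ('no path') as matching only math.inf."""
    return all((math.isinf(a) and math.isinf(b)) or a == b
               for a, b in zip(d1, d2))

# Hypothetical distance vectors from two different algorithms:
d_dijkstra = [0.0, 1.0, 2.0, math.inf]
d_floyd    = [0.0, 1.0, 2.0, math.inf]
ok = same_values(d_dijkstra, d_floyd)
```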

The runtime is less than 9 seconds:

The runtime complexity of the Floyd algorithm is O(n³), and its data storage requirement is O(n²). It is therefore much more efficient to use the Moore-Bellmann or Dijkstra algorithm instead of the Floyd algorithm if the optimal paths for only one source node have to be computed.

The author is inviting competing implementations in Mathematica or Java or both for large problems at info@lauschkeconsulting.net to challenge the presented performance claim.

JMooreBellmann[]

The JavaTools function JMooreBellmann[] finds the shortest paths from a source node to all other nodes for arbitrary arc costs, using the D'Esopo-Pape variant, which is the most efficient method known today for practical problems (its theoretical worst-case complexity is exponential). The D'Esopo-Pape variant is also sometimes known as the "deque" variant ("double-ended queue"), as a deque is used internally for updating nodes. Unlike the Dijkstra algorithm, the Moore-Bellmann algorithm works correctly even for arbitrary (negative) arc costs. However, for the Moore-Bellmann algorithm to work correctly, it is imperative that the network does not contain any *cycles* of negative total length (otherwise it would be possible to "walk" around such a cycle an infinite number of times and decrease the total path length without bound).
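The deque rule can be sketched as follows (an illustrative Python sketch of the D'Esopo-Pape variant, not the JavaTools implementation): when a node's label improves, it goes to the FRONT of the deque if it has been scanned before, otherwise to the BACK. The small network below, with one negative arc cost and no negative cycle, is hypothetical:

```python
from collections import deque

def moore_bellmann_pape(n, arcs, source):
    """Shortest paths from source using the D'Esopo-Pape (deque) rule.
    Arc costs may be negative, but the network must not contain a
    negative cycle. arcs: dict (i, j) -> cost, nodes numbered 1..n.
    Returns (pred, dist)."""
    adj = {i: [] for i in range(1, n + 1)}
    for (i, j), c in arcs.items():
        adj[i].append((j, c))
    dist = {i: float("inf") for i in range(1, n + 1)}
    pred = {i: None for i in range(1, n + 1)}
    # state: 0 = never queued, 1 = currently in deque, 2 = scanned and removed
    state = {i: 0 for i in range(1, n + 1)}
    dist[source] = 0.0
    dq = deque([source])
    state[source] = 1
    while dq:
        u = dq.popleft()
        state[u] = 2
        for v, c in adj[u]:
            if dist[u] + c < dist[v]:
                dist[v] = dist[u] + c
                pred[v] = u
                if state[v] == 0:      # first improvement: back of the deque
                    dq.append(v)
                elif state[v] == 2:    # already scanned once: front of the deque
                    dq.appendleft(v)
                state[v] = 1
    return pred, dist

# Hypothetical 6-node network with one negative arc (3,4), no negative cycle:
arcs = {(1, 2): 1.0, (1, 3): 2.0, (2, 4): 3.0, (3, 4): -1.0,
        (3, 5): 4.0, (4, 6): 3.5, (5, 6): 1.5}
pred, dist = moore_bellmann_pape(6, arcs, 1)
```

Note how the negative arc (3,4) makes the path over node 3 cheaper than the path over node 2, which a Dijkstra implementation would not handle correctly.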

Using the example from above:

This is the same output as was computed with the Dijkstra algorithm, and it matches the first line of the output of the Floyd algorithm.

Its running time is very fast: less than 2 milliseconds for this example:

In the following example we use a few negative arc costs, without introducing a negative cycle:

This is the same output as was computed with the Dijkstra algorithm, and it matches the first line of the output of the Floyd algorithm.

For dense networks with only non-negative arc costs the Dijkstra algorithm is more efficient (O(n²)); for sparse networks with only non-negative arc costs the D'Esopo-Pape variant of the Moore-Bellmann algorithm (implemented in JMooreBellmann[]) is more efficient (O(min(nmC, m·2^n)), thus possibly linear in practice).

Using the MatrixMarket example from above, JMooreBellmann[] computes the result in 78 milliseconds.

Here as well we get different solutions, but identical values of the solution:

However, the solutions themselves returned by the two algorithms are NOT necessarily the same (the solutions of the shortest path problem, and of the minimum spanning tree, see below, are not unique). In this example, out of the 1074 paths, 1038 are the same and 36 are different:

This means that in 36 cases different paths were chosen; both, however, attain the same optimum value.

JKruskal[]

JavaTools also provides a very efficient implementation of the Kruskal algorithm to compute the minimum spanning tree of a connected undirected network. The "modified Kruskal algorithm" is invoked with JKruskal[] in the JavaTools, and it operates on sparse arrays or on lists of nodes and costs. Here we use the sparse array version:

This means the arcs (1,2), (2,4), (3,4), and (3,5) form the minimum cost spanning tree.
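The Kruskal algorithm itself is simple to sketch in Python (again illustrative, not the JavaTools implementation): sort the edges by cost and keep every edge that does not close a cycle, tracked with a union-find structure. The 5-node edge list below uses hypothetical costs:

```python
def kruskal(n, edges):
    """Minimum spanning tree of a connected undirected network (Kruskal):
    sort edges by cost, keep each edge that does not close a cycle,
    using union-find with path halving. edges: list of (cost, i, j)
    triples, nodes numbered 1..n. Returns the list of (i, j) tree edges."""
    parent = list(range(n + 1))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for c, i, j in sorted(edges):
        ri, rj = find(i), find(j)
        if ri != rj:               # edge joins two components: keep it
            parent[ri] = rj
            tree.append((i, j))
            if len(tree) == n - 1:  # a spanning tree of n nodes has n-1 edges
                break
    return tree

# Hypothetical 5-node undirected network, edges as (cost, i, j):
edges = [(1.0, 1, 2), (1.5, 3, 4), (2.0, 2, 4),
         (2.5, 3, 5), (3.0, 1, 3), (4.0, 4, 5)]
tree = kruskal(5, edges)
```

With these hypothetical costs the resulting tree consists of the four edges (1,2), (3,4), (2,4), and (3,5).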

Note the speed: JKruskal[] finishes in roughly a millisecond:

More Examples

The following example uses a slight modification of an example from the Mathematica help browser. We compute a minimum spanning tree (treating arcs as undirected edges whose costs are their 2-dimensional Euclidean lengths) using the modified Kruskal algorithm, the shortest paths from source node 1 to all other nodes using the Dijkstra and Moore-Bellmann algorithms, and the all-pairs shortest path solution using the Floyd algorithm, and show related timing results.

Note that the Kruskal algorithm and all its variants compute EXACT optimal solutions; they are not heuristics that merely compute a "good", not necessarily optimal, solution. The solution is not necessarily unique, but it is guaranteed to be optimal. In other words, other solutions are possible, but no better ones; other solutions are at best just as good. However, for "sufficiently random" inputs of type Real, such as the coordinates of cities (see below), the solution is usually unique, i.e., it is the only optimal one: it is mathematically next to impossible for random real-valued inputs to produce problems whose optimal solutions are not unique. For the next few graph-theoretic examples other optimal solutions are readily apparent, but for the city data examples below it is virtually impossible that other optimal solutions exist.

To compute the minimum spanning tree solution using edge lengths as edge costs we have to extract the edge lengths from the graph:
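This extraction step can be sketched in Python (a hypothetical helper with made-up coordinates, not the actual notebook code): each undirected edge is assigned the 2-dimensional Euclidean distance between its endpoints as its cost.

```python
import math

def edge_costs(coords, edges):
    """Cost of each undirected edge = 2-D Euclidean distance between its
    endpoints. coords: dict node -> (x, y); edges: list of (i, j) pairs.
    Returns a list of (cost, i, j) triples, e.g. for Kruskal-style input."""
    return [(math.dist(coords[i], coords[j]), i, j) for i, j in edges]

# Hypothetical coordinates and edges (NOT the help-browser example data):
coords = {1: (0.0, 0.0), 2: (3.0, 4.0), 3: (0.0, 1.0)}
edges = [(1, 2), (1, 3)]
costs = edge_costs(coords, edges)
```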

Here we compute the minimum spanning tree solution (in practically 0 time):

This is what the minimum spanning tree solution looks like, overlaid on the whole graph:

The shortest paths from the source node 1 to all others, using the Dijkstra algorithm:

The shortest paths from the source node 1 to all others, using the D'Esopo Pape variant of the Moore-Bellmann algorithm:

The all-pairs shortest paths solution using the Floyd algorithm (again, note that the first row is identical to the Dijkstra and Moore-Bellmann solutions):

The following example uses the Petersen graph as a starting point, but with double-edges removed:

Another example (a modification of an example from the Mathematica help browser):

More Large Examples

In the following section we show two more large-scale examples. Neither problem could be solved with Mathematica's built-in functions.

Both examples use an underlying graph that is not planar, i.e., some edges cross other edges. This makes the problems particularly difficult to visualize, and in the solution plots you can see that optimal edges are hidden underneath other edges that are not included in the optimal solution.

The following is also an example from the Mathematica help browser. As in the previous examples we compute the minimum spanning tree with the modified Kruskal algorithm (again assuming all edge costs to be their 2-dimensional Euclidean lengths), and the shortest paths from the source node 1 to all other nodes with the Dijkstra and Moore-Bellmann algorithms, as well as the all-pairs shortest path solutions using the Floyd algorithm, and show related timing results and solution overlays.

JavaTools computes the minimum spanning tree solution in a mere 157 milliseconds:

The following is also an example from the Mathematica help browser. The matrix is related to a structure from NASA's Langley Research Center. As in the previous examples we compute the minimum spanning tree with the modified Kruskal algorithm (again assuming all edge costs to be their 2-dimensional Euclidean lengths), and the shortest paths from the source node 1 to all other nodes with the Dijkstra and Moore-Bellmann algorithms, as well as the all-pairs shortest path solutions using the Floyd algorithm, and show related timing results and overlays.

JavaTools computes the minimum spanning tree solution in 2/3 of a second:

The last Dijkstra and Moore-Bellmann timings show just how much the Moore-Bellmann algorithm outperforms the Dijkstra algorithm on large sparse problems, which can also be shown theoretically.

The author of JVMTools is inviting competing implementations in Mathematica or Java or both for large problems at info@lauschkeconsulting.net to challenge the presented performance claim.