Finding the critical path
Soura Bhattacharyya | April 10, 2023
Last updated
Soura Bhattacharyya | April 10, 2023
Last updated
One of the key analytical measures of a project is its critical path, or CP—the longest path from start to finish. Tasks on the CP have to be closely monitored; delays in the CP delay the entire project. The CP itself has to be monitored—complex projects have dynamic CPs.
We have used several off-the-shelf project management tools, both with and without CP calculation. Those that did have it, did not perform to our satisfaction. We were left wondering—what's under the hood?
Recently, we developed an in-house project management platform, Feeta. One of its features is to calculate CP. It performed to our satisfaction, and handled tricky real-world scenarios with aplomb.
At its heart lies the Bellman-Ford (BF) algorithm.
This post presents our understanding of the BF algorithm. If you want a more thorough analytical grounding, please check out the resources mentioned at the end this post. That is where we started.
This post introduces basic terms used in graph theory, and works through an example step-by-step, so that the iterative nature of the BF algorithm can be more clearly understood. Finally, we provide pseudocode and python code from other internet resources.
We start with a project that has tasks with durations and dependencies. There are multiple paths from start s
to finish f
. The longest path, or the critical path, is {s,k,h,i,m,n,f}
, totalling 15 days.
Graph theory can be applied to analyze such a diagram—each task is treated as a vertex, and each dependency is an edge.
Projects can be thought of weighted, directed, acyclic graphs, commonly abbreviated as weighted DAGs. Weights represent any numeric property of an edge; here, we use task duration. Directed graphs are unidirectional—we move only from predecessors to successors, from s
to a
; never the other way. And acyclic graphs do not have cycles, or loops.
To analyze projects, we assign weights equal to the task duration to each of the edges, so that path length converts to project duration. To find the CP, we need an algorithm that finds the longest path.
The Bellman-Ford algorithm searches for the shortest path in a graph. But the critical path is the longest. Therefore, we invert the signs of all weights, so that a task of weight (duration) 2
days is represented as -2
. Consequently, the path with the highest negative value is selected as the shortest: the critical path.
Before we find the longest path, we have to assign weights to edges. If we assign the duration of task v
to the edge {v,w}
that connects to the successor w
, figure 1 transforms into figure 2.
Task a
is 2 days long, and is followed by task b
. Thus, we set edge length C(a,b)
equal to the duration of task a
. We could also calculate C(a,b)
by the difference in start dates:
startDiff = startDate(a)-startDate(b)
While convenient, the start-date-difference method is flawed, because it includes any slack between a
and b
as a part of the path length. This defeats the purpose of finding the longest path.
However, startDiff is a useful parameter to optimize our calculations. Concretely,
If b is a task, then C(a,b) = min (startDiff(a,b), Duration(a))
If b is a milestone, then C(a,b) = Duration(a)
This formula accommodates cases where a task finishes on the same day as its successor. In such cases, the duration is one day, but startDiff
is zero. It also accommodates milestones that are achieved as soon as a task ends, including the project end milestone.
In the example above, only path is the critical path. It is 1 + 4 + 1 = 6 days long, including weekends/ holidays.
Holidays have been intentionally included. Once we calculate the project duration—7 calendar days—we can compare it to the critical path length of 6 calendar days, and infer that the critical path has one day of slack. As we can see, there is a day of slack between the third and fourth tasks.
Before we move to the algorithm, let us formalize notation:
n
= number of vertices. In the context of a gantt, this is the number of tasks.
m
= number of edges. In the context of a gantt, this is the number of dependencies.
s
= source vertex, f
= destination vertex (i.e. the starting and finishing tasks)
v
= any vertex
e
= any edge
L(i,v)
= minimum length of a path from s to v, where i represents the number of hops.
i ∈ {0, 1, 2, … n-1}
, because the longest possible path is one that touches all the vertices once.
C(v,w)
= path length between two vertices connected by an edge, v and w
Bellman-ford is an iterative algorithm, which examines all available next steps as it "walks" through the entire graph. It consists of two loops.
We iterate through an outer loop a maximum of n-1 times, because an acyclic path can, at most, pass through each vertex once (if it passed through the same vertex twice, it would become cyclic, i.e. it would have a loop).
A path passing through all n
vertices has n-1
segments, or hops. In our example, n=14
, hence there are a maximum of 13 outer loop iterations. However, because our graphs are sparse—most vertices have one or two inbound edges—we will need fewer that 13 iterations.
Within each outer loop, we examine all available next steps, and “take the next step” wherever we find that it is a shorter path to a vertex. “Taking the next step” consists of 2 actions—updating the distance to the vertex, and adding the new vertex to the array that stores the path.
Hence, we execute the following inner loop:
i=0
means that there are zero segments, i.e. no hops—hence we can only get from s to s.
Therefore, L(0,s) = 0
, representing a path of zero length from the starting vertex to itself.
Also, we set L(0,v) = ∞
for all other vertices, because we cannot reach any of them in zero hops. Operationally, we can use a sufficiently large number instead of infinity, as long as it is guaranteed to always be larger than any task’s duration (in days).
Thus, the distance array for the zeroth iteration is:
Previously computed vertices in red, vertices computed in this iteration in yellow, and unchanged vertices in blue. Paths and path lengths:
Distance array:
Paths and path lengths:
Paths and path lengths:
Paths and path lengths:
Paths and path lengths:
Paths and path lengths:
Thus, the algorithm finds five paths from start to finish. The path with the greatest negative value, skhimnf
, is the longest path, or the critical path.
Here is the distance array through all the iterations.
iter
s
a
b
c
d
e
k
g
h
i
j
m
n
f
0
0
∞
∞
∞
∞
∞
∞
∞
∞
∞
∞
∞
∞
∞
1
0
0
∞
∞
∞
∞
0
∞
∞
∞
∞
∞
∞
∞
2
0
0
-2
∞
∞
∞
0
-1
-1
∞
∞
∞
∞
∞
3
0
0
-2
-5
-2
-5
0
-1
-1
-2
∞
∞
∞
∞
4
0
0
-2
-5
-7
-8
0
-1
-1
-2
-5
-5
∞
-7
5
0
0
-2
-5
-7
-8
0
-1
-1
-2
-5
-5
-12
-7
6
0
0
-2
-5
-7
-8
0
-1
-1
-2
-5
-5
-12
-15
The algorithm terminates in iteration 6 because all paths have reached the finish point.
At the end of iteration 6, we also have all paths available as a set of arrays.
Source: Wikipedia
A python implementation is available in Programiz.com; it does not print the path. You can try out this code on Replit, with the following inputs:
The output is:
I use a free Replit account, so no guarantees that the workspace will always be available.
Algorithms: Design and Analysis, Part 2 | edX > The Bellman-Ford Algorithm: This is a Stanford undergraduate course, available for free on edX.
Algorithms, Part II | Coursera > Shortest Paths: This is a Princeton undergraduate course, available for free on Coursera. It includes guidance for java implementation.