HANS BODLAENDER
Maximum Flow On an undirected graph, we can define two problems: Euler path, in which we attempt to find a path in which each edge is taken exactly once, and Euler tour, where we additionally
have to end up at the starting vertex. Theorem: A connected graph has an Euler tour, iff every vertex in the graph has even degree. In that case, the Euler tour can be found by:
1. From any starting vertex v build up any tour using unvisited edges until returning to v . It is not possible to get stuck because all vertices have even degree.
2. While there exist unvisited edges, start from any vertex u on the tour with unvisited edges, and start another trail from there until returning to u.
3. Merge the two trails. If there are still unvisited edges, repeat from step 2.
Corollary: A connected graph has an Euler path, iff the graph has either 0 or 2 vertices of odd degree. In the case of 2 vertices with odd degree, you start and stop at those vertices.
Maximum flow problem: on a directed acyclic graph G = (V, E) with source s ∈ V , sink t ∈ V , a capacity c(e) > 0 for each edge e ∈ E , and a flow f (e) < c(e) for each edge (with flow in
equalling flow out, except for s en t), find maximum flow value, which is the total flow out of s. Can be solved using Ford-Fulkerson (or others): Compute residual network Gf (V, Ef ) with capacity
cf (u, v) = c(u, v) − f (u, v) and no flow. Gf can have capacity > 0 which has capacity 0 in the original graph (flows in opposite directions cancel out). While there is a path p with every edge
having remaining capacity (cf (u, v) > 0∀(u, v) ∈ P ), find the minimum of these capacities cf (p) = min{cf (u, v) : (u, v) ∈ p} and push cf (p) extra flow through that path (in the original graph
of course): for each edge (u, v) ∈ p, f (u, v) = f (u, v) + cf (p) and f (v, u) = f (v, u) − cf (p).
s-t-cut of G(V, E) is a partitioning of all vertices, such that all vertices are either in the set S containing also s, or in T containing also t. The set of edges that connect S to T is the cut-set, and the
sum of all edge capacities in the cut-set is the capacity of the corresponding s-t-cut. Theorem: the minimum capacity over all possible s-t-cuts equals the maximum flow on that graph. Applications:
• Finding smallest cuts: Given (un)directed G, s, and t, what is the smallest number of edges to be removed such that there is no path from s to t? Give every edge a capacity of 1, run
Ford-Fulkerson and get network at full capacity, build residual network, and then (according to the internet) the minimum cut consist of all edges from vertices that can be reached from s in
that residual graph to vertices that cannot be reached from s.
• Finding small vertex separators: given two vertices in a graph, what is the smallest number of vertices that, if removed, make it impossible for the two to reach each other. Replace every vertex v
(except the two) by two vertices vi n and vout with one edge between vin and vout , run Ford-Fulkerson, find smallest cut. There is always the possibility that the algorithm suggests removing
some of the ”real” edges that were originally in the graph (going to vin or going from vout ). In that case, you can just instead remove the edge between vin and vout (the corresponding fake
edge), which would in turn be the same as removing the vertex v .
We can use solutions to this ”regular” problem to solve more complicated variations:
• Multiple sources and multiple sinks: Add one supersource and one supersink which connect to all sources and sinks respectively, and solve this new graph. We can add lower bounds and capacities
to both the edges from the supersource to the ”subsources” and edges from the ”subsinks” to the super sink in order to control their amounts and model e.g. demand.
• A graph where each edge additionally has a lower flow bound l(e):
1. Find admissible flow (aka transshipment) f , which is basically any valid flow, adhering to flow conservation and lower and upper capacity constraints:
0
(a) Add the edge (t, s), with a sufficiently large capacity, to obtain G (in which there is no source and sink anymore)
0
(b) Find admissible circulation (a flow without beginning/source or end/sink) in G :
0 0 0 0 00
i. Add new s and new t , change every edge capacity c(u, v) = c(u, v) − l(u, v), add edges and capacities c(u, t ) = c(s , v) = l(e) to obtain G (no lower bounds anymore)
00 00
ii. Compute maximum flow f on G (with any algorithm, e.g. Ford-Fulkerson)
0 0 0
iii. Iff all edges from s (and hence all edges to t ) use their full capacity, we have admissible circulation on G . Else break the whole thing off.
0 0 0 0 00
iv. Translate back to admissible circulation in G by ignoring s and t and calculating f (u, v) = f (u, v) + l(u, v) (adding the original lower bounds back)
(c) Translate back to admissible flow f in G by ignoring edge (t, s)
0
2. Compute maximum flow f from f in residual network Gf (with capacities cf (u, v) = c(u, v) − f (u, v) and cf (v, u) = f (u, v) − l(u, v) and no flow):
0
3. f + f is now the maximum flow from s to t satisfying lower and upper constraints
Matrix rounding problem can be solved by getting maximum flow on the following graph: from s there go arrows to vertices representing the rows, and all these row vertices are fully-connected with
column vertices, which are in turn all connected to t. The capacities and lower bounds of all the edges between rows and sums are the corresponding values in the matrix rounded up and down
respectively. Maximum flow will then give correct integer roundings for the row and columns sums and their edges from s and to t respectively.
Preflow push algorithm, a faster alternative to Ford-Fulkerson. Maintains preflow: flow which hasn’t reached t yet. Skew symmetry: f (u, v) = −f (v, u). The excess flow of u is e(u) = f (V, u), where
f (V, u) is the sum of all u’s incoming flow (including outgoing flow as negative flow). Vertex u has height h(u). h(s) = n and h(t) = 0 (these don’t change) and an edge cannot drop more than 1
in height: ∀(u, v) ∈ Ef : h(u) ≤ h(v) + 1. Initialize h(v) = 0 for all h 6= s and for all edges (s, u) do f (s, u) = c(s, u) and f (u, s) = −c(s, u) and e(u) = c(s, u) (s pompt zijn leidingen vol).
Then perform the following two operations until neither are possible:
• Push: if e(u) > 0, cf (u, v) > 0 and h(u) = h(v) + 1 (only push to vertices 1 lower), then volpomp: r = min{e(u), cf (u, v)}, f (u, v)+ = r , f (v, u) = −f (u, v), e(u)− = r , e(v)+ = r .
• Lift: when no push possible from overflowing vertex and e(u) > 0 and ∀(u, v) ∈ Ef : h(u) ≤ h(v), set h(u) = 1 + min{h(v) | (u, v) ∈ Ef } (raise so we can push to lowest connected vertex)
Lemmas: • If there is an overflowing vertex (6= t), then a lift or push operation is possible • The height of a vertex never decreases • When a lift operation is done, the height increases by at least
1 • h remains a height function during the algorithm • There is no path in Gf from s to t. Proof: the height drops by at most 1 across each of the at most n − 1 edges of any path • When the
algorithm terminates, the preflow is a maximum flow from s to t. Proof: f is a flow, as no vertex except t has excess. As Gf has no path from s to t, f is a maximum flow. Time analysis (it’s
O(n2 m) time): If u overflows then there is a simple path from u to s in Gf (flow must arrive from s to u: reverse of such flow gives the path). For all u: h(u) < 2n (h(s) remains n. When vertex
is lifted, it has excess, hence path to s, with at most n − 1 edges, each allowing a step in height of at most one up.) Therefore, each vertex is lifted less than 2n times. Number of lift operations is
2
less than 2n . Now count separately the number of...
• Saturating pushes (sends cf (u, v) across (u, v)): After saturating push across (u, v), edge (u, v) disappears from Gf . Before next push across (u, v), it must be created by push across (v, u).
Push across (v, u) means that a lift of v must happen. At most 2n lifts per vertex: O(n) sat. pushes across edge. O(nm) saturating pushes
P 2
• Non-saturating pushes (sends e[u] < cf (u, v)): we use Φ = e(v)>0 h(v). Initially Φ = 0. Φ increases by lifts in total at most 2n . Φ increases by saturating pushes at most by 2n per push,
2 2
in total O(n m). Φ decreases at least one by a non-saturating push across (u, v) (because after push, u does not overflow, v may overflow, but h(u) > h(v)). P At most O(n m) pushes.
Minimum Cost Flow Edges have capacity c(u, v) as before and cost cost(u, v): cost that must be paid per unit of flow that goes through the edge. Cost of flow f: (u,v)∈E f (u, v) ∗ cost(u, v).
Minimum Cost Flow problem: Given network G, c, cost, s, t, and a target flow value r , find a flow from s to t with value r , with minimum cost. Some edges may have unbounded capacities: • a negative
cost cycle can be found using shortest path algorithms (Johan) • if there isn’t one, use simple transformation to bounded capacities, e.g. each unbounded capacity is sum of all bounded capacities.
• If we have a directed cycle c, then we can see it as a circulation: c(e) = 1 when e is on c and c(e) = 0 otherwise.
• If we have a path p from s to t, we can see it as a flow with value 1: p(e) = 1 when e is on p and p(e) = 0 otherwise.
Theorem: Let f be a circulation. Then f is a non-negative linear combination of cycles in G. Proof: ignore lower bounds. Holds if f (v, w) = 0 everywhere. Otherwise, there is a directed cycle c in
G with each edge e on c: f (e) > 0. Let r be the minimum flow of an edge on c: r = min{f (e) | e on c}. Use induction: the theorem also holds for f − cr (f − cr(e) = f (e) − r when e on c).
Corollary: Let f be a flow from s to t. f is a non-negative linear combination of cycles in G and paths from s to t. Proof: add edge from t to s with flow equal to value(f ). Now, we have a
circulation. Apply previous theorem. Cycles that use (t, s) correspond to a path from s to t. QED
Define the residual network Gf of a flow f : capacities as for maximum flow algorithms, but now additionally if f (u, v) > 0, then costf (u, v) = cost(u, v), and costf (v, u) = −cost(u, v) (you
get money back if you cancel flow. This means you can now have multiple edges (from b to a) with different costs).
• Cycle cancelling algorithm (find some flow, then make it cheaper): Make a feasible flow f in the network. While Gf has a negative cycle do
1. Find a negative cycle C in Gf (I.e. the sum of the residual costs in the cycle is negative. Find using one of Johan’s algorithms.)
2. Let D be the minimum residual capacity cf of an edge on C . Add D units of flow to each edge on C : this is a new feasible flow of smaller cost
0
Theorem: a flow f has minimum cost, iff Gf has no negative cycle. Proof: if Gf has negative cycle, then we can improve f to one with smaller cost. Suppose f is a flow, and f is an optimal
0
flow. f − f is a circulation in Gf (linear combination of cycles) and if f is not optimal, then the total cost of these cycles is negative: there is a negative cycle in this set, a cycle in Gf .
No guarantee of polynomial time. Corollary: if all capacities and target flow value are integral, then there is an optimal integer minimum cost flow (cycle cancelling then finds an integer flow).
Variant: using always the minimum mean cost cycle gives a polynomial time algorithm: cycle cancelling algorithm but finds always the minimum mean cost (divide total cycle cost by the number
2 3 2
of edges) cycle (instead of any cycle) and use that (O(nm) time to find the cycle, O(n m log n) algorithm).
Cycle cancelling algorithm also works with lower bounds: First find a feasible flow. Remaining algorithm the same, with the residual network defined as for flows with lower bounds.
• Successive shortest paths (Ford-Fulkerson, except we augment along the shortest/cheapest path): start with flow f with f (u, v) = 0 for all u, v . Repeat until value(f ) = r : 1) Find shortest
path P in Gf from s to t 2) Let q be the minimum residual capacity of an edge on P 3) Send min(q, r − value(f )) additional flow across P Possible exponential time. Only correct if G has
no cycle with negative cost. If G has a negative cycle, the algorithm may give an incorrect answer, e.g. when we have target flow r = 0.
Theorem: If G has no negative cycle, then the successive shortest paths algorithm gives the minimum cost flow. Follows from: Claim: The algorithm maintains the following invariant: f has
0
minimum cost among all flows with value value(f ). Proof: suppose we obtain f from f by sending flow over a path P . From invariant: f has minimum cost for flows with value equal to
00 0 00
value(f). Suppose f is a minimum cost flow with same value as f . f − f is a flow in Gf . Hence, it can be written as a linear combination of cycles in Gf and paths from s to t in Gf . All
cycles in Gf have cost at least 0: otherwise f is not of minimum cost. All paths in this combination have cost at least the cost of P : this is how the algorithm works (P is shortest path). It
00 0 00 0 0
follows that cost(f − f ) ≥ cost(f − f ). So, cost(f ) ≥ cost(f ) which show minimality of cost of f .
Matching Set of edges M ⊆ E in undirected G such that no vertex is endpoint of more than one edge in M . Maximal matching: no e ∈ / M with M ∪ {e} also a matching (easy, greedy algorithm).
Maximum matching: matching with M as large as possible (polynomial time, not easy, though bipartite graphs make it easier). Perfect matching: M = n/2; each vertex endpoint of edge in M (special
case of maximum matching). Variant: each edge has cost; look for perfect matching with minimum cost (also polynomial time, but harder).
Finding maximum matching in bipartite graphs: model as flow problem (connect new s to vertices in one partition, new t to all in other partition, make edges directed, capacities 1), and solve it,
make sure algorithm finds integral flow (can also model with min-cost flow problem for above variant). Another variant: b-matchings in bipartite graph: function b : V → N , look for set of edges M ,
with each v endpoint of exactly b(v) edges in M . M-augmenting path: a path with an odd number of edges that alternates between edges in and not in M , with the outer two edges not in M . When
”switching” all edged, we have one edge more in M .
Theorem: Each regular (all vertices same degree) bipartite graph has a perfect matching. Proof: construct flow model of G. Set flow of edges from s, or to t, to 1, and other edges flow to 1/d, with
d being vertex degree. This flow has value n/2, which is optimal. Ford-Fulkerson will find integral flow of value n/2; which corresponds to perfect matching.
Theorem: Let M be a matching (set of edges) in graph G (doesn’t have to be bipartite). M is a maximum matching, iff there is no M -augmenting path. Proof: • If there is an M -augmenting path,
then M is not a maximum matching. • Suppose M is not a maximum matching. Let N be a larger matching. Look at N ⊕ M = (N ∪ M ) − (N ∩ M ). Every vertex in N ⊕ M has degree 0, 1, 2
and none is connected to two edges both in M or both in N : collection of paths and cycles, which alternatingly have edge from N and from M . There must be a path in N ⊕ M with more edges
from N than from M : this is an M -augmenting path. M -alternating walk : (possibly not simple) path with edges alternating in M , and not in M . M -flower : M -alternating walk of odd length that
2
starts in an unmatched vertex, and ends as an odd cycle, a blossom. Algorithm of Edmonds, finds maximum matching in a (possibly non-bipartite) graph G in polynomial time, O(n m):
Start with some matching M , and find either M -augmenting path or M -blossom:
1.
(a) Let X be the set of unmatched vertices (b) Let Y be the set of vertices adjacent to some vertex in X Finding an M -alternating walk
(c) Build digraph D = (V, A) with {(u, v) | there is an x with (u, x) ∈ E − M and (x, v) ∈ M }
(d) Find a shortest path P from a vertex in X to a vertex in Y in D . If none is found, M is maximum
0 0
(e) Transform P to a path P in G (f) Take P ”: P , followed by an edge to X . P ” is M -alternating walk between two unmatched vertices.
(g) Look at the path P ” we just found. Two cases: Finding M -augmenting path or M -flower
(h) P ” is a simple path: it is an M -augmenting path
(i) P ” is not simple. Look to start of P ” until the first time a vertex is visited for the second time. This is an M -flower: Cycle-part of walk cannot be of even size, as it then could be removed
and we would have a shorter walk in D .
2. If we find an M -augmenting path: augment M , and obtain matching of one larger size; repeat algorithm.
3. If we find an M -blossom, we shrink it, and obtain an equivalent smaller problem; recurse. If G/B has no M/B -augmenting path, then M is maximum matching. Otherwise, expand M/B -
augmenting path to an M -augmenting path.
Let B be a set of vertices – a blossom – in G. G/B is the graph, obtained from G by contracting B to a single vertex. Theorem: Let B be an M -blossom. Then M has an augmenting path
iff M/B has an augmenting path in G/B .Proof: suppose G/B has an M/B -augmenting path P . If P does not traverse the vertex representing B : P also M -augmenting path in G. Else if P
traverses B : lift to M -augmenting path in G (case analyses show this).
Theorem: each regular bipartite graph has a perfect matching. Schrijver’s algorithm finds such a perfect matching quickly. Edge coloring model: Theorem: Let G be a bipartite graph with maximum
degree d. Then G has an edge coloring with d colors:
1. Make G regular by adding vertices and edges: suppose G has vertex sets V1 and V2 with edges only between V1 and V2 (usually called: color classes). If |V 1| > |V 2| then add |V 1| − |V 2|
isolated vertices to |V 2|. If |V2 | > |V1 | then add |V2 | − |V1 | isolated vertices to |V1 |. While not every vertex in V1 ∪ V2 has degree d:
1) Find a vertex v in V1 of degree less than d (must exist) 2) Find a vertex w in V2 of degree less than d (must exist) 3) Add the edge (v, w) to the graph
0 0 0
2. Repeatedly find a matching and remove it. For i = 1 to d do: 1) Find a perfect matching M in G . 2) Give all edges in M color i. 3) Remove all edges in M from G . (G stays regular!)
0
3. Take the edge coloring c of G . Color G in the same way: G is subgraph of G.
Fixed Parameter Complexity Often, size of problem is gigantic, but in many applications, some aspect/number may be assumed to be small: a parameter. Time of algorithm can be exponential in
this small number, but should be polynomial in usual size of problem. (Standard) parameterized graph problem: Given: graph G, integer k (the parameter), does G have a ??? of size at least/most k?
Examples of parameterized problems:
• Graph coloring: given graph G, integer (parameter) k, is there a vertex coloring of G with k colors such that no two vertices of the same color are connected, i.e. c : V → {1, ..., k} with for
n
all (v, w) ∈ E : c(v) 6= c(w)? NP-Complete starting from k = 3: O(k )
k+1
• Clique: Given graph G, integer (parameter) k, is there a clique in G of size at least k? O(n ) time with simple algorithm; O(n2.23k/3 ) with complicated one. Requires Ω(nf (k) )
k
• Vertex cover: given graph G, integer (parameter) k, is there a vertex cover (set of vertices connected to all edges) of G of size at most k? O(2 (n + m)). This is the only FPT problem.