Foundation

Prerequisites

A self-contained refresher on every concept assumed before Chapter 1 — sorting, graphs, spanning trees, heaps, Union-Find, recurrences, probability, and NP-hardness. Each topic pairs the key idea with an interactive visualisation.

1 · Asymptotic Notation

We measure algorithmic cost as a function of input size $n$, keeping only the dominant term and discarding constant factors. Three notations cover upper bound, lower bound, and tight bound.

Big-O — upper bound

$f(n) = O(g(n))$ iff there exist $c > 0, n_0$ such that $f(n) \le c \cdot g(n)$ for all $n \ge n_0$.

Read: "f grows no faster than g."

Big-Ω — lower bound

$f(n) = \Omega(g(n))$ iff $f(n) \ge c \cdot g(n)$ for some $c, n_0$ and all $n \ge n_0$.

Read: "f grows at least as fast as g."

Θ — tight bound

$f(n) = \Theta(g(n))$ iff $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$. The algorithm's growth rate is exactly $g$ up to constants. This is what we aim to state when possible.

Growth Rate Hierarchy

O(1)
O(log n)
O(n)
O(n log n)
O(n²)
O(2ⁿ)
ClassNameExampleops at n=1 000
O(1)ConstantArray indexing1
O(log n)LogarithmicBinary search~10
O(n)LinearLinear scan1 000
O(n log n)LinearithmicMerge sort~10 000
O(n²)QuadraticInsertion sort1 000 000
O(2ⁿ)ExponentialAll subsets10³⁰⁰
Why This Matters

The course constantly upgrades brute-force O(n²) solutions to O(n log n) via clever divide-and-conquer or greedy strategies. The difference at n = 10⁶ is roughly 50 000×.

2 · Sorting Algorithms

Two sorting algorithms appear repeatedly as primitives or analogies in the course: Merge Sort (worst-case O(n log n), Ch 3 basis) and Quicksort (expected O(n log n), Ch 6 Nuts & Bolts analogy).

Merge Sort

Algorithm Sketch

MergeSort(A, lo, hi): if lo ≥ hi, return. mid = ⌊(lo+hi)/2⌋. Recurse on left half, recurse on right half. Merge the two sorted halves in O(n) using two pointers.

Recurrence: $T(n) = 2T(n/2) + \Theta(n) = \Theta(n \log n)$.

Merge Sort — Step-Through Visualiser

Array: [38, 27, 43, 3, 9, 82, 10]

Press Step to begin. Orange = being compared, Green = merged/sorted.

Quicksort

Pick a pivot, partition into ≤ pivot and > pivot, recurse on each side. In-place and cache-friendly. With a random pivot, E[comparisons] = O(n log n) via the harmonic series — the same argument used in the Nuts & Bolts (Ch 6) analysis.

AlgorithmBestAverageWorstSpaceStable?
Merge SortO(n log n)O(n log n)O(n log n)O(n)Yes
QuicksortO(n log n)O(n log n)O(n²)O(log n)No
Heap SortO(n log n)O(n log n)O(n log n)O(1)No
Insertion SortO(n)O(n²)O(n²)O(1)Yes

3 · Graph Theory

A graph $G = (V, E)$ is a set of vertices $V$ and edges $E$. Graphs appear in every chapter: bipartite graphs in Ch 1, weighted graphs for MST in Ch 2, and so on.

Undirected Graph

Edges are unordered pairs $\{u,v\}$. If $(u,v) \in E$ then $(v,u) \in E$. Used for: MST, matching, clustering.

Directed Graph (Digraph)

Edges are ordered pairs $(u,v)$. Direction matters. Used for: network flow, precedence constraints, DAGs.

Key Definitions

TermDefinitionUsed in Chapter
BipartiteV splits into L, R so every edge crosses between themCh 1 (Stable Matching)
ConnectedEvery pair of vertices has a path between themCh 2 (clustering)
TreeConnected + acyclic; equivalently |E| = |V| − 1Ch 2, 3
Spanning TreeA tree subgraph touching every vertex of GCh 2 (MST)
Weighted GraphEach edge $e$ carries a weight $w(e) \in \mathbb{R}$Ch 2, 5

BFS & DFS

Graph Traversal — BFS vs DFS
Select BFS or DFS above, then press Step.
BFS — Breadth-First Search

Uses a queue. Explores all neighbours at distance $k$ before any at $k+1$. Finds shortest paths in unweighted graphs. Time: $O(V + E)$.

DFS — Depth-First Search

Uses a stack (or recursion). Goes as deep as possible before backtracking. Useful for cycle detection, topological sort, SCC. Time: $O(V + E)$.

4 · Priority Queues & Heaps

A priority queue supports insert and extract-min (or max) efficiently. The standard implementation is a binary min-heap: a complete binary tree where every node's key ≤ its children's keys.

Array Representation

Store heap in array A[1..n] (1-indexed). Parent of $i$ is $\lfloor i/2 \rfloor$; children of $i$ are $2i$ and $2i+1$. The minimum is always at A[1].

OperationTimeMethod
Insert$O(\log n)$Append, then sift-up (bubble up)
Extract-Min$O(\log n)$Swap root with last, remove, sift-down
Peek-Min$O(1)$Return root without removing
Build-Heap$O(n)$Heapify from $\lfloor n/2 \rfloor$ down to 1
Decrease-Key$O(\log n)$Update value, sift-up
Min-Heap Simulator — insert & extract
Heap is empty. Insert some values.
Course Connection

Ch 2 uses a max-heap to select the job with the latest deadline (EDF scheduling) and in the max-cardinality greedy. Without a heap you'd pay O(n) per selection → O(n²) total; the heap brings it to O(n log n).

5 · Minimum Spanning Trees

Given a connected weighted undirected graph, a minimum spanning tree (MST) is a spanning tree with minimum total edge weight. The course uses Kruskal's algorithm as the basis for k-clustering (Ch 2).

Cut Property

For any cut (S, V∖S) of the graph, the minimum-weight edge crossing the cut belongs to some MST. This is the greedy correctness argument for both Kruskal's and Prim's.

Kruskal's Algorithm

Algorithm

Sort all edges by weight. Initialise a Union-Find over V. For each edge (u,v) in order: if Find(u) ≠ Find(v), add (u,v) to MST and Union(u,v). Stop when |V|−1 edges are added.

Time: $O(E \log E)$ — dominated by sorting. Union-Find operations are nearly O(1) each.

Kruskal's MST — Step-Through
Press Step to process edges in weight order.
k-Clustering Connection (Ch 2)

To find k clusters, run Kruskal's but stop after $|V| - k$ merges (i.e., leave $k-1$ edges unprocessed). The resulting connected components are the clusters, and the spacing between them is the weight of the next edge Kruskal's would have added — which is maximised by this procedure.

6 · Union-Find (Disjoint Set Union)

Union-Find maintains a partition of $n$ elements into disjoint sets. Two operations: Find(x) returns the set representative of $x$; Union(x, y) merges $x$'s and $y$'s sets.

Path Compression

On Find(x), make every node on the path to the root point directly to the root. Flattens the tree, making future finds faster.

Union by Rank

Always attach the shallower tree under the deeper one. Keeps tree height $O(\log n)$ even without path compression.

Theorem

With both optimisations, $m$ operations on $n$ elements take $O(m \cdot \alpha(n))$ time total, where $\alpha$ is the inverse Ackermann function — effectively $O(1)$ per operation for any practical $n$.

Union-Find Simulator — 8 Elements
Union and
8 singleton sets. Union elements to merge components.

7 · Recurrences

Divide-and-conquer algorithms produce recurrences. Solving them gives the runtime. Two main tools: recursion trees and the Master Theorem.

Recursion Tree

Method

At depth $d$: there are $a^d$ subproblems each of size $n/b^d$, doing $f(n/b^d)$ work. Sum across all $\log_b n$ levels. If work per level is constant → multiply by height.

Worked Example: Merge Sort T(n) = 2T(n/2) + n

Level 0: 1 × n = n. Level 1: 2 × n/2 = n. Level 2: 4 × n/4 = n. … Level log₂n: n × 1 = n.

Each of the $\log n + 1$ levels does exactly $n$ work → $T(n) = \Theta(n \log n)$.

Master Theorem

For $T(n) = aT(n/b) + f(n)$ where $a \ge 1, b > 1$, let $c^* = \log_b a$.

Three Cases

Case 1 (leaves dominate): $f(n) = O(n^{c^* - \varepsilon})$ for some $\varepsilon > 0$ → $T(n) = \Theta(n^{c^*})$.

Case 2 (balanced): $f(n) = \Theta(n^{c^*})$ → $T(n) = \Theta(n^{c^*} \log n)$.

Case 3 (root dominates): $f(n) = \Omega(n^{c^* + \varepsilon})$ and $af(n/b) \le c \cdot f(n)$ → $T(n) = \Theta(f(n))$.

Recurrencea, bc* = log_b af(n)CaseResult
Merge Sort: 2T(n/2)+n2, 212$\Theta(n \log n)$
Binary Search: T(n/2)+11, 20n⁰=12$\Theta(\log n)$
Karatsuba: 3T(n/2)+n3, 21.5851$\Theta(n^{1.585})$
Strassen: 7T(n/2)+n²7, 22.8071$\Theta(n^{2.807})$
Edge Case: What if f(n) = n log n with a=2, b=2?

$c^* = 1$ and $f(n) = n \log n$. We need $f(n) = \Theta(n^{c^*}) = \Theta(n)$, but $n \log n \ne \Theta(n)$. The standard Master Theorem doesn't directly apply. This is the "extended case 2": $f(n) = \Theta(n^{c^*} \log^k n)$ with $k=1$, giving $T(n) = \Theta(n \log^2 n)$. The course notes mention this as an edge case.

8 · Probability Basics

Chapter 6 (Randomized Algorithms) requires a handful of probability concepts. This is everything you need.

Sample Space & Events

Ω = set of all outcomes. An event $A \subseteq \Omega$. For uniform distributions: $P(A) = |A| / |\Omega|$.

Complement & Union

$P(\bar{A}) = 1 - P(A)$. $P(A \cup B) = P(A) + P(B) - P(A \cap B)$. If disjoint: $P(A \cup B) = P(A) + P(B)$.

Independence

Events $A$ and $B$ are independent iff $P(A \cap B) = P(A) \cdot P(B)$. For $k$ independent events each with $P(A_i) = 1/2$: $P(\text{all } A_i) = (1/2)^k$. This is exactly why Freivalds' error after $k$ rounds is $(1/2)^k$.

Expected Value & Linearity

$E[X] = \sum_x x \cdot P(X = x)$. Linearity of expectation: $E[X + Y] = E[X] + E[Y]$, even if $X$ and $Y$ are not independent. This is the most-used tool in randomised algorithm analysis — it lets us sum expected costs per element without worrying about dependencies.

Error Probability under Independent Repetition

If one test has error probability $p$, after $k$ independent repetitions the combined error drops to $p^k$.

Harmonic Series — Key for Quicksort / Nuts & Bolts

Harmonic Series

$H_n = 1 + \tfrac{1}{2} + \tfrac{1}{3} + \cdots + \tfrac{1}{n} = \Theta(\ln n)$.

In Quicksort analysis: the probability that elements $i$ and $j$ ($j > i$) are ever compared equals $\frac{2}{j-i+1}$. Summing over all pairs via linearity of expectation gives $\Theta(n \log n)$ total comparisons.

9 · NP-Hardness

Chapter 5 (Approximation Algorithms) deals with NP-hard optimisation problems. We can't solve them exactly in polynomial time (unless P = NP), so we find efficient approximate solutions with guaranteed quality.

P

Decision problems solvable in worst-case polynomial time $O(n^k)$. Examples: sorting, shortest path, MST, bipartite matching.

NP

Decision problems where a "yes" certificate can be verified in polynomial time. Every P problem is in NP. Whether P = NP is the biggest open problem in CS.

NP-Complete & NP-Hard

A problem is NP-hard if every NP problem reduces to it in polynomial time — it is at least as hard as any NP problem. If it is also in NP, it is NP-complete. No poly-time algorithm is known for any NP-complete problem.

P Sorting, MST, Shortest Path NP-Complete Knapsack, 3-SAT, TSP, Load Balance NP-Hard (not in NP) Halting Problem Is P = NP? (Unknown — million-dollar question)

Course NP-Hard Problems

ProblemChapterApproachGuarantee
Load Balancing (Makespan)Ch 5LPT greedy$\le \tfrac{3}{2} \text{OPT}$
k-Center (Facility Location)Ch 5Farthest-first greedy$\le 2 \cdot \text{OPT}$
0/1 KnapsackCh 4, 5DP (pseudo-poly), FPTAS$\ge \tfrac{\text{OPT}}{1+\varepsilon}$
Why Approximation?

If $P \ne NP$ (widely believed), NP-hard problems have no poly-time exact algorithm. An $\alpha$-approximation algorithm runs in polynomial time and always returns a solution within factor $\alpha$ of OPT. The ratio $\alpha$ is the approximation guarantee.

What is a polynomial-time reduction?

Problem A reduces to Problem B if, given a solver for B, we can solve A using that solver plus polynomial preprocessing and postprocessing. If A is known to be NP-hard and A reduces to B in polynomial time, then B is NP-hard too. This is how new problems are proved hard — reduce a known NP-complete problem (e.g. 3-SAT) to the new one.

What is "pseudopolynomial" time?

An algorithm is pseudopolynomial if its runtime is polynomial in the numeric value of the input, but not in the bit length. Knapsack DP runs in $O(nW)$ — polynomial in $n$ and $W$, but $W$ can be exponential in the number of bits needed to represent it (e.g. $W = 2^{100}$). This is why Knapsack is NP-hard yet has an efficient DP for small $W$.