Chapter 2 · Greedy Algorithms

01 What is a Greedy Algorithm?

As the notes say: "It is hard, if not impossible, to define precisely what is meant by a greedy algorithm." But here's the intuition:

Core Idea

Greedy Algorithm

An algorithm is greedy if it builds a solution in small steps, choosing a decision at each step myopically to optimize some underlying criterion — without looking ahead or reconsidering past choices.

🎯

Myopic Choice

At each step, pick what looks best right now. Never look back, never reconsider.

🔍

Local → Global

When it works, local optimality implies global optimality. This is the interesting structural insight.

⚠️

Proof Required

"It's easy to invent greedy algorithms — proving they work is the challenge."

Chapter roadmap

We study three problems in this chapter: (1) minimize maximum lateness when scheduling all tasks, (2) maximize the number of tasks completed on time, and (3) find the k-clustering with maximum spacing. Each requires a different greedy insight and a different proof technique.

02 Scheduling with Minimal Lateness

Setting: A single resource (processor, machine, person) handles n tasks one at a time. Every task must be run — but some may finish after their deadline, incurring lateness. We want to minimize the worst lateness.

Problem Setup

Minimize Maximum Lateness

Each task aᵢ has a processing time tᵢ and a deadline dᵢ. We schedule all tasks in non-overlapping intervals starting at time 0. If task i finishes at time fᵢ, its lateness is:

\[ l_i = \max(0,\ f_i - d_i) \]

Goal: find an ordering that minimizes L = max_i lᵢ (the maximum lateness across all tasks).

The Greedy Rule: Earliest Deadline First (EDF)

Forget about processing times entirely. Sort tasks by deadline and run them in that order — left to right, no gaps.

Algorithm — EDF

Earliest Deadline First

Sort tasks so that d₁ ≤ d₂ ≤ … ≤ dₙ
Schedule task 1 first, then task 2, then … no idle time
Compute the lateness of each task: lᵢ = max(0, fᵢ − dᵢ)

The algorithm runs in O(n log n) time (dominated by the sort).

Worked Example

Three tasks with the following parameters:

Task	Processing Time (t)	Deadline (d)
A	2	3
B	3	5
C	4	8

EDF order: A → B → C (sorted by deadline 3, 5, 8). Use the simulator below to explore what happens with different orderings.

03 Interactive EDF Scheduler

Compare orderings and see how maximum lateness changes. EDF is highlighted in green — try the others to understand why it wins.

📅 Gantt Chart Scheduler Interactive

Choose ordering:

⚡

EDF is optimal

This ordering (A→B→C sorted by deadline) gives maximum lateness = 1. Every other ordering produces lateness ≥ 1. Try all 6 permutations above!

04 Proof — The Exchange Argument

How do we prove EDF is optimal? We use an exchange argument — the same "bubble sort" style reasoning used throughout algorithm analysis.

Theorem

EDF produces an optimal schedule (minimum maximum lateness).

Full Proof — Bubble Sort Argument

Let 𝒫 denote the EDF schedule. Suppose there is an optimal schedule 𝒪 ≠ 𝒫. Both can be assumed to have no idle time (idle time never helps). Since 𝒪 ≠ 𝒫, there exist two consecutively scheduled tasks aᵢ and aⱼ in 𝒪 where i > j (i.e., aᵢ appears before aⱼ but has a later deadline: dᵢ ≥ dⱼ).

Let 𝒪' be the schedule obtained by swapping aᵢ and aⱼ. We claim max lateness doesn't increase:

• All other tasks are unaffected.
• Task aⱼ finishes earlier in 𝒪' → its lateness ↓ or stays 0.
• Task aᵢ finishes at time fⱼ (the old finish time of aⱼ) in 𝒪'. If it's not late, great. If it is late: l'ᵢ = fⱼ − dᵢ ≤ fⱼ − dⱼ = lⱼ (since dᵢ ≥ dⱼ). So its lateness is at most the lateness of aⱼ in the original schedule 𝒪.

Therefore swapping never increases maximum lateness. Starting from 𝒪, we can reach 𝒫 by at most C(n,2) swaps (cf. bubble sort), and max lateness never increases. So 𝒫 is also optimal.

■ QED

The Key Swap — Visualized

Whenever two consecutive tasks are out of deadline order, swapping them never makes things worse:

❌ Before swap — Wrong order

aᵢ (d=9)

aⱼ (d=6)

aⱼ misses deadline!

aᵢ runs first (later deadline), aⱼ ends up late

✅ After swap — Deadline order

aⱼ (d=6)

aᵢ (d=9)

Max lateness ↓

aⱼ runs first (earlier deadline), aᵢ's lateness ≤ aⱼ's old lateness

Proof technique: Exchange argument

This is a recurring proof pattern: show that any optimal solution can be transformed into our greedy solution via a sequence of local swaps, each of which doesn't make things worse. Therefore greedy is also optimal.

05 Scheduling Without Lateness

Now a harder variant: instead of minimizing lateness, we want to complete as many tasks as possible without any lateness at all. We must select a subset of tasks.

Problem Setup

Maximum Cardinality Feasible Subset

Given tasks A = {a₁,…,aₙ} each with processing time tᵢ and deadline dᵢ, find the largest subset of tasks that can all be completed on time (no task is late).

Key difference

In the previous problem, every task was scheduled (some could be late). Here, we choose which tasks to skip entirely to maximize the count of on-time completions.

The Greedy Algorithm — Sorted Deadline + Max Heap

The algorithm considers tasks in deadline order. It greedily adds each task, but if the running total exceeds the deadline, it drops the longest task seen so far.

Algorithm

Greedy Maximum Cardinality Scheduler

Sort tasks by deadline: d₁ ≤ d₂ ≤ … ≤ dₙ H ← empty max-heap (keyed by processing time) total ← 0 for i = 1 to n: H.insert(aᵢ) // always add task first total ← total + tᵢ if total > dᵢ: // total time exceeds deadline ak ← H.removeMax() // drop the longest task total ← total - t_k return tasks in H, scheduled in deadline order

Runtime: O(n log n) — sorting + n heap operations each O(log n).

Worked Example

Task	Processing Time (t)	Deadline (d)
P	4	5
Q	3	5
R	2	6
S	1	6

Tasks already sorted by deadline. Step through the algorithm in the simulator below.

06 Interactive Heap Scheduler

Watch the greedy algorithm process tasks one by one. The max-heap holds the current candidate set. When the total time overflows a deadline, the longest task gets ejected.

🗂️ Max-Heap Scheduler Interactive

Step: —

Total time in H: 0

Press Next to begin. The algorithm considers tasks P, Q, R, S in deadline order.

Current Candidate Set H (Max-Heap by t)

Empty

Rejected Tasks

None yet

Simulator ready. Press Next to begin.

07 Proof of Optimality

Theorem

The greedy algorithm finds a maximum-cardinality feasible subset.

Proof — Mathematical Induction

Base case: n = 1 is trivial.

Inductive step: Assume the claim holds for n−1 tasks. Consider n tasks.

If H contains all n tasks, it's clearly optimal. Otherwise, suppose task aₖ was the first task removed when we added some task aᵢ. This means the total processing time of tasks {a₁,…,aᵢ} exceeds dᵢ, and aₖ was the longest among them.

Key claim: There is an optimal solution O that does not include aₖ. Suppose some optimal O' does include aₖ. Since Σⱼ₌₁ᵢ tⱼ > dᵢ, not all of a₁,…,aᵢ can be in O'. So some task aₗ (1≤l≤i) is absent from O'. Replace aₖ with aₗ in O': since tₖ ≥ tₗ (aₖ was max), the total processing time only decreases. All tasks a₁,…,aᵢ remain feasible, and tasks after aᵢ are scheduled no later than before (since tₖ ≥ tₗ means the swap doesn't push them out). So the resulting O is also optimal and excludes aₖ.

By the induction hypothesis, the greedy algorithm on A\{aₖ} is optimal for A\{aₖ} — and it produces exactly H. Since O ⊆ A\{aₖ}, |H| = |O|, so H is optimal for the original problem.

■ QED

Proof technique: Greedy stays ahead

We show that for each task the greedy rejects, there's an "equivalent" replacement — so the greedy never falls behind optimal in count. The inductive structure lets us handle one task at a time.

08 Clustering

"Clustering arises whenever one has a collection of objects — for example, photographs, documents, or microorganisms — that one is trying to classify or organize into coherent groups."

Setting

k-Clustering Problem

Given n objects U = {p₁,…,pₙ} with a distance function d(pᵢ,pⱼ) ≥ 0 (symmetric, zero iff same), and an integer k, partition U into k non-empty groups C₁,…,Cₖ such that the spacing is maximized.

Definition

Spacing of a k-Clustering

\[\text{spacing}(C) = \min\{d(p_i, p_j) \mid p_i \in C_s,\ p_j \in C_t,\ s \neq t\}\]

The spacing is the minimum distance between any two points in different clusters. Maximizing this means clusters are as "far apart" as possible.

Why spacing?

We want inter-cluster distances to be large and intra-cluster distances to be small. Maximizing spacing captures this: it ensures the closest pair of objects from different clusters is as far apart as possible.

The Greedy Algorithm — Kruskal's MST

Algorithm

Maximum Spacing k-Clustering

// Build a weighted complete graph on the n objects Run Kruskal's MST algorithm on this graph, but stop when exactly k connected components remain. // Equivalently: Build the full MST, then remove the (k−1) most expensive edges. return the resulting k connected components as clusters.

The spacing of the resulting clustering is the weight of the (k−1)th most expensive MST edge, call it d*.

🌲

Kruskal builds bridges

Greedily add shortest edges that don't form a cycle. When k components remain, the last k−1 edges added to complete the MST are exactly the "longest bridges" — cutting them maximizes spacing.

✂️

Cut the k−1 longest

After building the full MST, removing the k−1 most expensive edges splits the tree into exactly k subtrees. These subtrees form the k clusters.

09 Interactive Cluster Visualizer

Watch Kruskal's algorithm build the MST edge by edge, then see the k-clustering emerge when the k−1 longest MST edges are cut.

🗂️ k-Clustering via Kruskal Interactive

k =

Press Next to start building the MST. Edges are added in order of increasing weight.

Candidate edge

MST edge (kept)

Cut edge (spacing boundary)

10 Proof of Optimality

Theorem

The Kruskal-based k-clustering has maximum spacing.

Full Proof

Let C = {C₁,…,Cₖ} be the k-clustering produced by the algorithm. The spacing of C is d* — the (k−1)th most expensive MST edge weight.

Consider any other k-clustering C' = {C'₁,…,C'ₖ}. We must show spacing(C') ≤ d*.

Since C ≠ C', there is some cluster Cᵣ ∈ C that is not a subset of any cluster in C'. Therefore Cᵣ contains two points pᵢ and pⱼ that lie in different clusters in C' — say pᵢ ∈ C'ₛ and pⱼ ∈ C'ₜ with s ≠ t.

Since pᵢ and pⱼ ended up in the same cluster Cᵣ in C, Kruskal must have connected them via a path P in the MST before stopping. Every edge on P has weight ≤ d* (since all MST edges were added before the k−1 heaviest ones were "cut").

Now pᵢ ∈ C'ₛ but pⱼ ∉ C'ₛ. Walking along path P from pᵢ to pⱼ, there must be a first vertex p' that is not in C'ₛ. Let p be the vertex immediately before p' on P. Then d(p, p') ≤ d*, and p ∈ C'ₛ while p' ∈ C'ₜ for some t ≠ s. So d(p, p') is a distance between different clusters in C'.

Therefore: spacing(C') ≤ d(p, p') ≤ d*.

■ QED

Proof technique: Contradiction via path argument

Any alternative clustering must "cross" some MST edge of weight ≤ d*, because Kruskal connected all same-cluster points via cheap edges. So no other clustering can have spacing > d*.

Intuition: Why Kruskal Works for Clustering

Same cluster → cheap path

Any two objects in the same cluster are connected by a path where every edge weighs ≤ d*. Kruskal merged them before the expensive cuts.

Different clusters → cheap crossing

Any alternative clustering that doesn't agree with ours must have some pair of nearby points in different clusters — bounded by d*. So its spacing can't beat ours.

Chapter Summary

Problem	Greedy Rule	Data Structure	Complexity	Proof Technique
Min-Lateness Scheduling	Earliest Deadline First	Sort	O(n log n)	Exchange argument
Max-Cardinality Feasible	Sort by deadline + drop longest	Max-heap	O(n log n)	Induction + replacement
Max-Spacing k-Clustering	Kruskal MST + cut k−1 edges	Union-Find	O(n² log n)	Path argument

The greedy philosophy

Invent: identify the "obviously right" local rule. Verify: prove it's globally optimal via exchange arguments, induction, or contradiction. Beware: greedy fails silently — always prove correctness, never assume it.

🔄

Exchange Argument

Show any optimal solution can be transformed into the greedy solution via swaps, without getting worse. Powerful for scheduling.

📐

Induction + Replace

Show the greedy's excluded element can always be swapped into any optimal solution. Reduces the problem size at each step.

🛤️

Path / Cut Argument

For graph problems: use the structure of MST paths to bound what any competing solution can achieve.