L2 Greedy
Lecture 2:
Greedy algorithms
Bart M. P. Jansen
Optimization problems
• For each instance there are (possibly) multiple valid solutions
• Goal is to find an optimal solution
• minimization problem:
associate cost to every solution, find min-cost solution
• maximization problem:
associate profit to every solution, find max-profit solution
Techniques for optimization
Optimization problems typically involve making choices
Backtracking: systematically try all solutions
• can be applied to almost all problems, but gives very slow algorithms
• try all options for first choice,
for each option, recursively make other choices
Greedy algorithms: construct solution iteratively, always make the choice that seems best
Algorithms for optimization: how to improve on backtracking
1. try to discover structure of optimal solutions:
what properties do optimal solutions have ?
– what are the choices that need to be made ?
– do we have optimal substructure ?
optimal solution = first choice + optimal solution for subproblem
– do we have greedy-choice property for the first choice ?
Today: Two examples of greedy algorithms
Activity selection problem
What are the choices ? What properties does optimal solution have?
Proof of optimal substructure
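Proof of greedy choice property
Lemma: Let A be a set of activities and let a_i be an activity in A that ends first. Then there is an optimal solution for A that includes a_i.
Proof. Let OPT be an optimal solution for A. If OPT includes a_i then the lemma obviously holds.
So it suffices to handle the case that OPT does not include a_i.
We will show how to modify OPT into a solution OPT* such that:
(i) OPT* is a valid solution
(ii) OPT* includes a_i
(iii) size(OPT*) ≥ size(OPT)
Let a_k be the activity in OPT that ends first, and let OPT* = (OPT \ {a_k}) ∪ {a_i}. Since a_i ends no later than a_k, it does not overlap any other activity in OPT, so OPT* is valid and has the same size as OPT.
[Figure: timeline comparing the greedy choice a_i with the activities in OPT]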
Algorithm for Activity Selection
Algorithm Greedy-Activity-Selection(A)
1. if A is empty
2. then return the empty set
3. else a_i ← an activity from A that ends first
4. B ← all activities from A that do not overlap a_i
5. return {a_i} ∪ Greedy-Activity-Selection(B)
Correctness:
• by induction on |A|, using optimal substructure and greedy-choice property
Running time:
• O(n^2) if implemented naively
• O(n) after sorting on finishing time, if implemented more cleverly
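The following is a minimal Python sketch of the two implementations mentioned above: the direct recursive one and the one that sorts on finishing time first. It assumes each activity is a (start, end) pair with start < end, and that activities that only touch at an endpoint do not overlap. The function names are illustrative and not taken from the lecture.

```python
from typing import List, Tuple

Activity = Tuple[float, float]  # (start, end), assumed to satisfy start < end


def overlaps(a: Activity, b: Activity) -> bool:
    """Two activities overlap if their open time intervals intersect."""
    return a[0] < b[1] and b[0] < a[1]


def greedy_recursive(activities: List[Activity]) -> List[Activity]:
    """Direct transcription of the recursive algorithm (quadratic time)."""
    if not activities:
        return []
    first = min(activities, key=lambda a: a[1])  # an activity that ends first
    # 'first' overlaps itself, so it is dropped from the subproblem as well.
    rest = [a for a in activities if not overlaps(a, first)]
    return [first] + greedy_recursive(rest)


def greedy_sorted(activities: List[Activity]) -> List[Activity]:
    """Sort on finishing time, then make a single linear scan."""
    chosen: List[Activity] = []
    last_end = float("-inf")
    for start, end in sorted(activities, key=lambda a: a[1]):
        if start >= last_end:  # does not overlap the last chosen activity
            chosen.append((start, end))
            last_end = end
    return chosen
```

Both functions make the same greedy choice at every step; the second avoids recomputing the minimum and the filtered set in every round.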
Intermezzo: More greedy algorithms
Lecture hall assignment
Intermezzo: More greedy algorithms
Minimum Spanning Tree
Greedy strategy: starting from the subgraph containing all vertices but no edges,
repeatedly add an edge of minimum weight connecting two distinct
trees
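The strategy above can be sketched in Python as follows. This is one possible realization, assuming vertices are numbered 0 to num_vertices − 1 and edges are given as (weight, u, v) triples. It uses a simple union-find structure to test whether an edge connects two distinct trees; the function name and the input format are assumptions made for illustration.

```python
from typing import List, Tuple

Edge = Tuple[float, int, int]  # (weight, u, v)


def greedy_spanning_tree(num_vertices: int, edges: List[Edge]) -> List[Edge]:
    """Repeatedly add a minimum-weight edge connecting two distinct trees."""
    parent = list(range(num_vertices))  # every vertex starts as its own tree

    def find(v: int) -> int:
        # Walk up to the representative of v's tree, shortening the path.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree: List[Edge] = []
    for weight, u, v in sorted(edges):  # consider edges by increasing weight
        root_u, root_v = find(u), find(v)
        if root_u != root_v:            # endpoints lie in distinct trees
            parent[root_u] = root_v     # merge the two trees
            tree.append((weight, u, v))
    return tree
```

If the input graph is connected, the returned list contains num_vertices − 1 edges.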
Intermezzo: More greedy algorithms
Disjoint paths between vertices on the exterior of a drawing
Greedy strategy: pick a path along the exterior, remove its vertices,
repeat
Intermezzo: More greedy algorithms
1-dimensional clustering
Greedy strategy (tempting, but incorrect!):
find the points for which the distance to the next point is largest,
make these points the ending points of clusters
[Figure: counterexample with k = 2 and distances 1 and 1 + ε]
Today: Two examples of greedy algorithms
“bla bla …”
0100110000010000010011000001 …
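The encoding problem
Input: set C of characters, with for each character c_i its frequency f(c_i)
Decoding must be unambiguous. Example: text = 0100101100 …
Does it start with a character encoded as 01001, or with one encoded as 010, or …?
A prefix code, in which no codeword is a prefix of another codeword, avoids this ambiguity.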
Does a variable-length prefix encoding help?
Text: “een□voordeel”
Frequencies: e=4 n=1 v=1 o=2 r=1 d=1 l=1 □=1
fixed-length code:
e=000 n=001 v=010 o=011 r=100 d=101 l=110 □=111
length of encoded text: 12 × 3 = 36 bits
Representing prefix codes
Text: “een□voordeel”
code: e=00 n=0110 v=0111 o=010 r=100 d=101 l=110 □=111
representation is binary tree T:
• one leaf for each character
• internal nodes always have two outgoing edges, labeled 0 and 1
• code of character: follow path from root to leaf and list bits
binary trees that are not full represent redundant prefix codes
[Figure: tree T for the code above, with leaf e at depth 2, leaves o, r, d, l, □ at depth 3, and leaves n, v at depth 4]
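To make the comparison concrete, here is a small Python sketch that encodes the example text with both the fixed-length code and the variable-length prefix code from the slides, then decodes the result again. The helper names encode and decode are illustrative. The decoder relies on the prefix property: as soon as the bits read so far form a codeword, that codeword must be the next character.

```python
FIXED = {"e": "000", "n": "001", "v": "010", "o": "011",
         "r": "100", "d": "101", "l": "110", "□": "111"}
PREFIX = {"e": "00", "n": "0110", "v": "0111", "o": "010",
          "r": "100", "d": "101", "l": "110", "□": "111"}


def encode(text: str, code: dict) -> str:
    """Concatenate the codewords of the characters of the text."""
    return "".join(code[ch] for ch in text)


def decode(bits: str, code: dict) -> str:
    """Decode a bit string; correct only if 'code' is a prefix code."""
    inverse = {word: ch for ch, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:          # a complete codeword has been read
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)


text = "een□voordeel"
print(len(encode(text, FIXED)))   # length with the fixed-length code
print(len(encode(text, PREFIX)))  # length with the variable-length prefix code
```

For this text the fixed-length code needs 36 bits and the prefix code 34 bits, so the saving is small here; it grows when the frequencies are more skewed.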
Representing prefix codes
What is the cost of encoding the text “never” with this prefix code?
4 + 2 + 4 + 2 + 3 = 15 bits
cost of encoding represented by T: cost(T) = Σ_i f(c_i) · depth(c_i)
[Figure: tree T with the frequencies at the leaves: e=4, o=2, r=1, d=1, l=1, □=1, n=1, v=1]
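As a worked check of the cost formula, the sketch below computes the depth of every leaf and then sums frequency times depth. The nested-pair representation of the tree, where a leaf is a character and an internal node is a (left, right) pair, is an assumption made for this example, as are the function names.

```python
# Tree for the code e=00 o=010 n=0110 v=0111 r=100 d=101 l=110 □=111.
# The left element of a pair is the 0-edge, the right element the 1-edge.
TREE = (("e", ("o", ("n", "v"))), (("r", "d"), ("l", "□")))
FREQ = {"e": 4, "o": 2, "n": 1, "v": 1, "r": 1, "d": 1, "l": 1, "□": 1}


def leaf_depths(tree, depth=0):
    """Map every character to the depth of its leaf."""
    if isinstance(tree, str):           # a leaf
        return {tree: depth}
    left, right = tree                  # an internal node has two children
    depths = leaf_depths(left, depth + 1)
    depths.update(leaf_depths(right, depth + 1))
    return depths


def cost(tree, freq):
    """Sum of f(c) * depth(c) over all characters c."""
    return sum(freq[ch] * d for ch, d in leaf_depths(tree).items())


print(cost(TREE, FREQ))
```

With the frequencies of “een□voordeel” this gives 4·2 + 2·3 + 4·(1·3) + 2·(1·4) = 34 bits.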
Designing greedy algorithms
1. try to discover structure of optimal solutions:
what properties do optimal solutions have ?
– what are the choices that need to be made ?
– do we have optimal substructure ?
optimal solution = first choice + optimal solution for subproblem
– do we have greedy-choice property for the first choice ?
David A. Huffman: how to make choices that determine a tree
Bottom-up construction of the tree
start with n separate leaves, and then “merge” n-1 times until we have the tree
choices: which subtrees to merge at every step
character: c1 c2 c3 c4 c5 c6 c7 c8
frequency: 4 2 1 1 1 1 1 1
Greedy choice: first merge two leaves with smallest character frequency
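Lemma: Let c_i, c_k be two characters with the lowest frequency in C.
Then there is an optimal tree for C where c_i, c_k are siblings.
Proof. Let T_OPT be an optimal tree for C. If c_i, c_k are siblings in T_OPT then the lemma obviously holds.
Otherwise, we will show how to modify T_OPT into a tree T* such that:
(i) T* is a valid tree
(ii) c_i, c_k are siblings in T*
(iii) cost(T*) ≤ cost(T_OPT)
Thus T* is an optimal tree in which c_i, c_k are siblings, and so the lemma holds.
To modify T_OPT we proceed as follows.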
How to modify T_OPT?
• take a deepest internal node v; its children c_s and c_m are leaves
• swap c_i with c_s, and swap c_k with c_m
• c_i is at depth d1 and c_s is at depth d2 ≥ d1, with f(c_i) ≤ f(c_s)
• cost decrease due to swapping c_i and c_s: (f(c_s) − f(c_i)) · (d2 − d1) ≥ 0
• likewise f(c_k) ≤ f(c_m), so swapping c_k and c_m does not increase the cost either
[Figure: T_OPT before and after the swaps; afterwards c_i and c_k are the children of v]
character: c1 c2 c3 c4 c5 c6 c7 c8 b
frequency: 4 2 1 1 1 1 1 1 2
Yes, we have a subproblem of the same type: after merging, replace merged
leaves c_i, c_k by a single leaf b with f(b) = f(c_i) + f(c_k)
(other way of looking at it: problem is about merging weighted subtrees)
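Lemma: Let c_i and c_k be siblings in some optimal tree T_OPT for set C of characters.
Let B = (C \ {c_i, c_k}) ∪ {b}, where f(b) = f(c_i) + f(c_k).
Let T_B be an optimal tree for B.
Then replacing the leaf for b in T_B by an internal node v with c_i, c_k as
children results in an optimal tree T_C for C.
Proof. Based on two claims:
(i) cost(T_B) ≤ cost(T_OPT) − f(c_i) − f(c_k)
(ii) T_C is a valid tree for C with cost cost(T_B) + f(c_i) + f(c_k)
Together these give cost(T_C) ≤ cost(T_OPT), so T_C is optimal.
(See Lemma 15.3.)
[Figure: T_B with leaf b, and T_C in which b is replaced by internal node v with children c_i, c_k]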
The algorithm
Algorithm Construct-Huffman-Tree(C: set of n characters)
1. if |C| = 1
2. then return a tree consisting of single leaf, storing the character in C
3. else c_i, c_k ← two characters from C with lowest frequency
4. Remove c_i, c_k from C, and replace them by a new character b
with f(b) = f(c_i) + f(c_k). Let B denote the new set of characters.
5. T_B ← Construct-Huffman-Tree(B)
6. Replace leaf for b in T_B with internal node with c_i, c_k as children.
7. Let T_C be the new tree.
8. return T_C
Correctness:
• by induction, using optimal substructure and greedy-choice property
Running time:
• O(n^2) by straightforwardly following the code above
A faster version of the algorithm
Algorithm Faster-Huffman-Tree(C: set of n characters)
1. n ← |C|
2. build a priority queue Q storing C, using frequency values as keys
3. for i ← 1 to n − 1
4. create an internal node z
5. left-child(z) ← Extract-Min(Q)
6. right-child(z) ← Extract-Min(Q)
7. f(z) ← f(left-child(z)) + f(right-child(z))
8. insert z into Q
9. return Extract-Min(Q) (pointer to the root of the tree)
Correctness:
• resulting tree is the same as for Construct-Huffman-Tree
Running time:
• O(n log n) if implemented smartly (use heap)
• Sorting + O(n) if implemented even smarter (hint: 2 queues)
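The two faster variants can be sketched in Python as follows. This is an illustrative sketch, not the lecture's reference code: it assumes a non-empty dictionary mapping characters to frequencies, and it represents a tree as either a character (leaf) or a (left, right) pair (internal node). The heap version uses the standard heapq module. The two-queue version is one possible reading of the hint: keep the sorted leaves in one queue and the merged subtrees, which are created in nondecreasing order of frequency, in a second queue.

```python
import heapq
from collections import deque
from itertools import count


def huffman_heap(freq):
    """Heap-based construction: O(n log n)."""
    tiebreak = count()  # unique sequence number, so trees are never compared
    heap = [(f, next(tiebreak), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    for _ in range(len(freq) - 1):
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    return heap[0][2]  # root of the tree


def huffman_two_queues(freq):
    """Two-queue construction: sorting, then O(n) for the merging."""
    by_frequency = sorted(freq.items(), key=lambda item: item[1])
    leaves = deque((f, ch) for ch, f in by_frequency)
    merged = deque()  # merged subtrees, in nondecreasing order of frequency

    def pop_smallest():
        # The overall minimum is always at the front of one of the queues.
        if not merged or (leaves and leaves[0][0] <= merged[0][0]):
            return leaves.popleft()
        return merged.popleft()

    for _ in range(len(freq) - 1):
        f1, left = pop_smallest()
        f2, right = pop_smallest()
        merged.append((f1 + f2, (left, right)))
    return (merged or leaves)[0][1]


def tree_cost(tree, freq, depth=0):
    """Sum of f(c) * depth(c) over all leaves c of the tree."""
    if not isinstance(tree, tuple):
        return freq[tree] * depth
    return sum(tree_cost(child, freq, depth + 1) for child in tree)
```

The two functions may break ties differently and so return different trees, but both trees have optimal cost. For the frequencies of “een□voordeel” that cost is 34 bits, the same as for the prefix code used in the earlier slides.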
Summary
• greedy algorithm solves optimization problem by:
– trying only one option for first choice (the greedy choice),
– then solving subproblem recursively