1 Greedy

CSE 202 discusses greedy algorithms and minimum spanning trees. It describes how to find the cheapest set of connections between computers using a minimum spanning tree (MST). The key properties of MSTs are that the lightest edge in any cut is included, while the heaviest edge in a cycle is not. Kruskal's algorithm finds the MST by sorting edges by weight and adding edges between different components. It uses a union-find data structure to efficiently determine components.

CSE 202: Design and Analysis of Algorithms

Lecture 2
Greedy Algorithms

• Minimum Spanning Trees
• The Union/Find Data Structure
A Network Design Problem

Problem: Given distances between a set of computers, find the
cheapest set of pairwise connections so that they are all connected.

Graph-Theoretic Formulation:
Node = Computer
Edge = Pair of computers
Edge Cost(u,v) = Distance(u,v)

Find a subset of edges T such that the cost of T is minimum
and all nodes are connected in (V, T)

Can T contain a cycle? No: removing an edge from a cycle keeps all nodes
connected and only lowers the cost.
The solution is connected and acyclic, so it is a tree
Trees

A connected, undirected and acyclic graph is called a tree

(Figure: one example tree and two non-trees.)

Property 1. A tree on n nodes has exactly n - 1 edges

Proof. By induction on the edges added.
Base Case: n nodes, no edges, n connected components
Inductive Case: Add an edge between two connected components
  No cycle is created
  #components decreases by 1
At the end: 1 component

How many edges were added? n - 1, since each edge reduces #components by exactly one
Trees

A connected, undirected and acyclic graph is called a tree

Property 1. A tree on n nodes has exactly n - 1 edges
Is any graph on n nodes and n - 1 edges a tree?

Property 2. Any connected, undirected graph on n nodes
and n - 1 edges is a tree

Proof: Suppose G = (V, E) is connected, undirected, and has some cycles.
While G has a cycle, remove an edge from this cycle.
Result: G' = (V, E') where G' is a tree. So |E'| = n - 1 = |E|.
Thus E = E', no edge was ever removed, and G is a tree
Minimum Spanning Trees (MST)

Problem: Given distances between a set of computers, find the
cheapest set of pairwise connections so that they are all connected.

Graph-Theoretic Formulation:
Node = Computer
Edge = Pair of computers
Edge Cost(u,v) = Distance(u,v)

Find a subset of edges T such that the cost of T is minimum
and all nodes are connected in (V, T)

Goal: Find a spanning tree T of the graph G with minimum total cost
We'll see a greedy algorithm to construct T
Properties of MSTs

For a cut (S, V\S), the lightest edge in the cut is the minimum cost edge
that has one end in S and the other in V\S.

Property 1. A lightest edge in any cut always belongs to an MST

Proof. Suppose not.
Let e = lightest edge in (S, V\S), T = MST, e not in T.
T U {e} has a cycle, which must contain another edge e' across (S, V\S).
Let T' = T \ {e'} U {e}.
cost(T') = cost(T) + cost(e) - cost(e') < cost(T), contradicting T's minimality.
(With ties, cost(e) <= cost(e'), and a lightest edge belongs to some MST.)
Properties of MSTs

The heaviest edge in a cycle is the maximum cost edge in the cycle
Property 2. The heaviest edge in a cycle never belongs to an MST

Proof. Suppose not. Let T = MST, e = heaviest edge in some cycle C, e in T.
Delete e from T to get subtrees T1 and T2.
The cycle C must contain another edge e' across the cut (T1, V \ T1),
and since e is the heaviest edge in C, cost(e') < cost(e).
Let T' = T \ {e} U {e'}.
cost(T') = cost(T) - cost(e) + cost(e') < cost(T), contradicting T's minimality.
Summary: Properties of MSTs

Property 1. A lightest edge in any cut always belongs to an MST

Property 2. The heaviest edge in a cycle never belongs to an MST

A Generic MST Algorithm

X = { }
While there is a cut (S, V\S) s.t. X has no edges across it
  X = X + {e}, where e is the lightest edge across (S, V\S)

Does this output a tree?
At each step, no cycle is created (e connects two components of X)
The loop continues while there are disconnected components

Why does this produce an MST?
A Generic MST Algorithm

X = { }
While there is a cut (S, V\S) s.t. X has no edges across it
  X = X + {e}, where e is the lightest edge across (S, V\S)

Proof of correctness by induction.
Base Case: At t=0, X = { } is contained in some MST T
Induction: Assume at t=k, X is contained in some MST T
Suppose we add e, the lightest edge across (S, V\S), to X at t=k+1
If e is in T, we are done. Otherwise, adding e to T forms a cycle C
Let e' = another edge in C across (S, V\S), and T' = T \ {e'} U {e}
cost(T') = cost(T) - cost(e') + cost(e) <= cost(T)
Thus, T' is an MST that contains X U {e}
Kruskal’s Algorithm

X = { }
For each edge e in increasing order of weight:
  If the end-points of e lie in different components in X,
    Add e to X

Why does this work correctly? It is an instance of the generic algorithm:
each chosen edge is the lightest edge across the cut that separates its two components.

Efficient Implementation: Need a data structure with properties:
- Maintain disjoint sets of nodes
- Merge sets of nodes (union)
- Find if two nodes are in the same set (find)
This is the Union-Find data structure.
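As a sketch, Kruskal's algorithm with a minimal union-find (the function name and the example graph are illustrative, not from the lecture):

```python
def kruskal(n, edges):
    """Minimum spanning tree via Kruskal's algorithm.
    n: number of nodes (0..n-1); edges: list of (weight, u, v)."""
    parent = list(range(n))

    def find(x):
        # find root, with path halving (a form of path compression)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges):      # edges in increasing order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                   # end-points in different components
            parent[ru] = rv            # union the two components
            mst.append((u, v))
            total += w
    return total, mst
```

On a small 4-node graph the call `kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)])` returns total cost 7 with 3 edges, skipping the weight-3 edge that would close a cycle.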
The Union-Find Data Structure

procedure makeset(x)
  p[x] = x
  rank[x] = 0

procedure find(x)
  if x ≠ p[x]:
    p[x] = find(p[x])
  return p[x]

procedure union(x,y)
  rootx = find(x)
  rooty = find(y)
  if rootx = rooty: return
  if rank[rootx] > rank[rooty]:
    p[rooty] = rootx
  else:
    p[rootx] = rooty
    if rank[rootx] = rank[rooty]:
      rank[rooty]++
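The pseudocode above translates almost line-for-line into Python; this sketch mirrors it (the class and method names are my own):

```python
class UnionFind:
    """Disjoint sets with union by rank and path compression,
    mirroring the makeset/find/union pseudocode."""

    def __init__(self):
        self.p = {}      # parent pointers
        self.rank = {}

    def makeset(self, x):
        self.p[x] = x
        self.rank[x] = 0

    def find(self, x):
        if x != self.p[x]:
            self.p[x] = self.find(self.p[x])   # path compression
        return self.p[x]

    def union(self, x, y):
        rootx, rooty = self.find(x), self.find(y)
        if rootx == rooty:
            return
        if self.rank[rootx] > self.rank[rooty]:
            self.p[rooty] = rootx
        else:
            self.p[rootx] = rooty
            if self.rank[rootx] == self.rank[rooty]:
                self.rank[rooty] += 1
```

Running the lecture's example (makeset(a)..makeset(h), then the seven unions) leaves all eight elements in a single set.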
The Union-Find Data Structure: Example

makeset(a), ..., makeset(h)
union(a, b), union(c, d), union(e, f), union(g, h),
union(f, g), union(b, c), union(h, d), find(e)

(Figures: the forest after each operation. Each union points the root of the
lower-rank tree at the root of the higher-rank tree, so the eight singletons
merge into a single tree of rank 3; the final find(e) compresses the path
from e to the root.)
The Union-Find Data Structure

Fact 1: Total time for m find operations = O((m+n) log*n)
Fact 2: Time for each union operation = O(1) + Time(find)
Fact 3: Total time for m find and n union ops = O((m+n) log*n)

log*n = min{ k : log log ... log (k times) n <= 1 }
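The iterated logarithm log*n grows extremely slowly; a small sketch computes it directly from the definition:

```python
from math import log2

def log_star(n):
    """Iterated logarithm: the number of times log2 must be applied
    before the value drops to <= 1."""
    k = 0
    while n > 1:
        n = log2(n)
        k += 1
    return k
```

For example, log*(65536) = 4 (65536 → 16 → 4 → 2 → 1), and log*n <= 5 for every n up to 2^65536.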
The Union-Find Data Structure

Property 1: If x is not a root, then rank[p[x]] > rank[x]
Proof: By the way union sets parent pointers; rank strictly increases along them

Property 2: For a root x, if rank[x] = k, then the subtree at x has size >= 2^k
Proof: By induction on the sequence of unions

Property 3: There are at most n/2^k nodes of rank k
Proof: Combining Properties 1 and 2: the subtrees rooted at rank-k nodes
are disjoint, and each has size >= 2^k
The Union-Find Data Structure

Property 1: If x is not a root, then rank[p[x]] > rank[x]
Property 2: For a root x, if rank[x] = k, then the subtree at x has size >= 2^k
Property 3: There are at most n/2^k nodes of rank k

Break up the ranks 1..n into intervals of the form I_k = {k+1, k+2, ..., 2^k}
Example: {1}, {2}, {3, 4}, {5, ..., 16}, {17, ..., 65536}, ...
How many such intervals? log*n
Charging Scheme: For a non-root x with rank[x] in I_k, set t(x) = 2^k
Running time of m find operations

Interval I_k = {k+1, ..., 2^k}; #intervals = log*n

Two types of nodes on a find path:
1. rank[x] and rank[p[x]] lie in different intervals
2. rank[x] and rank[p[x]] lie in the same interval

When a type 2 node is touched, path compression gives it a parent of strictly higher rank
So the total time charged to a type 2 node before it becomes type 1 is <= 2^k
Running time of m find operations

Interval I_k = {k+1, ..., 2^k}; #intervals = log*n

Total time on type 1 nodes <= m log*n
(each find path crosses at most log*n interval boundaries)
Total time on a type 2 node x <= t(x) = 2^k
Total time on m find operations <= m log*n + Σ t(x)
The Union-Find Data Structure

Break up the ranks 1..n into intervals I_k = {k+1, k+2, ..., 2^k}; #intervals = log*n
Charging Scheme: If rank[x] is in I_k, set t(x) = 2^k
Total time on m find operations <= m log*n + Σ t(x)
Therefore, we need to estimate Σ t(x)
The Union-Find Data Structure

From Property 3, #nodes with rank in I_k is at most:
n/2^(k+1) + n/2^(k+2) + ... < n/2^k
Each such node is charged t(x) = 2^k,
so for each interval I_k, Σ_{x in I_k} t(x) <= n
As #intervals = log*n, Σ t(x) <= n log*n
Summary: Union-Find Data Structure

procedure makeset(x)
  p[x] = x
  rank[x] = 0

procedure find(x)
  if x ≠ p[x]:
    p[x] = find(p[x])
  return p[x]

procedure union(x,y)
  rootx = find(x)
  rooty = find(y)
  if rootx = rooty: return
  if rank[rootx] > rank[rooty]:
    p[rooty] = rootx
  else:
    p[rootx] = rooty
    if rank[rootx] = rank[rooty]:
      rank[rooty]++

Property 1: Total time for m find operations = O((m+n) log*n)
Property 2: Time for each union operation = O(1) + Time(find)
Summary: Kruskal’s Algorithm Running Time

X = { }
For each edge e in increasing order of weight:
  If the end-points of e lie in different components in X,
    Add e to X

Sorting the edges = O(m log m) = O(m log n)
Checking if the end-points of e lie in different components = Find operation
Adding e to X = Union operation = O(1) + Time(Find)

Total time = Sort + O(n) Unions + O(m) Finds = O(m log n)
With pre-sorted edges, time = O(n) Unions + O(m) Finds = O(m log*n)
MST Algorithms

• Kruskal’s Algorithm: Union-Find Data Structure


• Prim’s Algorithm: How to Implement?
Prim’s Algorithm

X = { }, S = {r}
Repeat until S has n nodes:
  Pick the lightest edge e in the cut (S, V - S)
  Add e to X
  Add v, the end-point of e in V - S, to S

How to implement Prim’s algorithm?

Need a data structure for edges with the operations:
1. Add an edge
2. Delete an edge
3. Report the edge with min weight
Data Structure: Heap

(Figure: a binary heap; each node x stores a key, key(x).)

Heap Property: If x is the parent of y, then key(x) <= key(y)

A heap is stored as a balanced binary tree
Height = O(log n), where n = # nodes
Heap: Reporting the min

Heap Property: If x is the parent of y, then key(x) <= key(y)

Report the root node
Time = O(1)
Heap: Add an item

(Figures: adding keys 1 and 2 to a small heap; each new item sifts up past larger parents.)

Heap Property: If x is the parent of y, then key(x) <= key(y)

Add item u to the end of the heap
While the heap property is violated, swap u with its parent
Time = O(log n)
Heap: Delete an item

(Figures: deleting an item from a heap; the last item v replaces it,
then sifts down in Case 1 or up in Case 2.)

Heap Property: If x is the parent of y, then key(x) <= key(y)

Delete item u
Move v, the last item, to u's position

If the heap property is violated:
Case 1. key[v] > key[child[v]]
  Swap v with its lowest-key child
  Continue until the heap property holds
Case 2. key[v] < key[parent[v]]
  Swap v with its parent
  Continue until the heap property holds
Time = O(log n)
Summary: Heap

Heap Property: If x is the parent of y, then key(x) <= key(y)

Operations:
Add an element: O(log n)
Delete an element: O(log n)
Report min: O(1)
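A sketch of these operations using Python's heapq module. heapq does not support deleting an arbitrary item, so this version uses lazy deletion (a standard workaround, not what the slides describe): deleted entries are only discarded when they surface at the root, which preserves the amortized bounds.

```python
import heapq

class LazyHeap:
    """Min-heap with add, report-min, and lazy delete."""

    def __init__(self):
        self.h = []            # the underlying binary heap of (key, item)
        self.dead = set()      # entries deleted but not yet removed

    def add(self, key, item):
        # O(log n): push and sift up
        heapq.heappush(self.h, (key, item))

    def delete(self, key, item):
        # O(1) now; the real removal is deferred to report_min
        self.dead.add((key, item))

    def report_min(self):
        # O(1) amortized: pop dead entries off the top, then peek
        while self.h and self.h[0] in self.dead:
            self.dead.discard(heapq.heappop(self.h))
        return self.h[0] if self.h else None
```

Adding keys 4, 8, 9, 7 and then 1 makes (1, ...) the minimum; deleting it restores (4, ...) as the root.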
Prim’s Algorithm

X = { }, S = {r}
Repeat until S has n nodes:
  Pick the lightest edge e in the cut (S, V - S)
  Add e to X
  Add v, the end-point of e in V - S, to S

Use a heap to store the edges between S and V - S

At each step:
1. Pick the lightest edge with a report-min
2. Delete all edges between v and S from the heap
3. Add all edges between v and V - S - {v}

#edge additions and deletions = O(m) (each edge is added and deleted at most once)
#report mins = O(n)
Heap Ops: Add O(log n), Delete O(log n), Report min O(1)
Total running time = O(m log n)
Summary: Prim’s Algorithm

X = { }, S = {r}
Repeat until S has n nodes:
Pick the lightest edge e in the cut (S,V - S)
Add e to X
Add v, the end-point of e in V - S to S

Implementation: Store edges from S to V - S using a heap


Running Time: O(m log n)
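A sketch of this implementation in Python. Instead of explicitly deleting from the heap the edges between v and S (step 2 on the slides), this version lazily discards stale entries when they are popped; the O(m log n) bound is the same. The adjacency-list format is an assumption for the sketch.

```python
import heapq

def prim(adj, r=0):
    """Prim's algorithm with a heap of candidate edges.
    adj: {u: [(weight, v), ...]} adjacency lists; r: start node."""
    n = len(adj)
    in_S = {r}                                  # the set S
    X, total = [], 0
    heap = [(w, r, v) for w, v in adj[r]]       # edges from S to V - S
    heapq.heapify(heap)
    while len(in_S) < n and heap:
        w, u, v = heapq.heappop(heap)           # lightest edge across the cut
        if v in in_S:
            continue                            # stale entry: both ends now in S
        in_S.add(v)
        X.append((u, v))
        total += w
        for w2, x in adj[v]:                    # new edges from v to V - S
            if x not in in_S:
                heapq.heappush(heap, (w2, v, x))
    return total, X
```

On the same 4-node example used for Kruskal, `prim` returns the same total MST cost of 7.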
MST Algorithms

• Kruskal’s Algorithm: Union-Find Data Structure


• Prim’s Algorithm: How to Implement?
• An Application of MST: Single Linkage Clustering
Single Linkage Clustering

Problem: Given a set of points, build a hierarchical clustering

Procedure:
Initialize: each point is a cluster
Until we have one cluster:
  Pick the two closest clusters C, C*
  Merge S = C U C*

Distance between two clusters:
d(C, C*) = min over x in C, y in C* of d(x, y)

Can you recognize this algorithm?
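(The intended answer: the merges are exactly the component merges made by Kruskal's algorithm on the complete graph of pairwise distances.) A naive quadratic sketch of the procedure, assuming a user-supplied distance function:

```python
def single_linkage(points, dist):
    """Hierarchical clustering by repeatedly merging the two closest
    clusters, where cluster distance is the minimum pairwise distance
    (single linkage). Returns the merged cluster created at each step."""
    clusters = [{p} for p in points]    # each point starts as a cluster
    merges = []
    while len(clusters) > 1:
        best = None                     # (distance, i, j) of closest pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(dist(x, y) for x in clusters[i] for y in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] |= clusters[j]      # merge the two closest clusters
        del clusters[j]
        merges.append(frozenset(clusters[i]))
    return merges
```

On four points on a line, 0, 1, 10, 11, the two nearby pairs merge first, then the two resulting clusters merge last.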
Greedy Algorithms

• Direct argument - MST


• Exchange argument - Caching
• Greedy approximation algorithms
Optimal Caching

(Figure: a main memory and a small cache.)

Given a sequence of memory accesses and a limited cache:
how do you decide which cache element to evict?

Note: We are given the future memory accesses for this problem,
which is usually not the case in practice; we use it here as a clean
application of greedy algorithms.
Optimal Caching: Example

M   a  b  c  b  c  b  a  a     Memory Access Sequence

S1  a  a  a  b  b  b  b  b     Cache Contents
    b  b  c  c  c  a  a  a

E   -  -  b  a  -  c  -  -     Evicted items

Given a sequence of memory accesses and a limited cache size,
how do you decide which cache element to evict?
Goal: Minimize #main memory fetches
Optimal Caching

M   a  b  c  b  c  b  a  a     Memory Access Sequence

S1  a  a  c  c  c  b  b  b     Cache Contents
    b  b  b  b  b  c  a  a

E   -  -  a  -  -  -  c  -     Evicted items

Farthest-First (FF) Schedule: Evict only when an item must be fetched;
evict the element which is accessed farthest down in the future
Theorem: The FF algorithm minimizes #fetches
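The FF schedule can be simulated directly; a sketch (the access string and cache size below match the running example):

```python
def farthest_first(accesses, cache_size):
    """Simulate the Farthest-First eviction schedule; return #fetches.
    On a miss with a full cache, evict the cached item whose next
    access lies farthest in the future."""
    cache, fetches = set(), 0
    for t, x in enumerate(accesses):
        if x in cache:
            continue                       # cache hit: nothing to do
        fetches += 1                       # cache miss: fetch x
        if len(cache) >= cache_size:
            def next_use(y):
                # position of y's next access after time t
                for s in range(t + 1, len(accesses)):
                    if accesses[s] == y:
                        return s
                return float("inf")        # never accessed again
            cache.discard(max(cache, key=next_use))
        cache.add(x)
    return fetches
```

On the example sequence a b c b c b a a with a cache of size 2, FF makes 4 fetches (for a, b, c, and the later a), matching the eviction table above.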
Caching: Reduced Schedule

M a b c b c b a a Memory Access Sequence

a a c c c b b b
S1 Cache Contents
b b b b b c a a
E - - a - - - c - Evicted items

An eviction schedule is reduced if it fetches an item x only


when it is accessed
Fact: For any S, there is a reduced schedule S* which makes
at most as many fetches as S
Caching: Reduced Schedule
An eviction schedule is reduced if it fetches an item x only when it is accessed
Fact: For any S, there is a reduced schedule S* with at most as many fetches as S

M   a  b  c  b  c  b  a  a

S1  a  a  a  b  b  b  b  b     Non-reduced
    b  b  c  c  c  a  a  a

S2  a  a  a  b  b  b  b  b     Reduced
    b  b  c  c  c  c  a  a

To convert S to S*: Be lazy! Delay each fetch until the item is actually accessed.
Caching: FF Schedules
Theorem: Suppose a reduced schedule Sj makes the same decisions as SFF
from t=1 to t=j. Then, there exists a reduced schedule Sj+1 s.t:
1. Sj+1 makes the same decisions as SFF from t=1 to t=j+1
2. #fetches(Sj+1) <= #fetches(Sj)

Since Sj agrees with SFF up to t=j, all schedules have the same cache
(say {a, b}) at t=j.

Case 1: No cache miss at t=j+1. Set Sj+1 = Sj.

Case 2: Cache miss at t=j+1, and Sj and SFF evict the same item. Set Sj+1 = Sj.

Case 3: Cache miss at t=j+1 (say item c is fetched); Sj evicts a while SFF
evicts b. Sj+1 also evicts b, so Sj+1 holds {a, c} while Sj holds {c, b}.
Let t=q be the first later time at which the two caches must be reconciled:

Case 3a: At t=q, Sj evicts b to bring in some item d. Make Sj+1 evict a and
bring in d. The caches now agree, with no extra fetch for Sj+1.

Case 3b: At t=q there is a request to a, and Sj evicts b to fetch it. Sj+1
already holds a and does nothing, saving a fetch.

Case 3c: At t=q there is a request to a, and Sj evicts some other item d to
fetch it. Sj+1 evicts d and brings in b, so the caches agree. This makes Sj+1
non-reduced (b is fetched without being accessed), so convert Sj+1 to its
reduced version.

Case 3d: At t=q there is a request to b. This cannot happen: SFF evicted b
precisely because a is accessed before b.
Summary: Optimal Caching
Theorem: Suppose a reduced schedule Sj makes the same decisions as SFF
from t=1 to t=j. Then, there exists a reduced schedule Sj+1 s.t:
1. Sj+1 makes same decision as SFF from t=1 to t=j+1
2. #fetches(Sj+1) <= #fetches(Sj)

Case 1: No cache miss at t=j+1. Sj+1 = Sj


Case 2: Cache miss at t=j+1, Sj and SFF evict same item. Sj+1 = Sj
Case 3a: Cache miss at t=j+1. Sj evicts a, SFF evicts b. Sj+1 also evicts b.
Next there is a request to d, and Sj evicts b. Make Sj+1 evict a, bring in d.
Case 3b: Cache miss at t=j+1. Sj evicts a, SFF evicts b. Sj+1 also evicts b
Next there is a request to a, and Sj evicts b. Sj+1 does nothing.
Case 3c: Cache miss at t=j+1. Sj evicts a, SFF evicts b. Sj+1 also evicts b
Next there is a request to a, and Sj evicts d. Sj+1 evicts d and brings in b.
Now convert Sj+1 to the reduced version of this schedule.
Case 3d: Cache miss at t=j+1. Sj evicts a, SFF evicts b. Sj+1 also evicts b
Next there is a request to b. Cannot happen as a is accessed before b!
Summary: Optimal Caching

Theorem: Suppose a reduced schedule Sj makes the same decisions as SFF


from t=1 to t=j. Then, there exists a reduced schedule Sj+1 s.t:
1. Sj+1 makes same decision as SFF from t=1 to t=j+1
2. #fetches(Sj+1) <= #fetches(Sj)

Suppose you claim a magic schedule SM makes fewer fetches than SFF.
Then, we can construct a sequence of schedules:
SM = S0, S1, S2, ..., Sn = SFF such that:
(1) Sj agrees with SFF from t=1 to t=j
(2) #fetches(Sj+1) <= #fetches(Sj)

What does this say about #fetches(SFF) relative to #fetches(SM)?
It says #fetches(SFF) <= #fetches(SM): no schedule makes fewer fetches than FF.


Greedy Algorithms

• Direct argument - MST


• Exchange argument - Caching
• Greedy approximation algorithms
Greedy Approximation Algorithms

• k-Center
• Set Cover
Approximation Algorithms

• Optimization problems, e.g., MST, Shortest Paths

• For an instance I, let:
  • A(I) = value of solution by algorithm A
  • OPT(I) = value of optimal solution
  • Approximation ratio(A) = max over instances I of A(I)/OPT(I)

• A is an approximation algorithm if approx-ratio(A) is bounded
k-Center Problem

Given n towns on a map,
find how to place k shopping malls such that:
the drive to the nearest mall from any town is as short as possible
k-Center Problem

Given n points in a metric space,
find k centers such that the distance between any point and its
closest center is as small as possible

Metric Space: a point set with a distance function d
Properties of d:
• d(x, y) >= 0
• d(x, y) = d(y, x)
• d(x, y) <= d(x, z) + d(z, y)

NP-hard in general
Greedy Algorithm: Furthest-First Traversal

1. Pick C = {x}, for an arbitrary point x
2. Repeat until C has k centers:
   Let y maximize d(y, C), where d(y, C) = min over x in C of d(x, y)
   C = C U {y}
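A sketch of furthest-first traversal in Python (the point set and metric in the example are illustrative):

```python
def ff_traversal(points, k, dist):
    """Greedy furthest-first traversal for k-center.
    Returns k centers; a 2-approximation to the optimal k-center cost."""
    C = [points[0]]                       # arbitrary first center
    while len(C) < k:
        # pick y maximizing d(y, C) = min distance to the chosen centers
        y = max(points, key=lambda p: min(dist(p, c) for c in C))
        C.append(y)
    return C
```

For points 0, 1, 2, 10, 11, 12 on a line with k = 2, starting from 0 the traversal picks 12 next, giving centers {0, 12} with solution value r = 2.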
Furthest-first Traversal

Is furthest-first traversal always optimal?

Theorem: Approx. ratio of furthest-first traversal is 2


Furthest-first (FF) Traversal
Theorem: Approx. ratio of FF-traversal is 2

Define, for any instance: r = max_x d(x, C), where C is the set of k
centers picked by FF-traversal and, for a set S, d(x, S) = min_{y in S} d(x, y).

Property 1. Solution value of FF-traversal = r

Property 2. There is a set S of at least k+1 points s.t. each pair in S
has distance >= r (take the k centers in C together with the point x
achieving r: each of these was at distance >= r from all earlier centers).

Property 3. Any k-center solution must assign at least two points x, y
in S to the same center c (k+1 points, only k centers).

What is max(d(x, c), d(y, c))? From the triangle inequality,
    d(x, c) + d(y, c) >= d(x, y)
so
    max(d(x, c), d(y, c)) >= d(x, y)/2 >= r/2

Property 4. Any other solution has value >= r/2, so the FF-traversal
value r is within a factor 2 of optimal.
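The factor-2 bound can be checked end-to-end on a tiny instance. This is a sketch of mine, not from the slides; the brute-force OPT here is restricted to centers chosen among the input points:

```python
from itertools import combinations

def kcenter_value(points, centers, dist):
    # Objective: largest distance from any point to its nearest center.
    return max(min(dist(p, c) for c in centers) for p in points)

def furthest_first(points, k, dist):
    # FF-traversal: arbitrary first center, then repeatedly the furthest point.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers

def brute_force_opt(points, k, dist):
    # Optimal value over all k-subsets of the points used as centers.
    return min(kcenter_value(points, c, dist) for c in combinations(points, k))

dist = lambda a, b: abs(a - b)
pts = [0, 1, 2, 8, 9, 10]
ff_val = kcenter_value(pts, furthest_first(pts, 2, dist), dist)
opt_val = brute_force_opt(pts, 2, dist)
assert ff_val <= 2 * opt_val   # here ff_val == 2, opt_val == 1: ratio exactly 2
```

On this instance FF picks centers 0 and 10 (value 2), while the optimum uses 1 and 9 (value 1), so the theorem's bound is tight.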
Applications:

• Facility-location problems
• Clustering
• Initialization step in clustering problems, e.g., k-means++
Greedy Approximation Algorithms

• k-Center
• Set Cover
Set Cover Problem
Given:
• Universe U with n elements
• Collection C of sets of elements of U
Find the smallest subset C* of C that covers all of U

NP Hard in general
Applications

• Sensor placing problems


• Facility location problems
A Greedy Set-Cover Algorithm

C* = { }
Repeat until all of U is covered:
Pick the set S in C with highest # of uncovered elements
Add S to C*
Greedy: #sets=7
OPT: #sets=5
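The greedy rule above, as a short Python sketch (naming is mine; it assumes the sets in `sets` can together cover all of `universe`, otherwise the loop would not terminate):

```python
def greedy_set_cover(universe, sets):
    """Greedy set cover: repeatedly pick the set with the highest
    number of still-uncovered elements."""
    uncovered = set(universe)
    cover = []                                     # C* = { }
    while uncovered:                               # until all of U is covered
        best = max(sets, key=lambda s: len(uncovered & s))
        cover.append(best)                         # add S to C*
        uncovered -= best
    return cover
```

For example, `greedy_set_cover({1, 2, 3, 4, 5}, [{1, 2, 3}, {4, 5}, {1, 4}])` picks `{1, 2, 3}` (3 uncovered elements) and then `{4, 5}`.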
Greedy Set-Cover Algorithm
Theorem: If optimal set cover has k sets, then greedy selects <= k ln n sets

Greedy Algorithm:
C* = { }
Repeat until U is covered:
Pick S in C with highest # of uncovered elements
Add S to C*

Define:
n(t) = # uncovered elements after step t of greedy

Property 1: Some set S in C covers at least n(t)/k of the uncovered
elements (the k sets of the optimal cover together cover all n(t)
uncovered elements, so one of them covers at least n(t)/k)

Property 2: n(t+1) <= n(t)(1 - 1/k)

Property 3: n(T) <= n(1 - 1/k)^T < 1, when T = k ln n
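Property 3 can be sanity-checked numerically (a snippet of mine, not from the slides): with T = ceil(k ln n) greedy steps, the bound n(1 - 1/k)^T indeed drops below 1, meaning no elements remain uncovered:

```python
import math

def bound_after_T(n, k):
    # n(1 - 1/k)^T for T = ceil(k ln n) greedy steps
    T = math.ceil(k * math.log(n))
    return n * (1 - 1 / k) ** T

assert bound_after_T(1000, 10) < 1   # (0.9)^70 * 1000, roughly 0.63
assert bound_after_T(100, 3) < 1     # (2/3)^14 * 100, roughly 0.34
```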
Greedy Algorithms

• Direct argument - MST


• Exchange argument - Caching
• Greedy approximation algorithms
• k-center, set-cover
