
Divide And Conquer

Merge Sort, Quick Sort


Sorting
• Arrange an unordered list of elements in some
order.
• Some common algorithms
• Bubble Sort
• Insertion Sort
• Merge Sort
• Quick Sort
Sorting
• Important primitive
• For today, we’ll pretend all elements are distinct.

6 4 3 8 1 5 2 7

1 2 3 4 5 6 7 8

Length of the list is n


Insertion Sort (Recap)

Insertion Sort example: 6 4 3 8 5

Insertion-Sort(A, n)
  for i = 1 to n – 1
    key = A[i]
    j = i – 1
    while j >= 0 and A[j] > key
      A[j + 1] = A[j]
      j = j – 1
    A[j + 1] = key

Start by moving A[1] toward the beginning of the list until you find something smaller (or can’t go any further):

6 4 3 8 5
4 6 3 8 5
Then move A[2]: key = 3

4 6 3 8 5
4 6 6 8 5
4 4 6 8 5
3 4 6 8 5
Then move A[3]: key = 8
(8 is already in place, so nothing shifts)

3 4 6 8 5
3 4 6 8 5
Then move A[4]: key = 5

3 4 6 8 5
3 4 6 8 8
3 4 6 6 8
3 4 5 6 8
Putting the whole example together:

6 4 3 8 5  →  4 6 3 8 5  →  3 4 6 8 5  →  3 4 6 8 5  →  3 4 5 6 8

Then we are done!
Why does this work?
• Say you have a sorted list, 3 4 6 8, and another element 5.
• Insert 5 right after the largest thing that’s still smaller than 5 (aka, right after 4).
• Then you get a sorted list: 3 4 5 6 8
This sounds like a job for…

Proof By
Induction!
Outline of a proof by induction
Let A be a list of length n
• Base case:
• A[:1] is sorted at the end of the 0’th iteration. ✓
• Inductive Hypothesis:
• A[:i+1] is sorted at the end of the ith iteration (of the outer loop).
• Inductive step:
• For any 0 < k < n, if the inductive hypothesis holds for i=k-1, then it holds
for i=k.
• Aka, if A[:k] is sorted at step k-1, then A[:k+1] is sorted at step k
(previous slide)
• Conclusion:
• The inductive hypothesis holds for i = 0, 1, …, n-1.
• In particular, it holds for i=n-1.
• At the end of the n-1’st iteration (aka, at the end of the algorithm), A[:n] =
A is sorted.
• That’s what we wanted! ✓
Worst-case Analysis
• In this class we will use worst-case analysis:
• We assume that a “bad guy” produces a worst-case
input for our algorithm, and we measure performance
on that worst-case input.

• How many operations are performed by the insertion sort algorithm on the worst-case input?
How fast is InsertionSort?
• Let’s count the number of operations!
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current

By my count*…
• 2n² − n − 1 variable assignments
• 2n² − n − 1 increments/decrements
• 2n² − 4n + 1 comparisons
• …

*An exact count of the operations will turn out not to matter for the discussion that follows.
In this class we will use…
• Big-Oh notation!
• Gives us a meaningful way to talk about the
running time of an algorithm, independent of
programming language, computing platform, etc.,
without having to count all the operations.
Main idea:
Focus on how the runtime scales with n (the input size).
(Heuristically: only pay attention to the largest function of n that appears.)

Some examples…

Number of operations            Asymptotic running time
(1/10)·n² + 100                 O(n²)
0.063·n² − 0.5·n + 12.7         O(n²)
100·n^1.5 − 10^10000·n          O(n^1.5)
11·n·log(n) + 1                 O(n log n)   ← we say this algorithm is “asymptotically faster” than the others
Informal definition for O(…)
• Let T(n), g(n) be functions of positive integers.
  • Think of T(n) as a runtime: positive and increasing in n.
• We say “T(n) is O(g(n))” if:
  for all large enough n, T(n) is at most some constant multiple of g(n).
• Here, “constant” means “some number that doesn’t depend on n.”
Formal definition of O(…)
• Let T(n), g(n) be functions of positive integers.
  • Think of T(n) as a runtime: positive and increasing in n.
• Formally, T(n) = O(g(n)) if and only if

  ∃c > 0, n₀ s.t. ∀n ≥ n₀, T(n) ≤ c · g(n)

  (∃ = “there exists”, ∀ = “for all”)
Ω(…) means a lower bound
• We say “T(n) is Ω(g(n))” if, for large enough n, T(n) is at least as big as a constant multiple of g(n).
• Formally, T(n) = Ω(g(n)) if and only if

  ∃c > 0, n₀ s.t. ∀n ≥ n₀, c · g(n) ≤ T(n)

  (Note: the inequality is switched relative to O(…)!)
Θ(…) means both!
• We say “T(n) is Θ(g(n))” iff both:
  T(n) = O(g(n))  and  T(n) = Ω(g(n))
Insertion Sort: running time

def InsertionSort(A):
    for i in range(1,len(A)):              # n-1 iterations of the outer loop
        current = A[i]
        j = i-1
        while j >= 0 and A[j] > current:   # in the worst case, about n iterations of this inner loop
            A[j+1] = A[j]
            j -= 1
        A[j+1] = current

The running time of insertion sort is O(n²).


InsertionSort is an algorithm that
correctly sorts an arbitrary n-element
array in time 𝑂 𝑛2 .

Can we do better?
Can we do better?
• MergeSort: a divide-and-conquer approach

Big problem
→ Smaller problem + Smaller problem   (Recurse! Recurse!)
→ Yet smaller problem + Yet smaller problem + Yet smaller problem + Yet smaller problem
Divide-And-Conquer
• Divide
  • Divide the problem into one or more smaller instances of the same problem.
• Conquer
  • Solve the smaller problems recursively.
• Combine
  • Merge/combine the solutions to solve the original problem.
Why insertion sort works (Recap)
• Say you have a sorted list, 3 4 6 8, and another element 5.
• Insert 5 right after the largest thing that’s still smaller than 5 (aka, right after 4).
• Then you get a sorted list: 3 4 5 6 8
• What if you have two sorted lists?  3 4 6 8  and  2 5 9 11
MergeSort

6 4 3 8 1 5 2 7
→ split:            6 4 3 8  |  1 5 2 7
→ recursive magic:  3 4 6 8  |  1 2 5 7
→ MERGE!            1 2 3 4 5 6 7 8
MergeSort Pseudocode

MERGESORT(A):
  n = length(A)
  if n ≤ 1:
    return A                     # if A has length 1, it is already sorted!
  L = MERGESORT(A[0 : n/2])      # sort the left half
  R = MERGESORT(A[n/2 : n])      # sort the right half
  return MERGE(L, R)             # merge the two halves
What actually happens?
First, recursively break up the array all the way down to the
base cases

6 4 3 8 1 5 2 7

6 4 3 8 1 5 2 7

6 4 3 8 1 5 2 7

6 4 3 8 1 5 2 7
This array of
length 1 is
sorted!
Then, merge them all back up!
Sorted sequence!

1 2 3 4 5 6 7 8
Merge!

3 4 6 8 1 2 5 7
Merge! Merge!

4 6 3 8 1 5 2 7
Merge! Merge! Merge! Merge!

6 4 3 8 1 5 2 7
A bunch of sorted lists of length 1 (in the order of the original sequence).
Does it work?
• Yet another job for proof by induction!!!
• Try it yourself.
Assume that n is a power of 2
for convenience.
It’s fast
CLAIM:
MergeSort runs in time 𝑂 𝑛 log 𝑛

• Proof coming soon.


• But first, how does this compare to InsertionSort?
• Recall InsertionSort ran in time O 𝑛2 .
• log 𝑛 grows much more slowly than 𝑛
• 𝑛 log 𝑛 grows much more slowly than 𝑛2
Assume that n is a power of 2
for convenience.
Now let’s prove the claim
CLAIM:
MergeSort runs in time 𝑂 𝑛 log 𝑛
Let’s prove the claim

Size n                          Level 0
n/2  n/2                        Level 1
n/4  n/4  n/4  n/4              Level 2
…
2^t subproblems of size n/2^t   Level t
…
(Size 1)                        Level log(n)

Focus on just one of these sub-problems.
How much work in this sub-problem?

The work at a sub-problem of size n/2^t is the time spent MERGE-ing its two sub-problems of size n/2^(t+1), plus the time spent within those two sub-problems.

Let k = n/2^t. Then the sub-problem has size k, and its work is the time spent MERGE-ing two sub-problems of size k/2, plus the time spent within them.
How long does it take to MERGE two lists of size k/2?

3 4 6 8   and   1 2 5 7
→ MERGE!  1 2 3 4 5 6 7 8

Answer: it takes time O(k), since we just walk across the two lists once.
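To make “walk across the lists once” concrete, here is a minimal Python sketch of a MERGE routine (the slide doesn’t spell it out, so the name and shape are illustrative):

def merge(L, R):
    # Merge two sorted lists into one sorted list in O(len(L) + len(R)).
    out = []
    i = j = 0
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:          # take the smaller front element
            out.append(L[i]); i += 1
        else:
            out.append(R[j]); j += 1
    out.extend(L[i:])             # one of these is empty;
    out.extend(R[j:])             # the other holds the leftovers
    return out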
Recursion tree

At each node of size k in the tree, O(k) operations are done (the MERGE at that node). How many operations are done in total at the top level of the tree (just MERGE-ing the two n/2-sized subproblems)? How about at the next level? At level t? Work this out yourself!
Recursion tree

Level    # problems   Size of each problem   Amount of work at this level
0        1            n                      O(n)
1        2            n/2                    O(n)
2        4            n/4                    O(n)
…        …            …                      …
t        2^t          n/2^t                  O(n)
…        …            …                      …
log(n)   n            1                      O(n)
Total runtime…

• O(n) steps per level, at every level

• log(n) + 1 levels

• O( n log(n) ) total!

That was the claim!


What have we learned?
• MergeSort correctly sorts a list of n integers in time
O(n log(n) ).
• That’s (asymptotically) better than InsertionSort!
Can we do better?
• Any deterministic compare-based sorting algorithm
must make Ω(n log n) compares in the worst-case.
• How to prove this?

• Is there any other way to sort an array efficiently?


QuickSort
QuickSort: another divide-and-conquer approach

• Divide
  • Partition the array A[1:n] into two (possibly empty) subarrays A[1:q−1] (the low side) and A[q+1:n] (the high side)
  • Each element in the low side of the partition is ≤ A[q]; each element in the high side of the partition is ≥ A[q].
  • Compute the index q of the pivot as part of this partitioning procedure.
• Conquer
  • Recursively sort the subarrays A[1:q−1] and A[q+1:n]
• Combine
  • Already sorted
Pseudocode of QuickSort

QUICKSORT(A, p, r)
  if p < r
    q = PARTITION(A, p, r)
    QUICKSORT(A, p, q – 1)
    QUICKSORT(A, q + 1, r)

PARTITION

PARTITION(A, p, r)
  x = A[r]
  i = p – 1
  for j = p to r – 1
    if A[j] <= x
      i = i + 1
      exchange A[i] with A[j]
  exchange A[i + 1] with A[r]
  return i + 1
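A direct Python translation of the two procedures above, as a runnable sketch (0-indexed; call quicksort(A, 0, len(A) - 1)):

def partition(A, p, r):
    x = A[r]                         # the pivot
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]  # set the pivot in place
    return i + 1

def quicksort(A, p, r):
    if p < r:
        q = partition(A, p, r)
        quicksort(A, p, q - 1)
        quicksort(A, q + 1, r)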
Example of PARTITION (pivot x = 4; 0-indexed, so i starts at −1)

i = −1       7 6 3 5 1 2 4    j = 0, 1: A[j] > 4, nothing moves; j = 2: A[j] = 3 ≤ 4, swap it forward
i = −1 → 0   3 6 7 5 1 2 4    j = 3: nothing; j = 4: A[j] = 1 ≤ 4
i = 0 → 1    3 1 7 5 6 2 4    j = 5: A[j] = 2 ≤ 4
i = 1 → 2    3 1 2 5 6 7 4    the loop is done

Finally, exchange A[i + 1] with A[r] to set the pivot in place:

             3 1 2 4 6 7 5    Low Side | pivot | High Side
Example of recursive calls

7 6 3 5 1 2 4    Pick 4 as a pivot
Partition on either side of 4:
3 1 2 4 6 7 5
Recurse on [3 1 2] and pick 2 as a pivot; recurse on [6 7 5] and pick 5, then 7, as pivots:
1 2 3 4 5 6 7
QuickSort Runtime Analysis
T(n) = the worst-case running time on a problem of size n

• Worst-case partitioning (one side of every split is empty):
  T(n) = T(n − 1) + Θ(n)
• Best-case partitioning (the two sides are balanced):
  T(n) = 2T(n/2) + Θ(n)
Recurrences
• An equation that describes a function in terms of
its value on other, typically smaller, arguments.

• Recursive Case
• Involves the recursive invocation of the function on different
(usually smaller) inputs
• Base Case
• Does not involve a recursive invocation
Algorithmic Recurrences
• A recurrence T(n) is algorithmic if, for every sufficiently large threshold constant n₀ > 0, the following two properties hold (as in CLRS 4th ed.):
  1. For all n < n₀, T(n) = Θ(1).
  2. For all n ≥ n₀, every path of recursion terminates in a defined base case within a finite number of recursive invocations.
Solving Recurrences
• Substitution Method
• Guess a solution
• Use mathematical induction to prove the guess
Substitution Method

T(n) = 2T(n/2) + Θ(n)

• Guess
  • T(n) = O(n lg n)
• We need to prove,
  • T(n) ≤ c·n·lg n for all n ≥ n₀
  • For a specific choice of c > 0 and n₀ > 0
Substitution Method
• Inductive Hypothesis
  • T(n′) ≤ c·n′·lg n′ for all n₀ ≤ n′ < n, with 2n₀ ≤ n

T(n) = 2T(n/2) + Θ(n)
T(n) ≤ 2c·(n/2)·lg(n/2) + Θ(n)
T(n) ≤ c·n·lg(n/2) + Θ(n)
T(n) ≤ c·n·lg n − c·n·lg 2 + Θ(n)
T(n) ≤ c·n·lg n − c·n + Θ(n)
T(n) ≤ c·n·lg n      (if n₀ is sufficiently large and c·n dominates the Θ(n) term)

Substitution Method
• Base Case
  • n₀ ≤ n < 2n₀
• Assuming
  • n₀ = 2
  • c = max(T(2), T(3))
• We get T(n) ≤ c·n·lg n
Solving Recurrences
• Substitution Method
• Guess a solution
• Use mathematical induction to prove the guess

• Making a good guess may be difficult


• Need to be careful about common pitfalls.
Recursion Tree
• Consider the recurrence
  T(n) = 3T(n/4) + Θ(n²)

• This gives an unbalanced recursion tree. To evaluate it:
  • Estimate the height of the tree
  • Estimate the cost from each level
  • Estimate the number of leaf nodes of the tree
  • Estimate the cost from all the leaf nodes over all the levels
• Example (worked below)
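A worked version of this example (the slide’s tree figure is summarized here; this is the standard CLRS calculation): write the driving cost as c·n². Level i of the tree has 3^i nodes, each costing c·(n/4^i)², so the total cost of level i is (3/16)^i · c·n². The tree has height log₄ n, and the number of leaves is 3^{log₄ n} = n^{log₄ 3}, each costing Θ(1). Summing over the levels,

  T(n) = Σ_{i=0}^{log₄ n − 1} (3/16)^i · c·n² + Θ(n^{log₄ 3})
       ≤ c·n² · Σ_{i=0}^{∞} (3/16)^i + o(n²)
       = (16/13)·c·n² + o(n²) = O(n²)

Since the root alone costs Ω(n²), we conclude T(n) = Θ(n²).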
Master Theorem
• Let’s consider the following recurrence relation,

  T(n) = a·T(n/b) + f(n),   with a > 0, b > 1, and driving function f(n)
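The statement of the theorem (as in CLRS 4th ed., which the examples below apply) compares f(n) against n^{log_b a}:

• Case 1: if f(n) = O(n^{log_b a − ε}) for some constant ε > 0, then T(n) = Θ(n^{log_b a}).
• Case 2: if f(n) = Θ(n^{log_b a} · lg^k n) for some constant k ≥ 0, then T(n) = Θ(n^{log_b a} · lg^{k+1} n).
• Case 3: if f(n) = Ω(n^{log_b a + ε}) for some constant ε > 0, and f satisfies the regularity condition a·f(n/b) ≤ c·f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).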


Master Theorem

T(n) = 9T(n/3) + n
• a = 9, b = 3
• f(n) = n = O(n^c) with c = 1
• log_b a = log₃ 9 = 2 > c, so case 1 applies:

  T(n) = Θ(n²)
Master Theorem

T(n) = T(2n/3) + 1
• a = 1, b = 3/2
• f(n) = 1 = O(n⁰) = O(n^c) with c = 0
• log_b a = log_{3/2} 1 = 0 = c, so case 2 applies:

  T(n) = Θ(n⁰ · lg n) = Θ(lg n)
Master Theorem

T(n) = 3T(n/4) + n lg n
• a = 3, b = 4
• f(n) = n lg n ≥ n = Ω(n¹) = Ω(n^c) with c = 1
• log_b a = log₄ 3 < c
• Can we apply case 3?
  • Need to satisfy the regularity condition.
Master Theorem

T(n) = 3T(n/4) + n lg n
• Regularity condition
  • a·f(n/b) = 3·(n/4)·lg(n/4) ≤ (3/4)·n·lg n = c·f(n), with c = 3/4 < 1  ✓

  T(n) = Θ(n lg n)
Master Theorem

T(n) = 2T(n/2) + n lg n
• Here f(n) = n lg n = Θ(n^{log₂ 2} · lg n). Applying master theorem (case 2, with k = 1),

  T(n) = Θ(n lg² n)
MergeSort Runtime Analysis (Revisit)

T(n) = 2T(n/2) + Θ(n)
• Applying master theorem (case 2),

  T(n) = Θ(n lg n)
QuickSort Runtime Analysis (Revisit)
• Best-case partitioning: T(n) = 2T(n/2) + Θ(n)
  • Applying master theorem, T(n) = Θ(n log n)
• Worst-case partitioning: T(n) = T(n − 1) + Θ(n)
  • Expanding the recurrence (recursion tree): T(n) = Θ(n²)

Multiplying Square Matrices
• A and B are n × n square matrices
• Find C = A · B
• The standard algorithm has running time Θ(n³)

Can we do better?
Multiplying Square Matrices
• Divide (assuming n = 2^x)
  • Partition A, B, and C into four n/2 × n/2 submatrices each; with index calculations this costs Θ(1).
• Conquer
  • Each quadrant of C is the sum of two products of n/2 × n/2 submatrices, giving 8 recursive multiplications: 8T(n/2).
Multiplying Square Matrices
• Running Time

  T(n) = 8T(n/2) + Θ(1)

• Applying master theorem (case 1),

  T(n) = Θ(n³)
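A minimal Python sketch of this divide-and-conquer scheme (illustrative only; it copies submatrices, so its divide step costs Θ(n²) rather than the slide’s Θ(1), but T(n) = 8T(n/2) + Θ(n²) is still Θ(n³) by case 1):

def matmul_recursive(A, B):
    # A, B: n x n lists of lists, n a power of 2 (the slide's assumption)
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    def quad(M, r, c):                   # copy the h x h block at (r, c)
        return [row[c:c + h] for row in M[r:r + h]]
    def add(X, Y):
        return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    A11, A12, A21, A22 = quad(A,0,0), quad(A,0,h), quad(A,h,0), quad(A,h,h)
    B11, B12, B21, B22 = quad(B,0,0), quad(B,0,h), quad(B,h,0), quad(B,h,h)
    # The 8 recursive multiplications:
    C11 = add(matmul_recursive(A11, B11), matmul_recursive(A12, B21))
    C12 = add(matmul_recursive(A11, B12), matmul_recursive(A12, B22))
    C21 = add(matmul_recursive(A21, B11), matmul_recursive(A22, B21))
    C22 = add(matmul_recursive(A21, B12), matmul_recursive(A22, B22))
    return ([r1 + r2 for r1, r2 in zip(C11, C12)] +
            [r1 + r2 for r1, r2 in zip(C21, C22)])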

Can we do better?
Strassen’s Algorithm
• Intuitions,
  • Addition is faster than multiplication: Θ(n²) vs Θ(n³)
  • Replace multiplications with additions
  • Remember x² − y² = (x + y)(x − y)
Strassen’s Algorithm
• Divide (same as before)
• Conquer
  • Perform 10 additions, S₁, S₂, …, S₁₀
  • Perform 7 multiplications, P₁, P₂, …, P₇
  • Compute each Cᵢⱼ from the S and P matrices
Strassen’s Algorithm Runtime Analysis
• Running Time

  T(n) = 7T(n/2) + Θ(n²)

• Applying master theorem (case 1),

  T(n) = Θ(n^{log₂ 7}) = Θ(n^{2.8074…})
Reference
• Introduction to Algorithms, CLRS, 4th edition.
• Chapter 2 (Getting Started)
• Section 2.1, 2.3
• Chapter 4 (Divide-and-Conquer)
• Sections 4.1 – 4.5
• Chapter 7 (Quick Sort)
• Sections 7.1 and 7.2

• Additional Resource:
• mastertheorem.pdf
Dynamic Programming

Fibonacci Numbers
fib(n) = fib(n − 1) + fib(n − 2)
• Write a program to find the nth Fibonacci number.

def Fibonacci(n):
    if n == 0: return 0
    if n == 1: return 1
    return Fibonacci(n-1) + Fibonacci(n-2)

• Runtime analysis

  T(n) = T(n − 1) + T(n − 2) + O(1)
  T(n) = Ω(2^{n/2})
What’s going on? That’s a lot of repeated computation!

Consider Fib(8): its recursion tree calls Fib(7) and Fib(6); those call Fib(6), Fib(5), Fib(5), Fib(4); and so on, down to the Fib(1)/Fib(0) base cases. Fib(6) is computed 2 times, Fib(5) 3 times, Fib(4) 5 times, …
Fibonacci Numbers
• How to avoid repeated computations?
  • Store already computed results

fib = {}   # global table of already-computed results

def Fibonacci(n):
    if n == 0: return 0
    if n == 1: return 1
    if n not in fib:                # "if fib[n] not computed"
        fib[n] = Fibonacci(n-1) + Fibonacci(n-2)
    return fib[n]
Fibonacci Numbers
• How to avoid repeated computations?
  • Start from smaller problems and proceed towards bigger ones.

def Fibonacci(n):
    fib = [0] * (n + 1)   # local table
    if n >= 1:
        fib[1] = 1
    for i in range(2, n + 1):
        fib[i] = fib[i-1] + fib[i-2]
    return fib[n]
What did we do?

Dynamic Programming!!!
Dynamic Programming
• It is an algorithm design paradigm
• like divide-and-conquer is an algorithm design paradigm.
• Usually, it is for solving optimization problems
• E.g., shortest path, minimum/maximum profit, longest sequences
• (Fibonacci numbers aren’t an optimization problem, but they are a good
example of DP anyway…)
Elements of dynamic programming
1. Optimal Sub-structure Property
• Big problems break up into sub-problems.
• The solution to a problem can be expressed in terms of solutions to
smaller sub-problems.
• Fibonacci:
𝑓𝑖𝑏 𝑛 = 𝑓𝑖𝑏 𝑛 − 1 + 𝑓𝑖𝑏(𝑛 − 2)
Elements of dynamic programming
2. Overlapping Sub-Problem Property
• The sub-problems overlap.
• Fibonacci:
• Both fib[i+1] and fib[i+2] directly use fib[i].
• And lots of different fib[i+x] indirectly use fib[i].
• This means that we can save time by solving a sub-problem just
once and storing the answer.
Elements of dynamic programming
1. Optimal substructure.
• Optimal solutions to sub-problems can be used to find the optimal
solution of the original problem.
2. Overlapping subproblems.
• The subproblems show up again and again

• Using these properties, we can design a dynamic programming


algorithm:
• Keep a table of solutions to the smaller problems.
• Use the solutions in the table to solve bigger problems.
• At the end we can use information we collected along the way to
find the solution to the whole thing.
Two ways to think about and/or implement DP algorithms

• Top-down Approach with Memoization

fib = {}   # global memo table

def Fibonacci(n):
    if n == 0: return 0
    if n == 1: return 1
    if n not in fib:
        fib[n] = Fibonacci(n-1) + Fibonacci(n-2)
    return fib[n]

• Bottom-up Approach

def Fibonacci(n):
    fib = [0] * (n + 1)
    fib[1] = 1
    for i in range(2, n + 1):
        fib[i] = fib[i-1] + fib[i-2]
    return fib[n]
Top-down Approach
• Think of it like a recursive algorithm.
• To solve the big problem:
• Recurse to solve smaller problems
• Those recurse to solve smaller problems
• etc..

• The difference from divide and conquer:


• Keep track of what small problems you’ve already solved to prevent
re-solving the same problem twice.
• Aka, “memo-ization”
Top-down Approach

With memoization, the recursion tree for Fib(8) collapses: only the leftmost chain of calls Fib(8) → Fib(7) → … → Fib(1)/Fib(0) is actually computed, and every repeated subtree is pruned because its answer is already stored in the table.
Bottom-up Approach
• Solve the small problems first
  • fill in fib[0], fib[1]
• Then bigger problems
  • fill in fib[2]
  • …
  • fill in fib[n-1]
• Then finally solve the real problem.
  • fill in fib[n]
Rod-cutting Problem
• Given
  • A rod of length n
  • A table of prices pᵢ for i = 1, 2, …, n
• Determine the maximum revenue rₙ obtainable by cutting up the rod and selling all the pieces.
• For example (the price table these slides appear to follow, from CLRS):

  length i:  1  2  3  4  5   6   7   8   9   10
  price pᵢ:  1  5  8  9  10  17  17  20  24  30
• An optimal solution involving k pieces,
• Each piece has length 𝑖1 , 𝑖2 , 𝑖3, … , 𝑖𝑘
• 𝑛 = 𝑖1 + 𝑖2 + 𝑖3 + ⋯ + 𝑖𝑘
• The optimal revenue, 𝑟𝑛 = 𝑝𝑖1 + 𝑝𝑖2 + ⋯ + 𝑝𝑖𝑘
Rod-cutting Problem
• Once an initial cut is made,
• The two resulting smaller pieces will be cut independently
• Smaller instance of the rod-cutting problem
• Optimal Sub-structure Property

• Different pieces can be cut into same length pieces (on not)
• Overlapping Sub-structure Property
Rod-cutting Problem
• Assuming an initial cut is made,
  • rₙ = max(pₙ, r₁ + r_{n−1}, r₂ + r_{n−2}, …, r_{n−1} + r₁)
• Further assuming the initial cut is always the leftmost cut,
  • A first piece followed by some decomposition of the remainder
  • The first piece is not further divided; only the remainder is decomposed
  • rₙ = max(pᵢ + r_{n−i} : 1 ≤ i ≤ n)
Rod-cutting Problem (Naive Recursion)

Runtime O(2^{n−1})
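A minimal Python sketch of that naive recursion, assuming p is 1-indexed via p[0] = 0 (the slide’s own code is not reproduced in the text):

def cut_rod(p, n):
    if n == 0:
        return 0
    # Try every size i for the leftmost piece.
    return max(p[i] + cut_rod(p, n - i) for i in range(1, n + 1))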
Rod-cutting Problem (Top-down Approach)
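A sketch of the memoized (top-down) version, along the lines of CLRS’s MEMOIZED-CUT-ROD:

def memoized_cut_rod(p, n, r=None):
    if r is None:
        r = [None] * (n + 1)    # r[j] caches the best revenue for length j
    if r[n] is not None:
        return r[n]
    if n == 0:
        q = 0
    else:
        q = max(p[i] + memoized_cut_rod(p, n - i, r) for i in range(1, n + 1))
    r[n] = q
    return q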
Rod-cutting Problem (Bottom-up Approach)

Runtime O(n²)
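A sketch of the bottom-up version, which fills the table in order of increasing length and gives the O(n²) runtime stated above:

def bottom_up_cut_rod(p, n):
    r = [0] * (n + 1)
    for j in range(1, n + 1):     # subproblem sizes, small to large
        r[j] = max(p[i] + r[j - i] for i in range(1, j + 1))
    return r[n]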
Rod-cutting Problem (Bottom-up Approach)
• Reconstruct the choices that led to the optimal solution
  • Store the choices that lead to the optimal solution
  • Store the optimal size i of the first piece to cut off when solving a subproblem of size j (sketched below)
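A sketch of that extension: s[j] records the optimal first-piece size for length j, and the solution is read off by repeatedly cutting s[n]:

def extended_bottom_up_cut_rod(p, n):
    r = [0] * (n + 1)
    s = [0] * (n + 1)              # s[j] = optimal size of the first piece
    for j in range(1, n + 1):
        for i in range(1, j + 1):
            if p[i] + r[j - i] > r[j]:
                r[j] = p[i] + r[j - i]
                s[j] = i
    return r, s

def print_cut_rod_solution(p, n):
    r, s = extended_bottom_up_cut_rod(p, n)
    while n > 0:
        print(s[n])                # length of the next piece
        n -= s[n]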
Matrix Chain Multiplication
• Given a chain of n matrices, A₁, A₂, …, Aₙ
  • Compute the product with the standard multiplication algorithm
  • Goal: Minimize the number of scalar multiplications
• Example,
  • A₁, A₂, A₃ with dimensions 10 × 100, 100 × 5, 5 × 50
  • ((A₁A₂)A₃) performs a total of 7500 scalar multiplications
  • (A₁(A₂A₃)) performs a total of 75000 scalar multiplications
Matrix Chain Multiplication
• Parenthesizing resolves ambiguity in multiplication order
• A fully parenthesized chain of matrices is
  • Either a single matrix
  • Or the product of two fully parenthesized matrix products, surrounded by parentheses
• Example,
  • ⟨A₁, A₂, A₃, A₄⟩ can be parenthesized in 5 distinct ways:
    (A₁(A₂(A₃A₄))), (A₁((A₂A₃)A₄)), ((A₁A₂)(A₃A₄)), ((A₁(A₂A₃))A₄), (((A₁A₂)A₃)A₄)
Matrix Chain Multiplication
• Given a chain of n matrices, 𝐴1, 𝐴2 , … , 𝐴𝑛
• Matrix 𝐴𝑖 has dimensions 𝑝𝑖−1 ∗ 𝑝𝑖
• Goal: Fully parenthesize the product to minimize the number of scalar
multiplications

• Note: determine the order of multiplication not the product itself.


Matrix Chain Multiplication
• A fully parenthesized product
  • Splits the product into fully parenthesized subproducts
• Let the first split occur between the kth and (k+1)st matrices
• How many possible parenthesizations in total? Ω(2ⁿ)
Matrix Chain Multiplication
• Let, 𝑚[𝑖, 𝑗] = The minimum number of scalar multiplications needed
to compute the matrix 𝐴𝑖:𝑗

• We need to find 𝑚[1, 𝑛]


Matrix Chain Multiplication
• The optimal parenthesization
  • Splits the product A_{i:j} between A_k and A_{k+1} for some value of i ≤ k < j

  m[i, j] = m[i, k] + m[k + 1, j] + p_{i−1} · p_k · p_j

• We need to consider all such splits, i.e., all values of k
• How many A_{i:j} subproblems are there? Θ(n²)
Matrix Chain Multiplication
• Taking the minimum over all splits gives the recurrence:

  m[i, j] = 0                                                            if i = j
  m[i, j] = min{ m[i, k] + m[k + 1, j] + p_{i−1}·p_k·p_j : i ≤ k < j }   if i < j
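A bottom-up Python sketch of this recurrence (illustrative, following CLRS’s MATRIX-CHAIN-ORDER; p is the dimension list, so Aᵢ is p[i−1] × p[i]):

def matrix_chain_order(p):
    n = len(p) - 1                                # number of matrices
    m = [[0] * (n + 1) for _ in range(n + 1)]     # 1-indexed table
    for length in range(2, n + 1):                # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = min(m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                          for k in range(i, j))
    return m[1][n]

For the earlier example, matrix_chain_order([10, 100, 5, 50]) returns 7500.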
𝐴1 𝐴2 𝐴3 𝐴4 𝐴5 𝐴6
30 * 35 35 * 15 15 * 5 5 *10 10 * 20 20 * 25
j\i 1 2 3 4 5 6
1 0

2 15750 0 m[1, 3] = min(


• m[1,1] + m[2, 3] + 30*35*5 = 7875,
3 7875 2625 0 • m[1,2] + m[3, 3] + 30*15*5 = 18000,
)
4 4375 750 0

5 2500 1000 0

6 3500 5000 0
Matrix Chain Multiplication

A₁        A₂        A₃       A₄       A₅        A₆
30 × 35   35 × 15   15 × 5   5 × 10   10 × 20   20 × 25

j\i       1      2      3      4      5      6
1         0
2     15750      0
3      7875   2625      0
4      9375   4375    750      0
5     11875   7125   2500   1000      0
6     15125  10500   5375   3500   5000      0

m[2, 5] = min(
  m[2, 2] + m[3, 5] + 35·15·20 = 13000,
  m[2, 3] + m[4, 5] + 35·5·20 = 7125,
  m[2, 4] + m[5, 5] + 35·10·20 = 11375
) = 7125


Elements of dynamic programming (Revisit)
1. Optimal substructure.
• Optimal solutions to sub-problems can be used to find the
optimal solution of the original problem.

2. Overlapping subproblems.
• The subproblems show up again and again
Optimal Substructure Property
• Solutions to sub-problems are included in the optimal solution
• Rod-cutting
  • Solutions to smaller pieces are also part of the solution to the entire rod
• Matrix Chain Multiplication
  • The solutions to A_{i:k} and A_{k+1:j} are exactly included in the solution to A_{i:j}
• How to prove this?
  • Cut-and-paste
Longest Common Subsequence
• A strand of DNA consists of a string of molecules called bases
• Adenine, Cytosine, Guanine, and Thymine
• ACGT

• S1 = ACCGGTCGAGTGCGCGGAAGCCGGCCGAA
• S2 = GTCGTTCGGAATGCCGTTGCTCTGTAAA
Longest Common Subsequence
• A strand of DNA consists of a string of molecules called bases
  • Adenine, Cytosine, Guanine, and Thymine: ACGT
• Given a sequence X = ⟨x₁, x₂, …, xₙ⟩
• Another sequence Z = ⟨z₁, z₂, …, z_k⟩ is a subsequence of X
  • If there exists a strictly increasing sequence ⟨i₁, i₂, …, i_k⟩ of indices of X such that for all j = 1, 2, …, k, x_{i_j} = z_j
Longest Common Subsequence
• A strand of DNA consists of a string of molecules called bases
• Adenine, Cytosine, Guanine, and Thymine
• ACGT

• Given two sequences 𝑋 and 𝑌


• A sequence Z is a common sub-sequence if Z is a subsequence of both X
and Y

• Goal: find a maximum-length common subsequence of X and Y.


Longest Common Subsequence
• Need to consider all subsequences

• How many subsequences of X? 2ⁿ
Longest Common Subsequence
• Given a sequence 𝑋 = < 𝑥1 , 𝑥2 , … , 𝑥𝑛 >
• The 𝑖𝑡ℎ prefix of 𝑋 is 𝑋𝑖 = < 𝑥1 , 𝑥2 , … , 𝑥𝑖 >
Longest Common Subsequence
• Let c[i, j] = the length of an LCS of the prefixes Xᵢ and Yⱼ
• The recurrence:

  c[i, j] = 0                             if i = 0 or j = 0
  c[i, j] = c[i−1, j−1] + 1               if i, j > 0 and xᵢ = yⱼ
  c[i, j] = max(c[i, j−1], c[i−1, j])     if i, j > 0 and xᵢ ≠ yⱼ
Longest Common Subsequence   (X = ABCBDAB, Y = BDCABA)

  j    0   1   2   3   4   5   6
 i         B   D   C   A   B   A
 0     0   0   0   0   0   0   0
 1  A  0   0   0   0   1
 2  B  0
 3  C  0
 4  B  0
 5  D  0
 6  A  0
 7  B  0
Longest Common Subsequence

  j    0   1   2   3   4   5   6
 i         B   D   C   A   B   A
 0     0   0   0   0   0   0   0
 1  A  0   0   0   0   1   1   1
 2  B  0   1   1   1   1   2   2
 3  C  0   1   1   2   2   2   2
 4  B  0   1   1   2   2   3   3
 5  D  0   1   2   2   2   3   3
 6  A  0   1   2   2   3   3   4
 7  B  0   1   2   2   3   4   4
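A bottom-up Python sketch of the recurrence; for the table above, lcs_length("ABCBDAB", "BDCABA") returns 4:

def lcs_length(X, Y):
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:                 # x_i = y_j
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c[m][n]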
Sequence Alignment
• How similar are two strings?
• ocurrance vs occurrence
Sequence Alignment
• Edit distance
• Gap penalty and mismatch penalty
• Cost is sum of all penalties
Sequence Alignment
• Given two sequences
• 𝑥1 𝑥2 … 𝑥𝑛 and 𝑦1𝑦2 … 𝑦𝑛
• An alignment is a set of ordered pairs (𝑥𝑖 , 𝑦𝑗 ) such that each
character appears in at most one pair and there are no crossings.
• Crossing: (𝑥𝑖 , 𝑦𝑗 ) and (𝑥𝑖′ , 𝑦𝑗′ ) cross if 𝑖 < 𝑖′ but 𝑗 > 𝑗′
Sequence Alignment
• Given two sequences
  • x₁x₂…xₙ and y₁y₂…yₙ
• The cost of an alignment M is,

  cost(M) = Σ_{(xᵢ, yⱼ) ∈ M} cost_mismatch(xᵢ, yⱼ) + Σ_{xᵢ unmatched} cost_gap + Σ_{yⱼ unmatched} cost_gap
Sequence Alignment
• Given two sequences
• 𝑥1 𝑥2 … 𝑥𝑛 and 𝑦1𝑦2 … 𝑦𝑛

• Goal: Find minimum cost alignment of the two sequences


Sequence Alignment
• Given two sequences
• 𝑥1 𝑥2 … 𝑥𝑛 and 𝑦1𝑦2 … 𝑦𝑛

• 𝑂𝑃𝑇 𝑖, 𝑗 = minimum cost of aligning 𝑥1 𝑥2 … 𝑥𝑖 and 𝑦1 𝑦2 … 𝑦𝑗


Sequence Alignment
• 𝑂𝑃𝑇 𝑖, 𝑗 = minimum cost of aligning 𝑥1 𝑥2 … 𝑥𝑖 and 𝑦1 𝑦2 … 𝑦𝑗
• Case 1: 𝑂𝑃𝑇 𝑖, 𝑗 matches (𝑥𝑖 , 𝑦𝑗 )
• Pay mismatch for (𝑥𝑖 , 𝑦𝑗 ) + min cost of aligning 𝑥1 𝑥2 … 𝑥𝑖−1 and 𝑦1 𝑦2 … 𝑦𝑗−1
• 𝑐𝑜𝑠𝑡𝑚𝑖𝑠𝑚𝑎𝑡𝑐ℎ + 𝑂𝑃𝑇 𝑖 − 1, 𝑗 − 1
Sequence Alignment
• 𝑂𝑃𝑇 𝑖, 𝑗 = minimum cost of aligning 𝑥1 𝑥2 … 𝑥𝑖 and 𝑦1 𝑦2 … 𝑦𝑗
• Case 2: 𝑂𝑃𝑇 𝑖, 𝑗 leaves 𝑥𝑖 unmatched
• Pay gap for 𝑥𝑖 + min cost of aligning 𝑥1 𝑥2 … 𝑥𝑖−1 and 𝑦1 𝑦2 … 𝑦𝑗
• 𝑐𝑜𝑠𝑡𝑔𝑎𝑝 + 𝑂𝑃𝑇 𝑖 − 1, 𝑗
Sequence Alignment
• OPT(i, j) = minimum cost of aligning x₁x₂…xᵢ and y₁y₂…yⱼ
• Case 3: OPT(i, j) leaves yⱼ unmatched
  • Pay gap for yⱼ + min cost of aligning x₁x₂…xᵢ and y₁y₂…y_{j−1}
  • cost_gap + OPT(i, j − 1)
Sequence Alignment
• OPT(i, j) = minimum cost of aligning x₁x₂…xᵢ and y₁y₂…yⱼ
• Case 1: OPT(i, j) matches (xᵢ, yⱼ)
  • cost_mismatch + OPT(i − 1, j − 1)
• Case 2: OPT(i, j) leaves xᵢ unmatched
  • cost_gap + OPT(i − 1, j)
• Case 3: OPT(i, j) leaves yⱼ unmatched
  • cost_gap + OPT(i, j − 1)
Sequence Alignment
• OPT(i, j) = minimum cost of aligning x₁x₂…xᵢ and y₁y₂…yⱼ
• Putting the three cases together (with all-gap base cases):

  OPT(i, 0) = i · cost_gap
  OPT(0, j) = j · cost_gap
  OPT(i, j) = min( cost_mismatch(xᵢ, yⱼ) + OPT(i − 1, j − 1),
                   cost_gap + OPT(i − 1, j),
                   cost_gap + OPT(i, j − 1) )
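A bottom-up Python sketch of this recurrence (the penalties gap and mismatch are parameters here; a real scorer could make the mismatch cost depend on the character pair):

def alignment_cost(x, y, gap, mismatch):
    m, n = len(x), len(y)
    OPT = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        OPT[i][0] = i * gap                         # x-prefix vs nothing
    for j in range(1, n + 1):
        OPT[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            pair = 0 if x[i - 1] == y[j - 1] else mismatch
            OPT[i][j] = min(pair + OPT[i - 1][j - 1],   # match/mismatch x_i, y_j
                            gap + OPT[i - 1][j],        # x_i unmatched
                            gap + OPT[i][j - 1])        # y_j unmatched
    return OPT[m][n]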
0-1 Knapsack Problem
• Given,
• n items where item 𝑖 has value
𝑣𝑖 > 0 and weighs 𝑤𝑖 > 0
• Value of a subset of items = sum
of values of individual items.
• Knapsack has weight limit of 𝑊
• Goal. Pack knapsack so as to
maximize total value of items
taken.
• Example, {1, 2, 5} yields 35$
while {3, 4} yields 40$.
0-1 Knapsack Problem
• 𝑂𝑃𝑇(𝑖, 𝑤) = optimal value of knapsack problem with items 1, … , 𝑖,
subject to weight limit 𝑤
• Goal: Find 𝑂𝑃𝑇(𝑛, 𝑊)
0-1 Knapsack Problem
• 𝑂𝑃𝑇(𝑖, 𝑤) = optimal value of knapsack problem with items 1, … , 𝑖,
subject to weight limit 𝑤

• Case 1: Don’t pick item 𝑖


• 𝑂𝑃𝑇(𝑖, 𝑤) selects best of {1, 2, … , 𝑖 – 1} subject to weight limit 𝑤

• Case 2: Pick item 𝑖


• 𝑂𝑃𝑇(𝑖, 𝑤) selects best of {1, 2, … , 𝑖 – 1} subject to weight limit 𝑤 − 𝑤𝑖
• Collect value 𝑣𝑖
0-1 Knapsack Problem
• OPT(i, w) = optimal value of knapsack problem with items 1, …, i, subject to weight limit w
• Case 1: Don’t pick item i
  • OPT(i, w) = OPT(i − 1, w)
• Case 2: Pick item i
  • OPT(i, w) = vᵢ + OPT(i − 1, w − wᵢ)
0-1 Knapsack Problem
• OPT(i, w) = optimal value of knapsack problem with items 1, …, i, subject to weight limit w
• Combining the cases:

  OPT(i, w) = 0                                               if i = 0
  OPT(i, w) = OPT(i − 1, w)                                   if wᵢ > w
  OPT(i, w) = max( OPT(i − 1, w), vᵢ + OPT(i − 1, w − wᵢ) )   otherwise
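A bottom-up Python sketch (v and w are 0-indexed lists of values and weights, W the weight limit); e.g., with the instance used in the greedy section below, knapsack([60, 100, 120], [10, 20, 30], 50) returns 220:

def knapsack(v, w, W):
    n = len(v)
    OPT = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for cap in range(W + 1):
            OPT[i][cap] = OPT[i - 1][cap]            # Case 1: don't pick item i
            if w[i - 1] <= cap:                      # Case 2: pick item i
                OPT[i][cap] = max(OPT[i][cap],
                                  v[i - 1] + OPT[i - 1][cap - w[i - 1]])
    return OPT[n][W]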
Reference
• Dynamic Programming
• CLRS 4th Ed. Chapter 14 (14.1 – 14.4)
• KT Sections 6.4 (The Knapsack Problem), 6.6
Greedy Algorithms
Divide and Conquer (Recap)

Big problem

sub-problem sub-problem

sub-sub- sub-sub- sub-sub- sub-sub- sub-sub-


problem problem problem problem problem
Dynamic Programming (Recap)

Big problem

sub-problem sub-problem sub-problem

sub-sub- sub-sub- sub-sub- sub-sub-


problem problem problem problem
Greedy Algorithms
• Make greedy choices one-at-a-time.
• Never look back.
• Hope (prove) that the greedy choice leads to optimal solution.
Activity Selection Problem
• Activities are competing for an exclusive access to a common
resource, i.e., conference room
• A set 𝑆 = {𝑎1 , 𝑎2, … , 𝑎𝑛 } of n activities requires the same resource
• The resource can serve only one activity at a time
• Each activity 𝑎𝑖 has a start time 𝑠𝑖 and a finish time 𝑓𝑖

• 𝑎𝑖 and 𝑎𝑗 are compatible if 𝑠𝑖 ≥ 𝑓𝑗 or 𝑠𝑗 ≥ 𝑓𝑖


• Starts after the other finishes
Activity Selection Problem
• Activities are competing for an exclusive access to a common
resource, i.e., conference room
• A set 𝑆 = {𝑎1 , 𝑎2, … , 𝑎𝑛 } of n activities requires the same resource
• The resource can serve only one activity at a time
• Each activity 𝑎𝑖 has a start time 𝑠𝑖 and a finish time 𝑓𝑖

• Goal:
• Schedule maximum-size subset of mutually compatible activities
Activity Selection Problem
• Assumption: The activities are sorted by their finish times
𝑓1 ≤ 𝑓2 ≤ 𝑓3 ≤ … ≤ 𝑓𝑛

• If not, sort with O(𝑛 𝑙𝑔𝑛)


Activity Selection Problem
• Does this have optimal substructure property?

• Let 𝑆𝑖𝑗 be the set of activities that


• Start after activity 𝑎𝑖 finishes and that finish before activity 𝑎𝑗 starts.

• If 𝑎𝑘 ∈ 𝑆𝑖𝑗 is included in the optimal solution,


• Solve activity selection problem on 𝑆𝑖𝑘 and 𝑆𝑘𝑗
Activity Selection Problem
• Does this have optimal substructure property?
• If A_ij is a maximum-size compatible subset of activities in S_ij, with a_k ∈ A_ij,

  A_ij = A_ik ∪ {a_k} ∪ A_kj
  |A_ij| = |A_ik| + 1 + |A_kj|

• Let c[i, j] denote the size of an optimal solution for the set S_ij
Activity Selection Problem
• Does this have optimal substructure property?
• Yes!!!

• Can also use cut-and-paste argument for proof

• DP??
Activity Selection Problem
• Let’s make a greedy choice

• Choose the activity that leaves the resource available for as many
other activities as possible

• Any optimal solution has an activity that finishes first

• Here, choose the activity in S with the earliest finish time,


• Leave the resource available for maximum number of other activities
Activity Selection Problem
• Let’s make a greedy choice
• Choose the activity with earliest finish time

• Once a greedy choice is made,


• There is only one remaining subproblem to solve
• Solve for the activities that starts after the first-choice finishes
Activity Selection Problem
• Let 𝑆𝑘 = {𝑎𝑖 ∈ 𝑆; 𝑠𝑖 ≥ 𝑓𝑘 }

• Once we have picked 𝑎𝑘 , all we need to solve is 𝑆𝑘


• Optimal Substructure Property
Activity Selection Problem
• The greedy choice
• Choose the activity with earliest finish time
• Is it optimal?
Activity Selection Problem
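The worked example on these slides is a figure; a minimal Python sketch of the greedy rule (activities as (start, finish) pairs, assumed pre-sorted by finish time, per the earlier assumption):

def select_activities(activities):
    chosen = []
    last_finish = float('-inf')
    for s, f in activities:
        if s >= last_finish:       # compatible with everything chosen so far
            chosen.append((s, f))
            last_finish = f        # greedy choice: earliest finish time
    return chosen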
Elements of Greedy Strategy
• Optimal Substructure property

• Greedy Choice Property


• Assemble a globally optimal solution by making locally optimal
(greedy) choices.

• Make the choice that looks best in the current problem, without
considering results from subproblems.
Elements of Greedy Strategy

Big problem

sub-problem

sub-sub-
problem
Knapsack Problems
• 0-1 Knapsack Problem
  • Choice at each step
  • The choice usually depends on the solutions to subproblems.
• Fractional Knapsack Problem
  • Pick the item with maximum vᵢ/wᵢ ratio
  • Repeat as long as
    • The supply is not exhausted
    • The thief can still carry more.
Knapsack Problems
• W = 50
• w = [10, 20, 30], v = [60, 100, 120]

• 0-1 Knapsack Problem


• Pick item 2 and 3

• Fractional Knapsack Problem


• Pick item 1, 2 and portion of 3
Single Source Shortest Path
• Given
  • A weighted and directed graph, G = (V, E)   (What about unweighted graphs?)
  • A source, s
• Goal: Find out the shortest path weight from the source to a given node / the other nodes.
Single Source Shortest Path
• Subpaths of shortest paths are shortest paths
• Proof:
  • Decompose the shortest path into smaller sub-paths: p = v₁ ∼ vᵢ ∼ vⱼ ∼ v_k
  • The subpaths are p_{1i}, p_{ij}, p_{jk}
  • Assume there exists p′_{ij} such that w(p′_{ij}) < w(p_{ij})
  • Then replacing p_{ij} with p′_{ij} gives a shorter path, contradicting that p was shortest
• Optimal Substructure Property
Single Source Shortest Path
• Negative-weight Edges
• Graph may contain negative weight cycles reachable from source s
• Shortest path problem is not well-defined

• Cycle
• Shortest path cannot contain cycle
• Just dropping the cycle gives a lower cost path
Dijkstra’s Algorithm
• Given
• A weighted and directed graph, 𝐺 = (𝑉, 𝐸)
• A source, 𝑠
• Non-negative weights on edges, 𝑤 𝑢, 𝑣 ≥ 0

• Goal: Find out the shortest path weight from the source to a given
node / other nodes.
Dijkstra’s Algorithm
• The weight of a path p = ⟨v₁, v₂, …, v_k⟩ is the sum of its edge weights,

  w(p) = Σ_{i=1}^{k−1} w(vᵢ, v_{i+1})

• The shortest path weight is defined as,

  δ(s, v) = min{ w(p) : p is a path from s to v },  or ∞ if no such path exists
Dijkstra’s Algorithm
• Relaxation
• 𝑥. 𝑑 best known estimate of the shortest distance from s to x
v.d = min(v.d , u.d + w(u,v))
Dijkstra’s Algorithm
• Relaxation
• 𝑥. 𝑑 best known estimate of the shortest distance from s to x
v.d = min(v.d , u.d + w(u,v))

• Also keep track of the predecessor


• The node immediately before v on the shortest path from s to v
Dijkstra’s Algorithm
• Initialization
  • v.d = ∞ and v.π = NIL for every node v; then s.d = 0
Dijkstra’s Algorithm
• Remember BFS?
• At each step, pick the next node from discovered nodes in the queue

• Dijkstra’s Algorithm The greedy choice!!!!


• Similar to BFS
• Except that at each step, pick the next node with the minimum estimated
shortest-path weight
• Replace the FIFO queue of BFS with a minimum priority-queue
Dijkstra’s Algorithm
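The pseudocode and worked example on these slides are figures; a minimal Python sketch with a binary-heap priority queue (adj maps every node to a list of (neighbor, weight) pairs with non-negative weights):

import heapq

def dijkstra(adj, s):
    dist = {u: float('inf') for u in adj}
    dist[s] = 0
    visited = set()                       # the set S of finished nodes
    pq = [(0, s)]                         # min-priority queue on d-estimates
    while pq:
        d, u = heapq.heappop(pq)
        if u in visited:
            continue                      # stale entry; u already finished
        visited.add(u)
        for v, w in adj[u]:               # relax every edge leaving u
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist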
Dijkstra’s Algorithm
• Correctness of Dijkstra’s Algorithm
  • S is the set of visited nodes
  • The algorithm terminates when S = V
• Inductive Hypothesis:
  • At the start of each iteration, v.d = δ(s, v) for all v ∈ S

Dijkstra’s Algorithm
• Correctness of Dijkstra’s Algorithm (inductive step)
  • At some iteration, u is extracted from the priority queue; suppose u.d > δ(s, u)
  • Let P be a shortest path to u, y the first node on P not in S, and x the predecessor of y on P
  • y was relaxed when x was finished, so y.d ≤ δ(s, x) + w(x, y) = δ(s, y) ≤ w(P) = δ(s, u) ≤ u.d
  • But u was extracted before y, so u.d ≤ y.d; hence u.d = δ(s, u)
Minimum Spanning Tree
• A spanning tree is a tree that connects all of the vertices.
• The cost of a spanning tree is the sum of the weights on the edges.

[Figure: an undirected weighted graph on vertices A–I; this graph is the running example for the rest of the section.]
Minimum Spanning Tree
• A spanning tree is a tree that connects all of the vertices.
• The cost of a spanning tree is the sum of the weights on the edges.
• A tree is a connected graph with no cycles!

[Figure: one spanning tree of the example graph. It has cost 67.]
Minimum Spanning Tree
• A minimum spanning tree is a tree with minimum cost that connects all of the vertices.

[Figure: the minimum spanning tree of the example graph. It has cost 37.]
Some Definitions
• An (S, V − S) partition is a cut of V of an undirected graph G = (V, E)
• An edge (u, v) crosses a cut if u ∈ S and v ∈ V − S
• An edge is a light edge crossing a cut if its weight is the minimum of any edge crossing the cut.
Minimum Spanning Tree
• The strategy:
• Make a series of greedy choices, adding edges to the tree.
• Show that each edge we add is safe to add:
• we do not rule out the possibility of success
• Keep going until we have an MST.
Minimum Spanning Tree
• Greedy strategy 1 (Prim’s Algorithm):
• Start from an empty tree (a cut)
• At each step, grow the tree (a cut) with a node that can be connected with
minimum cost (i.e., grow the tree by adding the node on the other end of
the light edge)
• Terminate when all nodes are included in the tree
Prim’s Algorithm

[Figure sequence, slides 44–48: starting from the root node, the tree repeatedly grows by the light edge leaving it, until it spans the example graph.]
Prim’s Algorithm
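A minimal Python sketch of Prim’s algorithm using a binary heap (illustrative, not the slide’s exact pseudocode; adj maps each vertex of an undirected graph to (neighbor, weight) pairs):

import heapq

def prim(adj, root):
    in_tree = {root}
    mst = []                                   # edges of the spanning tree
    pq = [(w, root, v) for v, w in adj[root]]  # candidate edges leaving the tree
    heapq.heapify(pq)
    while pq:
        w, u, v = heapq.heappop(pq)
        if v in in_tree:
            continue                           # edge no longer leaves the tree
        in_tree.add(v)                         # greedy choice: the light edge
        mst.append((u, v, w))
        for x, wx in adj[v]:
            if x not in in_tree:
                heapq.heappush(pq, (wx, v, x))
    return mst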
Prim’s Algorithm
• Proof of correctness?
• Later
Minimum Spanning Tree
• Greedy Strategy 2 (Kruskal’s Algorithm):
• Start with each node as a separate tree
• Consider the edges in ascending order of their weights
• Include the minimum weight edge between two disjoint trees to connect
them into a single tree
• Discard the edge if it creates a cycle
• Terminate when all the nodes are included
Kruskal’s Algorithm

[Figure sequence, slides 52–62: the edges of the example graph are considered in ascending order of weight; each edge joining two different trees is added, and an edge whose endpoints are already connected is discarded because it would cause a cycle.]

Kruskal’s Algorithm
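Likewise, a minimal Python sketch of Kruskal’s algorithm; the disjoint-tree test uses a bare-bones Union-Find, which the next section develops properly (edges are (w, u, v) triples on vertices 0..n−1):

def kruskal(n, edges):
    parent = list(range(n))                # Union-Find over the vertices

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):          # ascending order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                       # different trees: safe to connect
            parent[ru] = rv
            mst.append((u, v, w))
        # else: the edge would create a cycle, so discard it
    return mst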
Proof of Correctness?
• Later
Cut Property
• Assume that all edge costs are distinct.
• Let S be any subset of nodes that is neither empty nor equal to all of V, and let edge e = (v, w) be the minimum cost edge with one end in S and the other in V − S.
• Then every minimum spanning tree contains the edge e.
Correctness
• Kruskal’s Algorithm
• Apply cut property

• Prim’s Algorithm
• Apply cut property
Reference
• Greedy Algorithms
• CLRS 4th ed. Sections 15.1, 15.2

• Dijkstra’s Algorithm
• CLRS 4th ed. Sections 22 (intro), 22.3

• Minimum Spanning Tree


• CLRS 4th ed. Sections 21.1, 21.2
• KT Section 4.5 (Correctness)
Disjoint Set Operations

Kruskal’s Algorithm

[Figure: the example weighted graph from the MST section, partway through Kruskal’s algorithm; tracking which vertices already belong to the same tree motivates the operations below.]
Union-Find
• Make-Set(x)
  • Creates a new set whose only member is x
  • O(1)
• MakeUnionFind(S)
  • Calls Make-Set(v) for each v ∈ S
Union-Find
• Find(x)
  • Returns a pointer to the representative/name of the set containing x
  • E.g., on the example graph: Find(C) = Red / a pointer to C or I
• Union(x, y)
  • Unites two disjoint, dynamic sets that contain x and y, say S_x and S_y
Union-Find
• Maintain an array Component
  • Contains the name of the set currently containing each element.
• The name of a set can be
  • A pointer to a representative node
  • The name of the representative node
  • The index of the representative node

With the example graph’s vertices renamed 1–9:

  element:    1 2 3 4 5 6 7 8 9
  Component:  1 1 3 4 5 6 6 6 3
Union-Find
• Initially set Component[s] = s for all s.
• Union(x, y) merges two disjoint sets together
  • Update the values of Component[s] for all elements in sets A and/or B
• Successive Union operations update the array:

  element:    1 2 3 4 5 6 7 8 9
  Component:  1 2 3 4 5 6 7 8 9    (initially)
  Component:  1 1 3 4 5 6 7 8 9    (after Union(1, 2))
  Component:  1 1 3 4 5 6 6 6 3    (after further unions)
Union-Find
• Merging two larger sets, e.g. the set named 6 into the set named 3:

  element:    1 2 3 4 5 6 7 8 9
  Component:  1 1 3 4 5 3 3 3 3

  • Scan all the components to rename one of the sets
  • Can take O(n)
Union-Find
• Find(x)
  • Return Component[x]
  • Takes O(1)
Union-Find
• Optimizations to improve Union(x, y)
  • Maintain the list of elements in each component
  • Only update the elements in the smaller set; keep the name of the larger set
• Still O(n) for a single Union in the worst case
Union-Find
• Any sequence of k Union operations takes at most O(k log k) time
  • It touches at most 2k elements of S
  • A node v’s set grows with each Union operation that touches it:
    • Either Component[v] remains unchanged, or it is updated
    • If updated, the size of v’s set at least doubles
  • So there can be at most log(2k) updates to Component[v]
  • Over 2k nodes, there can be at most O(k log k) updates.
A Better Union-Find
• Each node v will point to the representative node of its set.
• MakeUnionFind(S) initializes a record for each element v with a pointer that points to itself
  • To indicate that v is in its own set.

  element:  1 2 3 4 5 6 7 8 9
  Parent:   1 2 3 4 5 6 7 8 9
A Better Union-Find
• Consider a Union(x, y)
  • Set either x or y to be the name of the combined set (preferably from the larger set)
  • Assume we select y as the name.
  • Simply update x’s pointer to point to y.
  • We do not update the pointers at the other nodes in x’s set.
A Better Union-Find
• Each Union now updates a single pointer. On the example graph, a sequence of unions evolves the Parent array as:

  element:  1 2 3 4 5 6 7 8 9
  Parent:   1 2 3 4 5 6 8 8 9    (7’s pointer updated to 8)
  Parent:   1 2 3 4 5 6 8 8 3    (9’s pointer updated to 3)
  Parent:   1 2 3 4 5 7 8 8 3    (6’s pointer updated to 7)
  Parent:   1 1 3 4 5 7 8 8 3    (2’s pointer updated to 1)
  Parent:   1 1 6 4 5 7 8 8 3    (3’s pointer updated to 6)
A Better Union-Find
• Union(x, y)
  • Takes O(1)
• Find(x)
  • Cannot simply return Parent[x]
  • Traverse through the pointers to the top
  • No longer O(1)
A Better Union-Find
• The Find operation takes O(log n) time
  • Every time the name of the set containing node v changes, the size of this set at least doubles.
  • There can be at most n nodes in a set
  • So there can be at most log n name changes
  • Hence Find has O(log n) complexity
A Better Union-Find

def make_union_find(n):
    parent = list(range(n))
    size = [1] * n                 # used to decide which set keeps its name
    return parent, size

def find(parent, x):
    if parent[x] == x:
        return x
    return find(parent, parent[x])

def union(parent, size, x, y):
    # Assuming x and y are the representatives of two disjoint sets.
    if size[x] >= size[y]:         # x's set is larger: y points to x
        parent[y] = x
        size[x] += size[y]
    else:
        parent[x] = y
        size[y] += size[x]
A Better Union-Find with Path Compression

def find(parent, x):
    if parent[x] != x:
        # Point x directly at the root, so later Finds are fast.
        parent[x] = find(parent, parent[x])
    return parent[x]

# make_union_find and union are unchanged from above.
Reference
• Union-Find
• KT Section 4.6
Sorting In Linear Time

Sorting
• Sort n integers in O(n log n)
  • Merge Sort, Heap Sort in the worst case
  • Quick Sort on average
• These algorithms compare elements to determine the sorted order
  • Called comparison sorts
Lower Bounds for Sorting
• Comparison sorts gain order information only from comparisons
  • They don’t inspect the values themselves
  • Given aᵢ and aⱼ, test aᵢ < aⱼ, aᵢ ≤ aⱼ, aᵢ = aⱼ, aᵢ ≥ aⱼ, or aᵢ > aⱼ
• For the lower bound analysis,
  • All elements are distinct
Lower Bounds for Sorting
• The decision tree model
  • A full binary tree: each node is either a leaf or has both children
  • Each internal node represents a comparison between two elements, with one subtree per outcome (e.g., “a ≥ b?” branches to Yes / No)
Lower Bounds for Sorting
• The decision tree model
  • A leaf indicates a permutation of the n elements
  • Internal nodes are annotated as i : j
  • An internal node i : j indicates a comparison aᵢ ≥ aⱼ
Lower Bounds for Sorting
• Any correct sorting algorithm
must be able to produce each
permutation of its input

• Each of the 𝑛! permutations on


n elements must appear as at
least one of the leaves
Lower Bounds for Sorting
• Worst-case comparisons
• Longest simple path length from
the root to any of its reachable
leaves
• Equals the height of the decision
tree

• Can we estimate a bound on the height of such decision trees?
Lower Bounds for Sorting
• Any comparison sort algorithm requires Ω(n log n) comparisons in the worst case.
• Proof:
  • A binary tree of height h has no more than 2^h leaves, so

    n! ≤ 2^h
    h ≥ log(n!) = Θ(n log n)

• Merge sort and Heapsort are asymptotically optimal comparison sorts
Counting Sort
• Assumes that inputs are integers in the range 0 to k
• For an input x,
  • Determine the number of elements less than or equal to x
  • Place element x directly into its position in the output array
• Runs in Θ(n + k)
• How to handle duplicates?
Counting Sort
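A minimal Python sketch of counting sort (CLRS-style, inputs in 0..k), which the example below traces:

def counting_sort(A, k):
    C = [0] * (k + 1)
    for x in A:
        C[x] += 1                  # C[x] = number of elements equal to x
    for i in range(1, k + 1):
        C[i] += C[i - 1]           # C[x] = number of elements <= x
    B = [0] * len(A)
    for x in reversed(A):          # walking backwards makes the sort stable
        C[x] -= 1
        B[C[x]] = x                # place x directly into its output position
    return B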
Counting Sort
A
2 5 3 0 2 3 0 3

C
2 0 2 3 0 1
Counting Sort
A
2 5 3 0 2 3 0 3

C (old)
2 0 2 3 0 1

C (current state)
2 2 4 7 7 8
Counting Sort
A
2 5 3 0 2 3 0 3

C (current state)
2 2 4 7 7 8

B
Counting Sort
A
2 5 3 0 2 3 0 3

C (current state)
2 2 4 7 7 8

B
3
Counting Sort
A
2 5 3 0 2 3 0 3

C (current state)
2 2 4 6 7 8

B
3
Counting Sort
A
2 5 3 0 2 3 0 3

C (current state)
2 2 4 6 7 8

B
0 3
Counting Sort
A
2 5 3 0 2 3 0 3

C (current state)
1 2 4 6 7 8

B
0 3
Counting Sort
A
2 5 3 0 2 3 0 3

C (current state)
1 2 4 6 7 8

B
0 3 3
Counting Sort
A
2 5 3 0 2 3 0 3

C (current state)
0 2 2 4 7 7

B
0 0 2 2 3 3 3 5
Counting Sort
• Counting sort is stable
• Elements with the same value appear in the output array in the same
order as they do in the input array.
Radix Sort
• We can sort the elements
• By most significant (leftmost) digit
• Then sort the resulting bins recursively
• Finally combine the bins in order.
• Generates many intermediate bins to track
Radix Sort
• Let’s sort by the least significant digit first instead
• For example, for 3-digit numbers: stably sort the whole list by the 1s digit, then by the 10s digit, then by the 100s digit.

Radix sort sorts n d-digit numbers in O(d(n + k))
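A minimal Python sketch for non-negative decimal integers, using stable per-digit bucketing (any stable digit sort, such as counting sort, works here):

def radix_sort(A, d):
    for exp in range(d):                               # least significant digit first
        buckets = [[] for _ in range(10)]
        for x in A:
            buckets[(x // 10 ** exp) % 10].append(x)   # stable within each bucket
        A = [x for b in buckets for x in b]
    return A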


Bucket Sort
• Assumes that the input is uniformly distributed in a range
• Divides the range into n equal-sized subintervals, or buckets
• Distributes the inputs into the buckets
• The inputs are uniformly distributed
• Expected number of input per bucket is low
• Sort the numbers in each bucket
• Go through the buckets in order, listing the elements in each
Bucket Sort
• Runs in O(n) expected time (under the uniform-distribution assumption)
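A minimal Python sketch, assuming inputs uniformly distributed in [0, 1):

def bucket_sort(A):
    n = len(A)
    buckets = [[] for _ in range(n)]       # n equal-sized subintervals of [0, 1)
    for x in A:
        buckets[int(n * x)].append(x)
    for b in buckets:
        b.sort()                           # expected O(1) elements per bucket
    return [x for b in buckets for x in b]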
Reference
• Sorting in Linear Time
• CLRS 4th Ed. Chapter 8
