Arithmetic series: 1 + 2 + ... + n = n(n + 1)/2
Geometric series: 1 + x + x^2 + ... + x^n = (x^(n+1) − 1)/(x − 1), for x ≠ 1
Harmonic series: 1 + 1/2 + ... + 1/n ≈ log n
All divide and conquer algorithms (also discussed in detail in the Divide and Conquer chapter)
divide the problem into sub-problems, each of which is part of the original problem, and then
perform some additional work to compute the final answer. As an example, a merge sort
algorithm [for details, refer to Sorting chapter] operates on two sub-problems, each of which is
half the size of the original, and then performs O(n) additional work for merging. This gives the running time equation:

T(n) = 2T(n/2) + Θ(n), with T(1) = Θ(1)
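To see what this equation solves to, unroll it (a standard expansion; assume n is a power of 2 and write the merge cost as cn):

T(n) = 2T(n/2) + cn
     = 4T(n/4) + 2cn
     = 8T(n/8) + 3cn
     ...
     = 2^k T(n/2^k) + kcn
     = nT(1) + cn log n    [at k = log n]
     = Θ(n log n)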
The following theorem can be used to determine the running time of divide and conquer algorithms. For a given program (algorithm), first we try to find the recurrence relation for the problem. If the recurrence is of the form below, then we can directly give the answer without fully solving it. If the recurrence is of the form T(n) = aT(n/b) + Θ(n^k log^p n), where a ≥ 1, b > 1, k ≥ 0 and p is a real number, then:
1) If a > b^k, then T(n) = Θ(n^(log_b a))
2) If a = b^k:
    a. If p > −1, then T(n) = Θ(n^(log_b a) log^(p+1) n)
    b. If p = −1, then T(n) = Θ(n^(log_b a) loglog n)
    c. If p < −1, then T(n) = Θ(n^(log_b a))
3) If a < b^k:
    a. If p ≥ 0, then T(n) = Θ(n^k log^p n)
    b. If p < 0, then T(n) = O(n^k)
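The case analysis above mechanizes easily. Below is a small illustrative C helper (the function name master and its output format are our own, not from the text) that reports which case applies for given a, b, k and p:

#include <stdio.h>
#include <math.h>

/* Reports the asymptotic solution of T(n) = a*T(n/b) + Theta(n^k log^p n)
   per the theorem above. Exact floating-point comparison of a and b^k is
   acceptable here because the inputs are small exact constants. */
void master(double a, double b, double k, double p) {
    double bk = pow(b, k);
    double e = log(a) / log(b);                 /* log_b a */
    if (a > bk)                                 /* Case 1 */
        printf("Theta(n^%g)\n", e);
    else if (a == bk) {                         /* Case 2 */
        if (p > -1)       printf("Theta(n^%g log^%g n)\n", e, p + 1);
        else if (p == -1) printf("Theta(n^%g loglog n)\n", e);
        else              printf("Theta(n^%g)\n", e);
    } else {                                    /* Case 3 */
        if (p >= 0) printf("Theta(n^%g log^%g n)\n", k, p);
        else        printf("O(n^%g)\n", k);
    }
}

int main(void) {
    master(2, 2, 1, 0);  /* merge sort: Theta(n^1 log^1 n) = Theta(n log n) */
    master(3, 2, 2, 0);  /* Problem-1 below: Theta(n^2 log^0 n) = Theta(n^2) */
    return 0;
}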
For each of the following recurrences, give an expression for the runtime T(n) if the recurrence
can be solved with the Master Theorem. Otherwise, indicate that the Master Theorem does not
apply.
Problem-1 T(n) = 3T(n/2) + n^2
Solution: T(n) = 3T(n/2) + n^2 => T(n) = Θ(n^2) (Master Theorem Case 3.a)
Problem-2 T(n) = 4T(n/2) + n^2
Solution: T(n) = 4T(n/2) + n^2 => T(n) = Θ(n^2 log n) (Master Theorem Case 2.a)
Problem-3 T(n) = T(n/2) + n^2
Solution: T(n) = T(n/2) + n^2 => T(n) = Θ(n^2) (Master Theorem Case 3.a)
Problem-4 T(n) = 2^n T(n/2) + n^n
Solution: T(n) = 2^n T(n/2) + n^n => Does not apply (a is not constant)
Problem-5 T(n) = 16T(n/4) + n
Solution: T(n) = 16T(n/4) + n => T(n) = Θ(n^2) (Master Theorem Case 1)
Problem-6 T(n) = 2T(n/2) + n log n
Solution: T(n) = 2T(n/2) + n log n => T(n) = Θ(n log^2 n) (Master Theorem Case 2.a)
Problem-7 T(n) = 2T(n/2) + n/log n
Solution: T(n) = 2T(n/2) + n/log n => T(n) = Θ(n loglog n) (Master Theorem Case 2.b)
Problem-12 T(n) = 7T(n/3) + n^2
Solution: T(n) = 7T(n/3) + n^2 => T(n) = Θ(n^2), since a = 7 < b^k = 3^2 = 9 and p = 0 (Master Theorem Case 3.a)
Problem-13 T(n) = 4T(n/2) + log n
Solution: T(n) = 4T(n/2) + log n => T(n) = Θ(n^2), since a = 4 > b^k = 2^0 = 1 (Master Theorem Case 1)
Problem-14 T(n) = 16T(n/4) + n!
Solution: T(n) = 16T(n/4) + n! => T(n) = Θ(n!); n! grows faster than any polynomial n^k, so the cost at the root dominates (Master Theorem Case 3.a)
Problem-15 T(n) = T(n/2) + log n
Solution: T(n) = T(n/2) + log n => T(n) = Θ(log^2 n), since a = 1 = b^k (k = 0) and p = 1 (Master Theorem Case 2.a)
Problem-16 T(n) = 3T(n/2) + n
Solution: T(n) = 3T(n/2) + n => T(n) = Θ(n^(log_2 3)) (Master Theorem Case 1)
Problem-17 T(n) = 3T(n/3) + √n
Solution: T(n) = 3T(n/3) + √n => T(n) = Θ(n) (Master Theorem Case 1)
Problem-18 T(n) = 4T(n/2) + cn
Solution: T(n) = 4T(n/2) + cn => T(n) = Θ(n^2) (Master Theorem Case 1)
Problem-19 T(n) = 3T(n/4) + n log n
Solution: T(n) = 3T(n/4) + n log n => T(n) = Θ(n log n) (Master Theorem Case 3.a)
Problem-20 T(n) = 3T(n/3) + n/2
Solution: T(n) = 3T(n/3) + n/2 => T(n) = Θ(n log n) (Master Theorem Case 2.a)
Master theorem for subtract and conquer recurrences: Let T(n) be a function defined on positive n as

T(n) = c,                 if n ≤ 1
T(n) = aT(n − b) + f(n),  if n > 1

for some constants c, a > 0, b > 0, k ≥ 0, and function f(n). If f(n) is in O(n^k), then

T(n) = O(n^k),            if a < 1
T(n) = O(n^(k+1)),        if a = 1
T(n) = O(n^k a^(n/b)),    if a > 1

For example, T(n) = T(n − 1) + n has a = 1, b = 1 and f(n) = O(n^1), so T(n) = O(n^2).
A variant of the subtraction and conquer master theorem: the solution to the equation T(n) = T(αn) + T((1 − α)n) + βn, where 0 < α < 1 and β > 0 are constants, is O(n log n).
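For instance (an illustration of ours, not from the text), a divide and conquer algorithm that splits its input unevenly into pieces of size n/3 and 2n/3 and spends cn time combining them has T(n) = T(n/3) + T(2n/3) + cn; this matches the form above with α = 1/3 and β = c, so it still solves to O(n log n).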
Now, let us discuss a method which can be used to solve any recurrence. The basic idea behind this method is to guess the answer, and then prove it correct by induction.
In other words, it addresses the question: What if the given recurrence doesn’t seem to match with
any of these (master theorem) methods? If we guess a solution and then try to verify our guess
inductively, usually either the proof will succeed (in which case we are done), or the proof will
fail (in which case the failure will help us refine our guess).
As an example, consider the recurrence T(n) = √n T(√n) + n. This doesn't fit into the form required by the Master Theorems. Carefully observing the recurrence gives us the impression that it is similar to the divide and conquer method (dividing the problem into √n subproblems each of size √n). As we can see, the total size of the subproblems at the first level of recursion is n. So, let us guess that T(n) = Θ(n log n), and then try to prove that our guess is correct. For the lower bound, assume T(m) ≥ km log m for all m < n. Then

T(n) = √n T(√n) + n
     ≥ √n · k√n log √n + n
     = (k/2) n log n + n
     ≥ kn log n?

The last inequality holds only if 1 ≥ (k/2) log n. This is incorrect if n is sufficiently large, for any constant k. From the above proof, we can see that our guess is incorrect for the lower bound.
From the above discussion, we understood that Θ(n log n) is too big. How about Θ(n)? The lower bound is easy to prove directly:

T(n) = √n T(√n) + n ≥ n.

For the upper bound, assume T(m) ≤ cm for all m < n. Then

T(n) = √n T(√n) + n ≤ √n · c√n + n = (c + 1)n,

which is not ≤ cn for any constant c. From the above induction, we understood that Θ(n) is too small and Θ(n log n) is too big. So, we need something bigger than n and smaller than n log n. How about n√(log n)? For the lower bound, assume T(m) ≥ km√(log m) for all m < n. Then

T(n) = √n T(√n) + n
     ≥ √n · k√n √(log √n) + n
     = (k/√2) n√(log n) + n
     ≥ kn√(log n)?

The last step requires 1 ≥ k(1 − 1/√2)√(log n), which again fails for sufficiently large n. So Θ(n√(log n)) doesn't work either. What else is between n and n log n?
How about n loglog n? Proving the upper bound for n loglog n: assume T(m) ≤ cm loglog m for all m < n. Then, using loglog √n = log((1/2) log n) = loglog n − 1 (logs base 2),

T(n) = √n T(√n) + n
     ≤ √n · c√n loglog √n + n
     = cn(loglog n − 1) + n
     ≤ cn loglog n, as long as c ≥ 1.

Proving the lower bound for n loglog n: assume T(m) ≥ km loglog m for all m < n. Then

T(n) = √n T(√n) + n
     ≥ √n · k√n loglog √n + n
     = kn(loglog n − 1) + n
     ≥ kn loglog n, as long as k ≤ 1.

From the above proofs, we can see that T(n) ≤ cn loglog n, if c ≥ 1 and T(n) ≥ kn loglog n, if k ≤ 1. Technically, we're still missing the base cases in both proofs, but we can be fairly confident at this point that T(n) = Θ(n loglog n).
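As a sanity check (our own numerical experiment, not part of the original discussion), we can evaluate the recurrence directly and compare it against n loglog n; the ratio settles near a constant, exactly as the Θ(n loglog n) bound predicts:

#include <stdio.h>
#include <math.h>

/* Numerically evaluate T(n) = sqrt(n)*T(sqrt(n)) + n.
   The base case T(n) = 1 for n < 4 is an arbitrary choice; it only
   perturbs the result by lower-order terms. */
double T(double n) {
    if (n < 4) return 1;
    double r = sqrt(n);
    return r * T(r) + n;
}

int main(void) {
    for (double n = 16; n <= 1e12; n *= 100) {
        double ratio = T(n) / (n * log(log(n)));   /* natural logs */
        printf("n = %.0e   T(n)/(n loglog n) = %.3f\n", n, ratio);
    }
    return 0;
}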
The motivation for amortized analysis is to better understand the running time of certain
techniques, where standard worst case analysis provides an overly pessimistic bound. Amortized
analysis generally applies to a method that consists of a sequence of operations, where the vast
majority of the operations are cheap, but some of the operations are expensive. If we can show that the expensive operations are particularly rare, we can charge their cost to the cheap operations, and only bound the cheap operations.
The general approach is to assign an artificial cost to each operation in the sequence, such that the
total of the artificial costs for the sequence of operations bounds the total of the real costs for the
sequence. This artificial cost is called the amortized cost of an operation. The amortized cost is thus a correct way of understanding the overall running time – but note that particular operations can still take longer, so it is not a way of bounding the running time of any individual operation in the sequence.
Example: Let us consider an array of elements from which we want to find the kth smallest element. We can solve this problem using sorting. After sorting the given array, we just need to return the kth element from it. The cost of performing the sort (assuming a comparison-based sorting algorithm) is O(n log n). If we perform n such selections, then the average cost of each selection is O(n log n/n) = O(log n). This clearly shows that sorting once reduces the cost of the subsequent operations.
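A minimal C sketch of this amortization (our illustration; the helper name kthSmallest and the sample array are arbitrary): the O(n log n) sort is paid once, after which every selection is a constant-time array access.

#include <stdio.h>
#include <stdlib.h>

/* Comparator for qsort: ascending order of ints. */
static int cmp(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* k is 1-based; A must already be sorted. O(1) per call. */
int kthSmallest(int A[], int k) {
    return A[k - 1];
}

int main(void) {
    int A[] = {9, 4, 7, 1, 8, 2, 6};
    int n = sizeof(A) / sizeof(A[0]);
    qsort(A, n, sizeof(int), cmp);     /* one-time O(n log n) cost */
    for (int k = 1; k <= n; k++)       /* n cheap O(1) selections */
        printf("%d ", kthSmallest(A, k));
    printf("\n");
    return 0;
}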
Note: From the following problems, try to understand the cases which have different
complexities (O(n), O(logn), O(loglogn) etc.).
Problem-21 Find the complexity of the below recurrence:
T(n) = 3T(n − 1), if n > 0; otherwise T(n) = 1
Solution: Let us try solving this function with substitution:
T(n) = 3T(n − 1) = 3(3T(n − 2)) = 3^2 T(n − 2) = ... = 3^n T(n − n) = 3^n T(0) = 3^n
∴ T(n) = O(3^n).
Note: We can also use the Subtraction and Conquer master theorem for this problem: a = 3, b = 1 and f(n) = O(n^0), and since a > 1 the theorem gives T(n) = O(n^0 · 3^(n/1)) = O(3^n).
Problem-22 Find the complexity of the below recurrence:
T(n) = 2T(n − 1) − 1, if n > 0; otherwise T(n) = 1
Solution: Let us try solving this with substitution:
T(n) = 2T(n − 1) − 1 = 2(2T(n − 2) − 1) − 1 = 2^2 T(n − 2) − 2 − 1
     = 2^3 T(n − 3) − 2^2 − 2 − 1 = ... = 2^n T(0) − (2^(n−1) + 2^(n−2) + ... + 2 + 1)
     = 2^n − (2^n − 1) = 1
∴ Time Complexity is O(1). Note that while the recurrence relation looks exponential, the solution to the recurrence relation here gives a different result.
Problem-23 What is the running time of the following function?
Solution: Consider the comments in the below function:
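A C function with the behavior analyzed below looks like this (a sketch; the names i and s are the ones the analysis refers to):

#include <stdio.h>

void function(int n) {
    int i = 1, s = 1;
    while (s <= n) {   /* s_i = s_(i-1) + i: s grows by i, not by 1 */
        i++;
        s = s + i;
        printf("*");
    }
}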
We can define the 's' terms according to the relation s_i = s_(i−1) + i. The value of 'i' increases by 1 in each iteration. The value contained in 's' at the ith iteration is the sum of the first 'i' positive integers. If k is the total number of iterations taken by the program, then the while loop terminates if:
1 + 2 + ... + k = k(k + 1)/2 > n ⇒ k = O(√n)
∴ T(n) = O(√n).
Problem-24 What is the complexity of the program given below?
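A sketch of a loop with this behavior (the counter name count is ours):

#include <stdio.h>

void function(int n) {
    int i, count = 0;
    for (i = 1; i * i <= n; i++)   /* runs while i^2 <= n */
        count++;
    printf("%d\n", count);
}

Solution: In the above function the loop will end if i^2 > n, i.e., after about √n iterations ⇒ T(n) = O(√n). This is similar to Problem-23.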
Problem-25 What is the complexity of the program given below: