4 Quicksort and Balls in Bins
CSOR W4246
Eleni Drinea
Computer Science Department
Columbia University
1 Recap
2 Binary search
3 Quicksort
4 Randomized Quicksort
6 Occupancy problems
Review of the last lecture
$O(n^{\log_b a})$, if $a > b^k$
Searching a sorted array
- Input:
  1. sorted list A of n integers;
  2. integer x
- Output:
  - index j such that 1 ≤ j ≤ n and A[j] = x; or
  - no if x is not in A
Idea: use the fact that the array is sorted and probe specific
entries in the array.
Binary search
First, probe the middle entry. Let mid = ⌈n/2⌉.
- If x == A[mid], return mid.
- If x < A[mid], look for x in A[1, mid − 1];
- Else if x > A[mid], look for x in A[mid + 1, n].
Initially, the entire array is “active”, that is, x might be anywhere in the array.
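[Figure: array positions 1, ..., mid, ..., n; entries to the left of mid are ≤ A[mid], entries to the right of mid are ≥ A[mid].]
Suppose x > A[mid]. Then the active area of the array, where x might be, is to the right of mid.
[Figure: the same array; the active area is A[mid + 1, n].]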
Binary search pseudocode
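A minimal Python sketch of the probing procedure described above. It makes a few assumptions that are not in the slides: indices are 0-based rather than 1-based, the midpoint is rounded down rather than up, and None stands in for the answer "no". The function name is illustrative.

```python
def binary_search(A, x):
    """Return an index j with A[j] == x, or None if x is not in sorted list A."""
    lo, hi = 0, len(A) - 1          # the active area is A[lo..hi]
    while lo <= hi:
        mid = (lo + hi) // 2        # probe the middle entry of the active area
        if A[mid] == x:
            return mid
        if x < A[mid]:
            hi = mid - 1            # x can only be to the left of mid
        else:
            lo = mid + 1            # x can only be to the right of mid
    return None                     # active area is empty: x is not in A
```

Each probe halves the active area, so the loop runs O(log n) times.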
Quicksort facts
[Figure: the array arranged around the pivot, with items ≤ pivot on one side, items > pivot on the other, and split marking the boundary.]
1. Pick a pivot item: for simplicity, always pick the last item
of the array as pivot, i.e., pivot = A[right].
   - Thus A[right] will be placed in its final location in the
     sorted output when Partition returns; it will never be
     used (or moved) again until the algorithm terminates.
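[Figure: iteration j of Partition on A[left, right], with markers left, split, j, right and the pivot at A[right].]
If A[j] ≤ pivot: by the end of iteration j, A[j] has been swapped with A[split + 1] and split has been updated to split + 1.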
Example: A = {1, 3, 7, 2, 6, 4, 5}, Partition(A, 1, 7)
Here pivot = A[7] = 5. Items in A[1, split] are ≤ pivot, items in A[split + 1, j − 1] are > pivot, and the items from j up to the pivot are unexamined.

state                        A               split  j  A[j] examined
beginning of iteration j=1   1 3 7 2 6 4 5   0      1  1
beginning of iteration j=2   1 3 7 2 6 4 5   1      2  3
beginning of iteration j=3   1 3 7 2 6 4 5   2      3  7
beginning of iteration j=4   1 3 7 2 6 4 5   2      4  2
beginning of iteration j=5   1 3 2 7 6 4 5   3      5  6
beginning of iteration j=6   1 3 2 7 6 4 5   3      6  4
end of iteration j=6         1 3 2 4 6 7 5   4      6  4
end of Partition             1 3 2 4 5 7 6   (pivot 5 is in its final position, index 5)
Pseudocode for Partition
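A Python sketch of Partition that is consistent with the description and the example trace above: the pivot is the last item, and an item A[j] ≤ pivot is swapped with A[split + 1] before split is incremented. Assumptions not taken from the slides: 0-based indices, the final swap that moves the pivot to position split + 1, and the quicksort wrapper and function names, which are illustrative.

```python
def partition(A, left, right):
    """Rearrange A[left..right] around pivot A[right]; return the pivot's final index."""
    pivot = A[right]
    split = left - 1                    # A[left..split] holds the items <= pivot
    for j in range(left, right):        # j = left, ..., right - 1
        if A[j] <= pivot:
            A[j], A[split + 1] = A[split + 1], A[j]
            split += 1
    # Assumed final step: move the pivot between the two regions.
    A[split + 1], A[right] = A[right], A[split + 1]
    return split + 1


def quicksort(A, left=0, right=None):
    """Sort A[left..right] in place by recursing on both sides of the pivot."""
    if right is None:
        right = len(A) - 1
    if left < right:
        p = partition(A, left, right)
        quicksort(A, left, p - 1)
        quicksort(A, p + 1, right)
```

On the example array [1, 3, 7, 2, 6, 4, 5], partition leaves the array as [1, 3, 2, 4, 5, 7, 6], matching the last row of the trace (with the pivot at 0-based index 4).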
Claim 1.
For left ≤ j ≤ right − 1, at the end of loop j,
1. all items in A[left, split] are ≤ pivot; and
2. all items in A[split + 1, j] are > pivot.
By induction on j.
3. Step: We will show the claim for j + 1. That is, we will show that
after loop j + 1, all items in A[left, split] are ≤ pivot and all items in
A[split + 1, j + 1] are > pivot.
Then every Partition splits its input into two lists of almost
equal sizes, thus T(n) = 2T(n/2) + O(n) = O(n log n).
Remark 1.
You can show that T(n) = O(n log n) for any splitting where the
two subarrays have sizes αn, (1 − α)n respectively, for constant
0 < α < 1.
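As a numerical illustration of Remark 1 (not a proof), the sketch below evaluates the recurrence T(m) = T(αm) + T((1 − α)m) + m, with T(1) = 0, and compares it with n log₂ n. The cost of m per call, the rounding of αm to an integer, and the function name are all assumptions made for this example.

```python
import math
from functools import lru_cache


def split_cost(n, alpha):
    """Evaluate T(n) for T(m) = T(a) + T(m - a) + m with a ~ alpha * m, T(1) = 0."""
    @lru_cache(maxsize=None)
    def T(m):
        if m <= 1:
            return 0
        # Size of the first part, clamped so that both parts are non-empty.
        a = max(1, min(m - 1, int(alpha * m)))
        return T(a) + T(m - a) + m      # m models the linear work of Partition
    return T(n)


n = 10_000
for alpha in (0.5, 0.25, 0.1):
    ratio = split_cost(n, alpha) / (n * math.log2(n))
    print(f"alpha = {alpha}: T(n) / (n log2 n) = {ratio:.2f}")
```

The ratio grows as the split becomes more unbalanced, but for a fixed α it stays bounded by a constant, which is what the O(n log n) bound asserts.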
Running time of Quicksort: Worst Case
Two views of randomness in computation
Discrete random variables
Then
Pr[X = 1] = p
Pr[X = 0] = 1 − p
E[X] = 1 · Pr[X = 1] + 0 · Pr[X = 0] = p
Indicator random variables
Fact 2.
If X is an indicator random variable, then E[X] = Pr[X = 1].
Example 2: Bernoulli trials
Pr[X = j]?
$X = \sum_{i=1}^{n} X_i$, $E[X_i] = p$, $E[X] = ?$
Linearity of expectation
$E[X] = E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i] = \sum_{i=1}^{n} p = np.$
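The value np can be cross-checked by computing E[X] directly from the definition of expectation. The sketch below assumes X counts the successes in n independent Bernoulli trials with success probability p, so that Pr[X = j] is the standard binomial probability; the function names are illustrative.

```python
from math import comb


def binomial_pmf(n, p, j):
    """Pr[X = j] for X = number of successes in n independent trials."""
    return comb(n, j) * p**j * (1 - p) ** (n - j)


def expectation_by_definition(n, p):
    """E[X] computed as the sum over j of j * Pr[X = j]."""
    return sum(j * binomial_pmf(n, p, j) for j in range(n + 1))


n, p = 20, 0.3
print(expectation_by_definition(n, p), n * p)   # the two values agree up to rounding
```

Linearity of expectation gives the same answer without ever computing Pr[X = j].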
Balls in bins problems
Q1: How many balls can we throw before it is more likely than
not that some bin contains at least two balls?
Requiring $e^{-k^2/(2n)} < 1/2$ yields $k > \sqrt{n \cdot 2\ln 2} = \Omega(\sqrt{n})$.
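To see how tight the estimate for Q1 is, the sketch below computes the exact probability that k balls land in k distinct bins and searches for the smallest k at which a collision becomes more likely than not. It assumes balls are thrown uniformly and independently into n bins; the function names are illustrative.

```python
import math


def prob_all_distinct(k, n):
    """Exact probability that k balls thrown into n bins occupy k distinct bins."""
    p = 1.0
    for i in range(k):
        p *= 1 - i / n      # ball i + 1 must avoid the i bins already occupied
    return p


def smallest_k_with_likely_collision(n):
    """Smallest k for which Pr[some bin holds at least two balls] > 1/2."""
    k = 0
    while prob_all_distinct(k, n) >= 0.5:
        k += 1
    return k


n = 365
print(smallest_k_with_likely_collision(n))      # exact threshold
print(math.sqrt(n * 2 * math.log(2)))           # estimate sqrt(n * 2 ln 2)
```

For n = 365 this is the classical birthday problem: the exact threshold is 23, while the estimate is about 22.5.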
Q4: What is the load of the fullest bin with high probability?
Since bins are symmetric, the expected load of any bin is m/n.
Expected # empty bins
$E[Y] = \sum_{j=1}^{n} E[Y_j] = n\left(1 - \frac{1}{n}\right)^{m} \approx n e^{-m/n}$
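The formula for the expected number of empty bins can be checked exactly on a tiny instance by enumerating all n^m equally likely ways to throw the balls. The sketch below assumes m balls thrown uniformly and independently into n bins; the function names are illustrative, and the brute-force check is only feasible for very small n and m.

```python
import math
from fractions import Fraction
from itertools import product


def expected_empty_exact(n, m):
    """n * (1 - 1/n)^m as an exact fraction."""
    return n * Fraction(n - 1, n) ** m


def expected_empty_bruteforce(n, m):
    """Average number of empty bins over all n^m assignments of balls to bins."""
    total = 0
    for assignment in product(range(n), repeat=m):
        total += n - len(set(assignment))   # bins that received no ball
    return Fraction(total, n**m)


print(expected_empty_bruteforce(3, 4), expected_empty_exact(3, 4))
print(float(expected_empty_exact(1000, 1000)), 1000 * math.exp(-1))
```

The first line compares the two exact values for n = 3, m = 4; the second compares the exact expectation with the approximation n e^{-m/n} for m = n = 1000.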
Maximum load with high probability (case m = n)
Proposition 2.
When throwing n balls into n bins uniformly and independently
at random, the maximum load in any bin is Θ(ln n/ ln ln n) with
probability close to 1 as n grows large.
1. Upper bound the probability that any bin contains more than k
balls by a union bound: $\sum_{j=1}^{n} \sum_{\ell=k}^{n} \binom{n}{\ell} \left(\frac{1}{n}\right)^{\ell} \left(1 - \frac{1}{n}\right)^{n-\ell}$.
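The union bound in step 1 can be evaluated numerically to see how quickly it decays in k. The sketch below assumes n balls thrown uniformly and independently into n bins, so that each bin's load is binomial with parameters n and 1/n; the function name is illustrative.

```python
from fractions import Fraction
from math import comb


def union_bound(n, k):
    """n * sum_{l=k}^{n} C(n, l) (1/n)^l (1 - 1/n)^(n - l), as an exact fraction."""
    p = Fraction(1, n)
    # Probability that one fixed bin receives at least k of the n balls.
    tail = sum(comb(n, l) * p**l * (1 - p) ** (n - l) for l in range(k, n + 1))
    return n * tail                     # all n bins contribute the same term


n = 100
for k in (2, 4, 6, 8):
    print(k, float(union_bound(n, k)))
```

The bound is only informative once it drops below 1, which already happens for fairly small k compared with n.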