4 Quicksort and Balls in Bins

Algorithms for Data Science

CSOR W4246

Eleni Drinea
Computer Science Department

Columbia University

Binary search, quicksort, randomized quicksort, occupancy problems


Outline

1 Recap

2 Binary search

3 Quicksort

4 Randomized Quicksort

5 Random variables and linearity of expectation

6 Occupancy problems
Review of the last lecture

In the last lecture we discussed


I Asymptotic notation (O, Ω, Θ, o, ω)
I The divide & conquer principle
I Divide the problem into a number of subproblems that are
smaller instances of the same problem.
I Conquer the subproblems by solving them recursively.
I Combine the solutions to the subproblems into the
solution for the original problem.
I Application: Mergesort
I Solving recurrences
Mergesort

Mergesort(A, left, right)

if right == left then
    return
end if
mid = left + ⌊(right − left)/2⌋
Mergesort(A, left, mid)
Mergesort(A, mid + 1, right)
Merge(A, left, right, mid)

I Initial call: Mergesort(A, 1, n)

I Subroutine Merge merges two sorted lists of sizes ⌈n/2⌉ and
  ⌊n/2⌋ into one sorted list of size n in time Θ(n).
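As a concrete reference, here is a short runnable Python version of the same scheme (a sketch, not the course's official code; it returns a new list rather than sorting in place):

import random

def merge(L, R):
    # Merge two sorted lists L and R into one sorted list in Θ(|L| + |R|) time.
    out, i, j = [], 0, 0
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:
            out.append(L[i]); i += 1
        else:
            out.append(R[j]); j += 1
    out.extend(L[i:])   # one of the two lists may have leftovers
    out.extend(R[j:])
    return out

def mergesort(A):
    # Divide and conquer: T(n) = 2T(n/2) + Θ(n) = O(n log n).
    if len(A) <= 1:
        return A
    mid = len(A) // 2
    return merge(mergesort(A[:mid]), mergesort(A[mid:]))

print(mergesort([5, 2, 4, 7, 1, 3]))   # [1, 2, 3, 4, 5, 7]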
Running time of Mergesort

The running time of Mergesort satisfies:

T (n) = 2T (n/2) + cn, for n ≥ 2, constant c > 0


T (1) = c

This structure is typical of recurrence relations:


I an inequality or equation bounds T (n) in terms of an
expression involving T (m) for m < n
I a base case generally says that T (n) is constant for small
constant n
Remarks
I We ignore floor and ceiling notations
I A recurrence does not provide an asymptotic bound for
T (n): to this end, we must solve the recurrence
Solving recurrences, method 1: recursion trees

The technique consists of three steps


1. Analyze the first few levels of the tree of recursive calls
2. Identify a pattern
3. Sum over all levels of recursion

Example: analysis of running time of Mergesort


T (n) = 2T (n/2) + cn, n ≥ 2
T (1) = c
A frequently occurring recurrence and its solution

The running time of many recursive algorithms is given by


T(n) = aT(n/b) + cn^k,  for a, c > 0, b > 1, k ≥ 0

What is the recursion tree for this recurrence?

I a is the branching factor
I b is the factor by which the size of each subproblem shrinks
  ⇒ at level i, there are a^i subproblems, each of size n/b^i
  ⇒ each subproblem at level i requires c(n/b^i)^k work
I the height of the tree is log_b n levels

⇒ Total work: Σ_{i=0}^{log_b n} a^i · c(n/b^i)^k = c·n^k · Σ_{i=0}^{log_b n} (a/b^k)^i
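The total work is c·n^k times a geometric series with ratio r = a/b^k; evaluating that series in its three regimes (a brief filled-in step, not on the original slide) previews the Master theorem below:

I If a < b^k (r < 1): the series sums to O(1), so T(n) = O(n^k).
I If a = b^k (r = 1): all log_b n + 1 terms equal 1, so T(n) = O(n^k log n).
I If a > b^k (r > 1): the last term dominates, and
  c·n^k · (a/b^k)^{log_b n} = c · a^{log_b n} = c · n^{log_b a}, so T(n) = O(n^{log_b a}).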
Solving recurrences, method 2: Master theorem

Theorem 1 (Master theorem).


If T(n) = aT(⌈n/b⌉) + O(n^k) for some constants a > 0, b > 1,
k ≥ 0, then

         O(n^{log_b a})  , if a > b^k
T(n) =   O(n^k log n)   , if a = b^k
         O(n^k)         , if a < b^k

Example: running time of Mergesort

I T(n) = 2T(n/2) + cn:
  a = 2, b = 2, k = 1, b^k = 2 = a ⇒ T(n) = O(n log n)
Searching a sorted array

I Input:
  1. sorted list A of n integers;
  2. integer x
I Output:
  I index j such that 1 ≤ j ≤ n and A[j] = x; or
  I no if x is not in A

Example: A = {0, 2, 3, 5, 6, 7, 9, 11, 13}, n = 9, x = 7

Idea: use the fact that the array is sorted and probe specific
entries in the array.
Binary search
First, probe the middle entry. Let mid = ⌈n/2⌉.
I If x == A[mid], return mid.
I If x < A[mid] then look for x in A[1, mid − 1];
I Else if x > A[mid] look for x in A[mid + 1, n].

Initially, the entire array is “active”, that is, x might be anywhere in the array.

≤A[mid] ≥A[mid]

1 mid n

Suppose x > A[mid].

Then the active area of the array, where x might be, is to the right of mid.

≤A[mid] ≥A[mid]

1 mid mid+1 n
Binary search pseudocode

binarysearch(A, left, right)

if left > right then
    return no
end if
mid = left + ⌈(right − left)/2⌉
if x == A[mid] then
    return mid
else if x > A[mid] then
    left = mid + 1
else right = mid − 1
end if
return binarysearch(A, left, right)

Initial call: binarysearch(A, 1, n)
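The same procedure in runnable Python (a sketch; note that Python lists are 0-based, while the slides use 1-based indexing):

def binarysearch(A, x, left, right):
    # Returns an index j with A[j] == x, or None if x is not in A[left..right].
    if left > right:              # active region is empty
        return None
    mid = left + (right - left) // 2
    if A[mid] == x:
        return mid
    elif x > A[mid]:              # x can only lie to the right of mid
        return binarysearch(A, x, mid + 1, right)
    else:                         # x can only lie to the left of mid
        return binarysearch(A, x, left, mid - 1)

A = [0, 2, 3, 5, 6, 7, 9, 11, 13]
print(binarysearch(A, 7, 0, len(A) - 1))   # prints 5, since A[5] == 7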


Binary search running time

Observation: At each step there is a region of A where x
could be and we shrink the size of this region by a factor of 2
with every probe:
I If n is odd, then we are throwing away ⌈n/2⌉ elements.
I If n is even, then we are throwing away at least n/2
  elements.

Hence the recurrence for the running time is

T(n) ≤ T(n/2) + O(1)


Sublinear running time

Here are two ways to argue about the running time:

1. Master theorem: b = 2, a = 1, k = 0 ⇒ T (n) = O(log n).

2. We can reason as follows: starting with an array of size n,

I After k probes, the active portion of the array has size at
  most n/2^k (every time we probe an entry, the active
  portion of the array halves).
I After k = log n probes, the array has constant size. We
  can now search linearly for x in the constant-size array.
I We spend constant work to halve the array (why?). Thus
  the total work spent is O(log n).
Concluding remarks on binary search

1. The right data structure can improve the running time of
   the algorithm significantly.
   I What if we used a linked list to store the input?
   I Arrays allow for random access of their elements: given
     an index, we can read any entry in an array in time O(1)
     (constant time).

2. In general, we obtain running time O(log n) when the
   algorithm does a constant amount of work to throw
   away a constant fraction of the input.
Quicksort facts

I Quicksort is a divide and conquer algorithm
I It is a standard, widely used sorting algorithm in practice
I It is an in-place algorithm
I Its worst-case running time is Θ(n^2) but its average-case
  running time is Θ(n log n)
I We will use it to introduce randomized algorithms
Quicksort: main idea

I Pick an input item, call it pivot, and place it in its final
  location in the sorted array by re-organizing the array
  so that:
  I all items ≤ pivot are placed before pivot
  I all items > pivot are placed after pivot

≤pivot >pivot

pivot

split

I Recursively sort the subarray to the left of pivot.


I Recursively sort the subarray to the right of pivot.
Quicksort pseudocode

Quicksort(A, left, right)

if |A| == 0 then return   //A is empty
end if
split = Partition(A, left, right)
Quicksort(A, left, split − 1)
Quicksort(A, split + 1, right)

Initial call: Quicksort(A, 1, n)


Subroutine Partition(A, left, right)
Notation: A[i, j] denotes the portion of A starting at position
i and ending at position j.
Partition(A, left, right)
1. picks a pivot item
2. re-organizes A[left, right] so that
   I all items before pivot are ≤ pivot
   I all items after pivot are > pivot
3. returns split, the index of pivot in the re-organized array

After Partition, A[left, right] looks as follows:

≤pivot >pivot

pivot

left split right


Implementing Partition

1. Pick a pivot item: for simplicity, always pick the last item
of the array as pivot, i.e., pivot = A[right].
I Thus A[right] will be placed in its final location in the
sorted output when Partition returns; it will never be
used (or moved) again until the algorithm terminates.

2. Re-organize the input array A in place. How?

(What if we didn’t care to implement Partition in place?)


Implementing Partition in place
Partition examines the items in A[left, right] one by one and
maintains three regions in A. Specifically, after examining the
j-th item for j ∈ [left, right − 1], the regions are:
1. Left region: starts at left and ends at split;
   A[left, split] contains all items ≤ pivot examined so far.
2. Middle region: starts at split + 1 and ends at j;
   A[split + 1, j] contains all items > pivot examined so far.
3. Right region: starts at j + 1 and ends at right − 1;
   A[j + 1, right − 1] contains all unexamined items.

≤pivot >pivot unexamined

pivot

left split split+1 j j+1 right-1 right

End of j-th iteration


Implementing Partition in place (cont’d)

At the beginning of iteration j, A[j] is compared with pivot.

If A[j] ≤ pivot:

1. swap A[j] with A[split + 1], the first element of the
   middle region (items > pivot): since A[split + 1] > pivot,
   it is “safe” to move it to the end of the middle region

2. increment split to include A[j] in the left region
   (items ≤ pivot)
Iteration j: when A[j] ≤ pivot

Beginning of iteration j (assume A[j]≤pivot)

≤pivot >pivot unexamined

pivot

left split split+1 j right

End of iteration j: A[j] got swapped with A[split+1], split got updated to split+1

≤pivot >pivot unexamined

pivot

left j right
split
Example: A = {1, 3, 7, 2, 6, 4, 5}, Partition(A, 1, 7), pivot = A[7] = 5

beginning of iteration j=1: 1 3 7 2 6 4 5   A[j]=1 ≤ pivot: 1 swapped with itself, split becomes 1
beginning of iteration j=2: 1 3 7 2 6 4 5   A[j]=3 ≤ pivot: 3 swapped with itself, split becomes 2
beginning of iteration j=3: 1 3 7 2 6 4 5   A[j]=7 > pivot: nothing happens
beginning of iteration j=4: 1 3 7 2 6 4 5   A[j]=2 ≤ pivot: 2 swapped with 7, split becomes 3
beginning of iteration j=5: 1 3 2 7 6 4 5   A[j]=6 > pivot: nothing happens
beginning of iteration j=6: 1 3 2 7 6 4 5   A[j]=4 ≤ pivot: 4 swapped with 7, split becomes 4
end of iteration j=6:       1 3 2 4 6 7 5
end of Partition:           1 3 2 4 5 7 6   pivot 5 swapped with A[split + 1] = A[5]; Partition returns split + 1 = 5
Pseudocode for Partition

Partition(A, left, right)

pivot = A[right]
split = left − 1
for j = left to right − 1 do
    if A[j] ≤ pivot then
        swap(A[j], A[split + 1])
        split = split + 1
    end if
end for
swap(A[right], A[split + 1])   //place pivot after A[split] (why?)
return split + 1               //the final position of pivot
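Putting Partition and Quicksort together in runnable Python (a sketch with 0-based indices; the slides use 1-based indices and always pick A[right] as pivot):

def partition(A, left, right):
    # Re-organize A[left..right] around pivot = A[right]; return pivot's final index.
    pivot = A[right]
    split = left - 1                       # end of the "≤ pivot" region
    for j in range(left, right):           # examine A[left..right-1]
        if A[j] <= pivot:
            A[j], A[split + 1] = A[split + 1], A[j]
            split += 1
    A[right], A[split + 1] = A[split + 1], A[right]   # place pivot
    return split + 1

def quicksort(A, left, right):
    if left >= right:                      # 0 or 1 items: already sorted
        return
    split = partition(A, left, right)
    quicksort(A, left, split - 1)
    quicksort(A, split + 1, right)

A = [1, 3, 7, 2, 6, 4, 5]
quicksort(A, 0, len(A) - 1)
print(A)   # [1, 2, 3, 4, 5, 6, 7]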
Analysis of Partition: correctness

Notation: A[i, j] denotes the portion of A that starts at
position i and ends at position j.

Claim 1.
For left ≤ j ≤ right − 1, at the end of loop j,
1. all items in A[left, split] are ≤ pivot; and
2. all items in A[split + 1, j] are > pivot

Remark: If the claim is true, correctness of Partition follows
(why?).
Proof of Claim 1

By induction on j.

1. Base case: For j = left (that is, during the first
   execution of the for loop), there are two possibilities:
   I if A[left] ≤ pivot, then A[left] is swapped with itself and
     split is incremented to equal left;
   I otherwise, nothing happens.
   In both cases, the claim holds for j = left.

2. Hypothesis: Assume that the claim is true for some
   left ≤ j < right − 1.
   I That is, at the end of loop j, all items in A[left, split] are
     ≤ pivot and all items in A[split + 1, j] are > pivot.
Proof of Claim 1 (cont’d)

3. Step: We will show the claim for j + 1. That is, we will show that
   after loop j + 1, all items in A[left, split] are ≤ pivot and all items in
   A[split + 1, j + 1] are > pivot.

   I At the beginning of loop j + 1, by the hypothesis, items in
     A[left, split] are ≤ pivot and items in A[split + 1, j] are > pivot.
   I Inside loop j + 1, there are two possibilities:
     1. A[j + 1] ≤ pivot: then A[j + 1] is swapped with A[split + 1].
        At this point, items in A[left, split + 1] are ≤ pivot and
        items in A[split + 2, j + 1] are > pivot. Incrementing split
        (the next step in the pseudocode) yields that the claim
        holds for j + 1.
     2. A[j + 1] > pivot: nothing is done. The truth of the claim
        follows from the hypothesis.

This completes the proof of the inductive step.


Analysis of Partition: running time and space

I Running time: on input size n, Partition goes through
  each of the n − 1 leftmost elements once and performs a
  constant amount of work per element.
  ⇒ Partition requires Θ(n) time.

I Space: in-place algorithm


Analysis of Quicksort: correctness

I Quicksort is a recursive algorithm; we will prove
  correctness by induction on the input size n.

I We will use strong induction: the induction step at n
  requires that the inductive hypothesis holds at all steps
  1, 2, . . . , n − 1 and not just at step n − 1, as with simple
  induction.

I Strong induction is most useful when several instances of
  the hypothesis are required to show the inductive step.
Analysis of Quicksort: correctness

I Base case: for n = 0, Quicksort sorts correctly.

I Hypothesis: for all 0 ≤ m < n, Quicksort correctly
  sorts on input size m.

I Step: show that Quicksort correctly sorts on input size n.
  I Partition(A, 1, n) re-organizes A so that all items
    I in A[1, . . . , split − 1] are ≤ A[split];
    I in A[split + 1, . . . , n] are > A[split].
  I Next, Quicksort(A, 1, split − 1) and Quicksort(A, split + 1, n)
    will correctly sort their inputs (by the hypothesis). Hence

    A[1] ≤ . . . ≤ A[split − 1] and A[split + 1] ≤ . . . ≤ A[n].

  At this point, Quicksort terminates and A is sorted.


Analysis of Quicksort: space and running time

I Space: in-place algorithm

I Running time T(n): depends on the arrangement of the
  input elements
  I the sizes of the inputs to the two recursive calls (hence the
    form of the recurrence) depend on how pivot compares to
    the rest of the input items
Running time of Quicksort: Best Case

Suppose that in every call to Partition the pivot item is the
median of the input.

Then every Partition splits its input into two lists of almost
equal sizes, thus

T(n) = 2T(n/2) + Θ(n) = O(n log n).

This is a “balanced” partitioning.


I Example of best case: A = [1 3 2 5 7 6 4]

Remark 1.
You can show that T (n) = O(n log n) for any splitting where the
two subarrays have sizes αn, (1 − α)n respectively, for constant
0 < α < 1.
Running time of Quicksort: Worst Case

I Upper bound for worst-case running time: T(n) = O(n^2)
  I at most n calls to Partition (one for each item as pivot)
  I Partition requires O(n) time

I This worst-case upper bound is tight:
  I If every time Partition is called pivot is greater (or
    smaller) than every other item, then its input is split into
    two lists, one of which has size 0.
  I This partitioning is very “unbalanced”: let c, d > 0 be
    constants, where T(0) = d; then

    T(n) = T(n − 1) + T(0) + cn = Θ(n^2).

Note: a worst-case input is the sorted input!
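A quick empirical check of this claim (my own sketch, not from the slides): counting comparisons made by the partition function above on sorted versus shuffled inputs shows the Θ(n^2) versus O(n log n) gap directly.

import random

def partition_count(A, left, right):
    # Same Partition as before, but also returns the #comparisons made.
    pivot, split, comps = A[right], left - 1, 0
    for j in range(left, right):
        comps += 1
        if A[j] <= pivot:
            A[j], A[split + 1] = A[split + 1], A[j]
            split += 1
    A[right], A[split + 1] = A[split + 1], A[right]
    return split + 1, comps

def quicksort_count(A, left, right):
    if left >= right:
        return 0
    split, comps = partition_count(A, left, right)
    return comps + quicksort_count(A, left, split - 1) \
                 + quicksort_count(A, split + 1, right)

for n in (100, 200, 400):
    shuffled = random.sample(range(n), n)
    comps_sorted = quicksort_count(list(range(n)), 0, n - 1)
    comps_random = quicksort_count(shuffled, 0, n - 1)
    print(n, comps_sorted, comps_random)
    # sorted input: exactly n(n-1)/2 comparisons; shuffled input: far fewer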


Running time: average case analysis

Average case: what is an “average” input to sorting?

I It depends on the application.
I Intuition for why the average-case analysis of Quicksort on
  uniformly distributed inputs gives O(n log n) appears in
  your textbook.
I We will use randomness within the algorithm to provide
  Quicksort with a uniformly random input.
Two views of randomness in computation

1. Deterministic algorithm, randomness over the inputs
   I On the same input, the algorithm always produces the
     same output using the same time.
   I So far, we have only encountered such algorithms.
   I The input is randomly generated according to some
     underlying distribution.
   I Average case analysis: analysis of the running time of the
     algorithm on an average input.
Two views of randomness in computation (cont’d)

2. Randomized algorithm, worst-case (deterministic) input
   I On the same input, the algorithm produces the same
     output but different executions may require different
     running times.
   I The latter depend on the random choices of the algorithm
     (e.g., coin flips, random numbers).
   I Random samples are assumed independent of each other.
   I Worst-case input.
   I Expected running time analysis: analysis of the running
     time of the randomized algorithm on a worst-case input.
Remarks on randomness in computation

1. Deterministic algorithms are a special case of
   randomized algorithms.
2. Even when equally efficient deterministic algorithms exist,
   randomized algorithms may be simpler, require less
   memory of the past, or be useful for symmetry-breaking.
Randomized Quicksort

Can we use randomization so that Quicksort works with an
“average” input even when it receives a worst-case input?

1. Explicitly permute the input.

2. Use random sampling to choose pivot: instead of using
   A[right] as pivot, select pivot randomly.

Idea 1 (intuition behind random sampling).

No matter how the input is organized, we won’t often pick the
largest or smallest item as pivot (unless we are really, really
unlucky). Thus most often the partitioning will be “balanced”.
Pseudocode for randomized Quicksort

Randomized-Quicksort(A, left, right)

if |A| == 0 then return   //A is empty
end if
split = Randomized-Partition(A, left, right)
Randomized-Quicksort(A, left, split − 1)
Randomized-Quicksort(A, split + 1, right)

Randomized-Partition(A, left, right)

b = random(left, right)
swap(A[b], A[right])
return Partition(A, left, right)

Subroutine random(i, j) returns a random number between i and j
inclusive.
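In Python, Randomized-Partition is a small wrapper around the partition function sketched earlier (random.randint plays the role of the random(i, j) subroutine; both include the endpoints):

import random

def randomized_partition(A, left, right):
    # Move a uniformly random element into position right, then partition as before.
    b = random.randint(left, right)
    A[b], A[right] = A[right], A[b]
    return partition(A, left, right)

def randomized_quicksort(A, left, right):
    if left >= right:
        return
    split = randomized_partition(A, left, right)
    randomized_quicksort(A, left, split - 1)
    randomized_quicksort(A, split + 1, right)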
Discrete random variables

I To analyze the expected running time of a randomized
  algorithm we keep track of certain parameters and their
  expected size over the random choices of the algorithm.

I To this end, we use random variables.

I A discrete random variable X takes on a finite number of
  values, each with some probability. We’re interested in its
  expectation

  E[X] = Σ_j j · Pr[X = j].
Example 1: Bernoulli trial

Experiment 1: flip a biased coin which comes up
I heads with probability p
I tails with probability 1 − p

Question: what is the expected number of heads?

Let X be a random variable such that

X = 1, if the coin flip comes up heads
X = 0, if the coin flip comes up tails

Then
Pr[X = 1] = p
Pr[X = 0] = 1 − p
E[X] = 1 · Pr[X = 1] + 0 · Pr[X = 0] = p
Indicator random variables

I Indicator random variable: a discrete random variable that
  only takes on values 0 and 1.
I Indicator random variables are used to denote occurrence
  (or not) of an event.

Example: in the biased coin flip example, X is an indicator
random variable that denotes the occurrence of heads.

Fact 2.
If X is an indicator random variable, then E[X] = Pr[X = 1].
Example 2: Bernoulli trials

Experiment 2: flip the biased coin n times

Question: what is the expected number of heads?

Answer 1: Let X be the random variable counting the number
of times heads appears.

E[X] = Σ_{j=0}^{n} j · Pr[X = j].

Pr[X = j]?

X follows the binomial distribution B(n, p), thus

Pr[X = j] = (n choose j) · p^j · (1 − p)^{n−j}
Example 2: Bernoulli trials
A different way to think about X:

Answer 2: for 1 ≤ i ≤ n, let X_i be an indicator random
variable such that

X_i = 1, if the i-th coin flip comes up heads
X_i = 0, if the i-th coin flip comes up tails

Define the random variable

X = Σ_{i=1}^{n} X_i

We want E[X]. By Fact 2, E[X_i] = p, for all i.
Linearity of expectation

X = Σ_{i=1}^{n} X_i,  E[X_i] = p,  E[X] = ?

Remark 1: X is a complicated random variable defined as the
sum of simpler random variables whose expectation is known.

Proposition 1 (Linearity of expectation).

Let X_1, . . . , X_k be arbitrary random variables. Then

E[X_1 + X_2 + . . . + X_k] = E[X_1] + E[X_2] + . . . + E[X_k]

Remark 2: We made no assumptions on the random variables.
For example, they do not need to be independent.
Back to example 2: Bernoulli trials

Answer 2: for 1 ≤ i ≤ n, let X_i be an indicator random
variable such that

X_i = 1, if the i-th coin flip comes up heads
X_i = 0, if the i-th coin flip comes up tails

Define the random variable

X = Σ_{i=1}^{n} X_i

By Fact 2, E[X_i] = p, for all i. By linearity of expectation,

E[X] = E[Σ_{i=1}^{n} X_i] = Σ_{i=1}^{n} E[X_i] = Σ_{i=1}^{n} p = np.
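A quick simulation makes the E[X] = np calculation concrete (a sketch; the choice of n = 100 and p = 0.3 is arbitrary):

import random

def count_heads(n, p):
    # One run of n Bernoulli trials: X = X_1 + ... + X_n.
    return sum(1 for _ in range(n) if random.random() < p)

n, p, runs = 100, 0.3, 10_000
avg = sum(count_heads(n, p) for _ in range(runs)) / runs
print(avg, n * p)   # the empirical average should be close to np = 30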
Balls in bins problems

Occupancy problems: find the distribution of balls into bins
when m balls are thrown independently and uniformly at
random into n bins.
I Applications: analysis of randomized algorithms and data
  structures (e.g., hash tables)

Q1: How many balls can we throw before it is more likely than
not that some bin contains at least two balls?

In symbols: find k such that

Pr[∃ bin with ≥ 2 balls after k balls thrown] > 1/2

Easier to analyze the complement of this event

It is easier to think about the probability of the complementary
event.

Q1 (rephrased): Find k such that

Pr[every bin has ≤ 1 ball after k balls thrown] ≤ 1/2


Analysis: one ball at a time

I The 1st ball falls into some bin.
I The 2nd ball falls into a new bin w. prob. 1 − 1/n.
I The 3rd ball falls into a new bin (given that the first two
  balls fell into different bins) w. prob. 1 − 2/n.
I The k-th ball falls into a new bin (given that the first
  k − 1 balls fell into different bins) w. prob. 1 − (k − 1)/n.

By the chain rule of conditional probability, the probability that
all k balls fall into distinct bins is given by

Π_{i=1}^{k−1} (1 − i/n)   (1)
Application: the birthday paradox

Use 1 + x ≤ e^x (valid for all real x, here with x = −i/n) to
upper bound (1):

Π_{i=1}^{k−1} e^{−i/n} = e^{−Σ_{i=1}^{k−1} i/n} = e^{−k(k−1)/(2n)} ≈ e^{−k²/(2n)}   (2)

Requiring e^{−k²/(2n)} < 1/2 yields k > √(2n ln 2) = Ω(√n).

I Application: birthday paradox

  Assumption: For n = 365, each person has an independent
  and uniform at random birthday from among the 365 days
  of the year.
  Once 23 people are in a room, it is more likely than not
  that two of them share a birthday.
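The exact product (1) and the exponential bound (2) are easy to compare numerically (a sketch; math.prod requires Python 3.8+):

import math

def prob_all_distinct(k, n):
    # Exact probability (1) that k balls land in k distinct bins.
    return math.prod(1 - i / n for i in range(1, k))

n = 365
for k in (22, 23):
    exact = prob_all_distinct(k, n)
    bound = math.exp(-k * (k - 1) / (2 * n))
    print(k, exact, bound)
# at k = 23 the exact probability of all-distinct birthdays drops below 1/2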
More balls-in-bins questions

I Q2: What is the expected load of a bin after m balls are
  thrown?

I Q3: What is the expected #empty bins after m balls are
  thrown?

I Q4: What is the load of the fullest bin with high probability?

I Q5: What is the expected number of balls until every bin
  has at least one ball (Coupon Collector’s Problem)?
Expected load of a bin

Suppose that m balls are thrown independently and uniformly
at random into n bins. Fix a bin.
I Let X_i be an indicator r.v. such that X_i = 1 if and only if
  ball i falls in the fixed bin. Then

  E[X_i] = Pr[X_i = 1] = 1/n.

The total #balls in the bin is given by X = Σ_{i=1}^{m} X_i. By
linearity of expectation,

E[X] = Σ_{i=1}^{m} E[X_i] = m/n.

Since bins are symmetric, the expected load of any bin is m/n.
Expected # empty bins

Suppose that m balls are thrown independently and uniformly
at random into n bins. Fix a bin j.
I Let Y_j be an indicator r.v. such that Y_j = 1 if and only if
  bin j is empty.
I Pr[ball i does not fall in bin j] = 1 − 1/n
I Pr[for all i, ball i does not fall in bin j] = (1 − 1/n)^m
I Hence Pr[Y_j = 1] = (1 − 1/n)^m.
The number of empty bins is given by the random variable
Y = Σ_{j=1}^{n} Y_j. By linearity of expectation,

E[Y] = Σ_{j=1}^{n} E[Y_j] = n(1 − 1/n)^m ≈ n·e^{−m/n}
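Both answers (expected load m/n and expected #empty bins ≈ n·e^{−m/n}) can be sanity-checked with a short simulation (a sketch; the values of m and n are arbitrary):

import math, random

def throw_balls(m, n):
    # Throw m balls u.a.r. into n bins; return the list of bin loads.
    loads = [0] * n
    for _ in range(m):
        loads[random.randrange(n)] += 1
    return loads

m, n, runs = 500, 200, 1000
avg_load_bin0 = sum(throw_balls(m, n)[0] for _ in range(runs)) / runs
avg_empty = sum(sum(1 for c in throw_balls(m, n) if c == 0)
                for _ in range(runs)) / runs
print(avg_load_bin0, m / n)                 # ≈ 2.5
print(avg_empty, n * math.exp(-m / n))      # ≈ 200·e^{-2.5} ≈ 16.4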
Maximum load with high probability (case m = n)

Proposition 2.
When throwing n balls into n bins uniformly and independently
at random, the maximum load in any bin is Θ(ln n/ ln ln n) with
probability close to 1 as n grows large.

Two-sentence sketch of the proof.

1. Upper bound the probability that any bin contains at least k
   balls by a union bound:

   Σ_{j=1}^{n} Σ_{ℓ=k}^{n} (n choose ℓ) (1/n)^ℓ (1 − 1/n)^{n−ℓ}.

2. Compute the smallest possible k* such that the probability above
   is less than 1/n; the latter becomes negligible as n grows large.
Expected #balls until no empty bins
Suppose that we throw balls independently and uniformly at random
into n bins, one at a time (the first ball falls at time t = 1).
I We call a throw a success if it lands in an empty bin.
I We call the sequence of balls starting after the (j − 1)-st success
  and ending with the j-th success the j-th epoch.
I To understand when the process terminates, we need to analyze
  the duration of each epoch.
I To this end, let η_j be the #balls thrown in epoch j.
I Clearly the first ball is a success, hence η_1 = 1.
I Let η_2 be the #balls thrown in epoch 2.

  ∀t ∈ epoch 2, Pr[ball t in epoch 2 is a success] = (n − 1)/n

I Similarly, let η_j be the #balls thrown in epoch j.

  ∀t ∈ epoch j, Pr[ball t in epoch j is a success] = (n − j + 1)/n

At the end of the n-th epoch, each of the n bins has at least one ball.
Expected #balls until no empty bins (cont’d)
Let η = Σ_{j=1}^{n} η_j. We want

E[η] = E[Σ_{j=1}^{n} η_j] = Σ_{j=1}^{n} E[η_j]

I Each epoch is geometrically distributed with success
  probability p_j = (n − j + 1)/n.
I Recall that the expectation of a geometrically distributed
  variable with success probability p is given by 1/p.
I Thus E[η_j] = 1/p_j = n/(n − j + 1).

Then

E[η] = Σ_{j=1}^{n} n/(n − j + 1) = n · Σ_{j=1}^{n} 1/j = n(ln n + O(1))
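A simulation of the coupon-collector process (a sketch) tracks the n ln n prediction closely:

import math, random

def balls_until_full(n):
    # Throw balls u.a.r. into n bins until no bin is empty; return #balls thrown.
    filled, throws = set(), 0
    while len(filled) < n:
        filled.add(random.randrange(n))
        throws += 1
    return throws

n, runs = 100, 2000
avg = sum(balls_until_full(n) for _ in range(runs)) / runs
print(avg, n * math.log(n))   # ≈ 100 ln 100 ≈ 460.5, plus an O(n) term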
Probability review

I A sample space Ω consists of the possible outcomes of an
  experiment.
I Each point x in the sample space has an associated
  probability mass p(x) ≥ 0, such that Σ_{x∈Ω} p(x) = 1.
I Example experiment: flip a fair coin;
  Ω = {heads, tails}; Pr[heads] = Pr[tails] = 1/2.
I We define an event E to be any subset of Ω, that is, a
  collection of points in the sample space.
I We define the probability of the event to be the sum of the
  probability masses of all the points in E. That is,

  Pr[E] = Σ_{x∈E} p(x)
