
Randomized Algorithms

• The algorithm can make random decisions

• Why randomized algorithms? They are often simpler and more efficient than their deterministic counterparts
• Examples: symmetry-breaking, graph algorithms, quicksort, hashing, load balancing, cryptography, etc.
Properties of Random Variables

For independent events A and B,


Pr(A and B) = Pr(A) Pr(B)

Union Bound: For any two events A and B,


Pr(A U B) <= Pr(A) + Pr(B)

Conditional Probability: For events A and B,


Pr(A | B) = Pr(A and B) / Pr(B)
Global Min-Cut

Problem: Given undirected, unweighted graph G = (V, E), find a cut (A,V - A)
which has the minimum number of edges across it

Example: [figure: a graph with a set A and its complement V - A, and the edges crossing between them]

Note: A global min-cut is not the same as an s-t min cut


How can we find a global min-cut using n - 1 max-flows?
We can do better for the global min-cut
Karger’s Min-Cut Algorithm

Problem: Given undirected, unweighted graph G = (V, E), find a cut (A,V - A)
which has the minimum number of edges across it

Karger’s Min-Cut Algorithm:


1. Repeat until two nodes remain:
   Pick an edge e = (u, v) in E uniformly at random. Collapse u and v into a single node (allowing multiple parallel edges)
2. Let u, v be the two remaining nodes. Output (U, V - U), where U = {original nodes that were collapsed into u}

Example (one run on an 8-node graph with nodes a, .., h and 14 edges; self-loops created by a contraction are deleted, which is why the edge count can drop by more than one per step):

1. 14 edges to pick from; pick (b, f) (probability 1/14) and contract it into node bf
2. 13 edges to pick from; pick (g, h) (probability 1/13) and contract it into node gh
3. 12 edges to pick from; pick (d, gh) (probability 2/12 = 1/6, two parallel edges) and contract it into node dgh
4. 10 edges to pick from; pick (a, e) (probability 1/10) and contract it into node ae
5. 9 edges to pick from; pick (ae, bf) (probability 4/9, four parallel edges) and contract it into node aebf
6. 5 edges to pick from; pick (c, dgh) (probability 3/5, three parallel edges) and contract it into node cdgh

Two nodes remain. Done! Output (aebf, cdgh), i.e. the cut ({a, b, e, f}, {c, d, g, h}) of the original graph.
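
To make the contraction procedure concrete, here is a minimal Python sketch (the edge-list representation, function names, and the repetition count are illustrative choices, not from the slides):

```python
import random

def karger_once(nodes, edges):
    """One run of Karger's contraction algorithm.
    nodes: iterable of node names; edges: list of (u, v) pairs,
    with parallel edges simply repeated in the list."""
    leader = {x: x for x in nodes}     # super-node currently containing each original node
    group = {x: {x} for x in nodes}    # original nodes inside each super-node
    live = list(edges)                 # surviving (non-self-loop) edges
    remaining = len(group)
    while remaining > 2:
        u, v = random.choice(live)     # pick an edge uniformly at random
        lu, lv = leader[u], leader[v]
        for x in group[lv]:            # contract super-node lv into lu
            leader[x] = lu
        group[lu] |= group.pop(lv)
        # delete self-loops created by the contraction
        live = [(a, b) for (a, b) in live if leader[a] != leader[b]]
        remaining -= 1
    side = next(iter(group.values()))  # one side of the output cut
    return len(live), side

def karger_min_cut(nodes, edges, runs):
    """Repeat karger_once and keep the smallest cut seen; roughly
    n^2 log n runs succeed with high probability (see the analysis below)."""
    return min((karger_once(nodes, edges) for _ in range(runs)),
               key=lambda cut: cut[0])
```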


Karger’s Algorithm: Analysis

Fact 1. If there are n nodes, then the average degree of a node is 2|E|/n
Proof: The total degree over the n nodes is 2|E| (each edge contributes to two degrees)

Fact 2. The minimum cut size is at most 2|E|/n
Proof: From Fact 1, there is at least one node x with degree at most 2|E|/n. The cut (x, V - {x}) has deg(x) edges, so the minimum cut size is at most deg(x) <= 2|E|/n

Fact 3. If we choose an edge uniformly at random (u.a.r.), the probability that it lies across the min cut is at most 2/n
Proof: Follows directly from Fact 2: at most 2|E|/n of the |E| edges cross the min cut

Observe: The bad case is when the algorithm selects an edge e across the min-cut. Every cut of a contracted graph is also a cut of the original graph, so Facts 1-3 keep holding as the algorithm runs: when i nodes remain, the chosen edge crosses the min-cut w.p. at most 2/i

Pr[Output cut = min-cut] >= Pr[1st selected edge not in min-cut] x Pr[2nd selected edge not in min-cut | 1st not in min-cut] x ...
   >= (1 - 2/n) (1 - 2/(n-1)) (1 - 2/(n-2)) ... (1 - 2/3)
   = (n-2)/n · (n-3)/(n-1) · (n-4)/(n-2) ··· 2/4 · 1/3
   = 2 / (n(n-1))

Thus, a single run outputs the min-cut w.p. at least 2/n^2; repeating O(n^2) times and keeping the smallest output cut succeeds with constant probability, and O(n^2 log n) repetitions succeed with high probability
Types of Randomized Algorithms

Monte Carlo Algorithm:


Always has the same running time
Not guaranteed to return the correct answer (returns a
correct answer only with some probability)

Las Vegas Algorithm:


Always guaranteed to return the correct answer
Running time fluctuates (probabilistically)

Fact: Suppose a Monte Carlo algorithm succeeds w.p. p. Then it can be made to succeed w.p. 1 - t, for any (small) t, by running it O(log(1/t)/p) times
Proof: Suppose we run the algorithm k times independently. Then,
Pr[Algorithm is wrong every time] = (1 - p)^k <= e^{-pk} < t when k = O(log(1/t)/p)
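
A sketch of this amplification loop in Python, assuming hypothetical hooks run_once and is_correct for running the algorithm and checking its output:

```python
import math

def amplify(run_once, is_correct, p, t):
    """Run a Monte Carlo algorithm enough times that some run succeeds
    w.p. >= 1 - t. run_once() returns a candidate answer; is_correct()
    verifies it. Both hooks are illustrative assumptions."""
    k = math.ceil(math.log(1 / t) / p)  # (1-p)^k <= e^{-pk} <= t
    for _ in range(k):
        ans = run_once()
        if is_correct(ans):
            return ans
    return None  # all k runs failed: happens w.p. at most t
```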
Expectation

Given a discrete random variable X, which takes m values xi w.p. pi, the expectation E[X] is defined as:
E[X] = Σ_{i=1}^{m} xi · Pr[X = xi] = Σ_{i=1}^{m} xi pi

Examples:

1. Let X = 1 if a fair coin toss comes up heads, 0 otherwise. What is E[X]?

2. We are tossing a coin with head probability p, tail probability 1 - p. Let X = #independent flips until the first head. What is E[X]?
Pr[X = j] = p · (1 - p)^{j-1}    (first j - 1 tails, then a head on toss j)
E[X] = Σ_{j=1}^{∞} j · p (1 - p)^{j-1} = (p/(1-p)) · Σ_{j=1}^{∞} j (1 - p)^j = (p/(1-p)) · (1-p)/p^2 = 1/p
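
As a quick empirical check of E[X] = 1/p (a simulation sketch, not part of the slides):

```python
import random

def geometric_mean_flips(p, trials=100_000):
    """Average the number of flips until the first head; should be ~1/p."""
    total = 0
    for _ in range(trials):
        flips = 1
        while random.random() >= p:  # tail: flip again
            flips += 1
        total += flips
    return total / trials

# e.g. geometric_mean_flips(0.25) is close to 4.0 = 1/p
```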
Expectation

Linearity of Expectation: E[X + Y] = E[X] + E[Y] (this holds even when X and Y are not independent)

Example: Guessing a card


Shuffle n cards, then turn them over one by one; before each card is revealed, guess which card it is. How many guesses are correct in expectation?

Let Xi = 1 if guess i is correct, 0 otherwise (guess i can be any of the n - i + 1 cards not yet seen)

Pr[Xi = 1] = 1/(n - i + 1), so E[Xi] = 1/(n - i + 1)

Expected # of correct guesses = E[ Σ_{i=1}^{n} Xi ] = Σ_{i=1}^{n} E[Xi] = Σ_{i=1}^{n} 1/i = Θ(log n)

What if we insert the revealed card back into the pile at a random position and draw again?
Expectation

Linearity of Expectation: E[X + Y] = E[X] + E[Y]

Example: Coupon Collector’s Problem


Balls tossed randomly into n bins. How many balls in expectation before each bin has a ball?

Let Xj = #balls tossed while there are exactly j non-empty bins (Phase j)
Let X = total #balls = X0 + X1 + ... + X_{n-1}
We move from phase j to phase j+1 when a ball hits one of the n - j empty bins, which happens w.p. (n - j)/n
Therefore, E[Xj] = n/(n - j)    [from the expected waiting time of a geometric variable, previous slide]
E[X] = E[X0] + .. + E[X_{n-1}] = n/n + n/(n-1) + .. + n/1 = n (1 + 1/2 + .. + 1/n) = Θ(n log n)
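
A simulation sketch of the coupon collector process (illustrative, not from the slides):

```python
import random

def coupon_collector_steps(n):
    """Toss balls u.a.r. into n bins until no bin is empty;
    return the number of tosses (expectation is n*H_n = Θ(n log n))."""
    seen = set()
    tosses = 0
    while len(seen) < n:
        seen.add(random.randrange(n))
        tosses += 1
    return tosses

# Averaging coupon_collector_steps(100) over many trials gives ~519 ≈ 100 * H_100
```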
Expectation

Linearity of Expectation: E[X + Y] = E[X] + E[Y]

Example: Birthday Paradox


m balls tossed randomly into n bins. What is the expected #collisions?

For 1 <= i < j <= m, let Xij = 1 if balls i and j land in the same bin, 0 otherwise
Pr[ Xij = 1 ] = 1/n, so E[Xij] = 1/n
So, expected number of collisions from tossing m balls = Σ_{i<j} E[Xij] = (m choose 2) / n = m(m-1) / (2n)

So when m is below about sqrt(2n), the expected #collisions is less than 1; beyond that, it exceeds 1
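
A small simulation sketch to check the m(m-1)/(2n) formula (illustrative; with m = 23 birthdays and n = 365 days it gives about 0.69 expected colliding pairs):

```python
import random
from itertools import combinations

def expected_collisions(m, n, trials=2000):
    """Empirically estimate the expected number of colliding pairs when
    m balls are tossed u.a.r. into n bins; theory says m(m-1)/(2n)."""
    total = 0
    for _ in range(trials):
        bins = [random.randrange(n) for _ in range(m)]
        total += sum(1 for i, j in combinations(range(m), 2)
                     if bins[i] == bins[j])
    return total / trials

# expected_collisions(23, 365) ≈ 23*22/(2*365) ≈ 0.69
```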


Variance
Given a random variable X, its variance Var[X] is defined as: Var(X) = E[(X - E[X])^2]

[figure: two histograms side by side, a High Variance Distribution (spread out) and a Low Variance Distribution (concentrated near the mean)]

Variance of a random variable measures its “spread”


Properties of Variance:

1. Var(X) = E[X^2] - (E[X])^2

2. If X and Y are independent random variables, then Var(X + Y) = Var(X) + Var(Y)

3. For any constants a and b, Var(aX + b) = a^2 Var(X)
Computing Percentiles

An f-th percentile is the value below which f percent of observations fall

[figure: a density P(X = x) with area f to the left of the f-th percentile]

Given array A[1..n], find the k-th smallest element in A

Example: Median = n/2-th smallest element = 50th percentile


How to compute the median in O(n log n) time? (Sort A and read off the middle element.)
Randomized Selection
Given array A[1..n], find the k-th smallest element in A

A Divide and Conquer Algorithm:


Select(A, k)
1. Pick an item v in A    (How to select v? Pick an index uniformly at random in 1..n and let v be that element)
2. Let:
   AL = all elements in A that are < v
   AM = all elements in A that are = v
   AR = all elements in A that are > v
3. Return:
   Select(AL, k) if k <= |AL|
   v if |AL| < k <= |AL| + |AM|
   Select(AR, k - |AL| - |AM|) otherwise

Example:
A= 2 36 5 21 8 13 11 20 5 4 1 v=5

AL = 2 4 1 AM = 5 5 AR = 36 21 8 13 11 20
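
A direct Python transcription of the Select pseudocode (a sketch; splitting with list comprehensions is one simple way to realize step 2):

```python
import random

def select(A, k):
    """Return the k-th smallest element of A (1-indexed) by randomized
    divide and conquer, following the Select pseudocode above."""
    v = A[random.randrange(len(A))]           # pivot chosen u.a.r.
    AL = [x for x in A if x < v]
    AM = [x for x in A if x == v]
    AR = [x for x in A if x > v]
    if k <= len(AL):
        return select(AL, k)                  # answer among the smaller items
    if k <= len(AL) + len(AM):
        return v                              # answer equals the pivot
    return select(AR, k - len(AL) - len(AM))  # answer among the larger items

# select([2, 36, 5, 21, 8, 13, 11, 20, 5, 4, 1], 6) == 8
```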
Randomized Selection

Worst case: v is the smallest or largest element at every step
Time T(n) = O(n) (for splitting) + T(n - 1); solving the recurrence, T(n) = O(n^2)
Pr(Worst Case) = (2/n) · (2/(n-1)) ··· (2/2) = 2^{n-1} / n!

Best case: v is the k-th smallest element on the first pick
Time taken T(n) = O(n) (for splitting) + O(1) = O(n)
Pr(Best Case) >= 1/n
Average case: Let T(n) be the expected running time on an array of size n
Lucky split: v is the m-th smallest element for some n/4 <= m <= 3n/4; Pr[Lucky Split] >= 1/2
T(n) <= Time to split + Pr[Lucky Split] x T(array of size <= 3n/4) + Pr[Unlucky Split] x T(array of size <= n)
     <= n + (1/2) T(3n/4) + (1/2) T(n)
Rearranging, T(n) <= T(3n/4) + 2n; solving (the sizes decay geometrically), T(n) = O(n)
Randomized Sorting
Given array A[1..n], sort A

QuickSort:
Sort(A)
1. Pick an item v in A    (How to select v? Pick an index uniformly at random in 1..n and let v be that element)
2. Let:
   AL = all elements in A that are < v
   AM = all elements in A that are = v
   AR = all elements in A that are > v
3. Return:
   Sort(AL) + AM + Sort(AR)

Best case: [n/2, n/2] split


Running Time T(n) = (time to split) + 2 T(n/2) = n + 2 T(n/2)
Solving, T(n) = O(n log n)

Worst case: [1, n - 1] split


Running Time T(n) = (time to split) + T(1) + T(n - 1) = n + 1 + T(n - 1)
Solving, T(n) = O(n^2)
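
The same random splitting as Select, now recursing on both sides; a minimal Python sketch of the Sort pseudocode above:

```python
import random

def quicksort(A):
    """Randomized QuickSort: split around a uniformly random pivot,
    then recurse on both sides."""
    if len(A) <= 1:
        return A
    v = A[random.randrange(len(A))]
    AL = [x for x in A if x < v]
    AM = [x for x in A if x == v]
    AR = [x for x in A if x > v]
    return quicksort(AL) + AM + quicksort(AR)

# quicksort([2, 36, 5, 21, 8, 13, 11, 20, 5, 4, 1])
# -> [1, 2, 4, 5, 5, 8, 11, 13, 20, 21, 36]
```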
Average case: Let T(n) be the expected running time on an array of size n
T(n) = Time to split + expected time to sort AL and AR
     = n + Σ_{i=1}^{n} Pr[v is the i-th smallest element in A] · ( T(i) + T(n - i) )
     = n + (1/n) Σ_{i=1}^{n} ( T(i) + T(n - i) )

Exercise: Solve the recurrence to get T(n) = O(n log n). Use: Σ_{k=1}^{n-1} k log k <= (1/2) n^2 log n - (1/8) n^2
MAX 3SAT
3-SAT Problem: Given a boolean formula F consisting of:
n variables x1, x2, .., xn
m clauses of size 3, each the OR of three literals (a variable or its negation), e.g. xi V xj V xk or not(xi) V xj V xk
Is there an assignment of true/false values to variables s.t. all clauses are true?

Example:
3 variables: x1, .., x3
Clauses:
x1 V x2 V x3, not(x1) V x2 V x3, x1 V not(x2)V x3, x1Vx2 V not(x3), not(x1) V not(x2) V x3,
not(x1) Vx2 V not(x3), x1 V not(x2) V not(x3), not(x1) V not(x2) V not(x3)
Unsatisfiable!

MAX-3SAT Problem: Given a boolean formula F consisting of:


n variables x1, x2, .., xn
m clauses of size 3 of the form xi V xj V xk or not(xi) V xj V xk
Find an assignment of true/false values to variables to satisfy the most clauses

Example:
For the 8 clauses above, any assignment satisfies exactly 7 of them (each assignment falsifies exactly the one clause whose three literals all come out false)
MAX 3SAT

A Randomized MAX-3SAT Algorithm:


Set each variable to 0/1 independently with probability 1/2 each

Define: Zi = 1 if clause i is satisfied by the assignment, Zi = 0 otherwise
Pr[Zi = 0] = (1/2) · (1/2) · (1/2) = 1/8    (all three literals must come out false)
E[Zi] = Pr[Zi = 1] = 7/8
Let Z = Z1 + Z2 + ... + Zm = #satisfied clauses
E[Z] = E[Z1 + Z2 + ... + Zm] = E[Z1] + E[Z2] + .. + E[Zm] = 7m/8 = E[#satisfied clauses]

How to get a solution with >= 7m/8 satisfied clauses?


Fact: P = Pr[Solution has >= 7m/8 satisfied clauses] >= 1/(8m)
Proof: Let pj = Pr[Solution has exactly j satisfied clauses], and let k = the largest integer < 7m/8. Then
7m/8 = E[Z] = Σ_{j=0}^{k} j pj + Σ_{j=k+1}^{m} j pj <= k + mP
so P >= (7m/8 - k)/m >= 1/(8m), since 7m and 8k are integers and thus 7m/8 - k >= 1/8
Solution: Run the algorithm 8m log(1/t) times independently and output the best assignment found. W.p. at least 1 - t, some run has at least 7m/8 satisfied clauses (this is the Monte Carlo amplification fact with p = 1/(8m)).
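
A short Python sketch of the random-assignment algorithm and its amplification (the clause encoding, with literal i for x_i and -i for not(x_i), is an illustrative choice, not from the slides):

```python
import random

def random_assignment(n):
    """The randomized MAX-3SAT algorithm: set each of the n variables
    to True/False independently with probability 1/2."""
    return {i: random.random() < 0.5 for i in range(1, n + 1)}

def count_satisfied(clauses, assign):
    """Count satisfied clauses; a clause is a tuple of nonzero ints,
    where literal i stands for x_i and -i for not(x_i)."""
    return sum(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses)

def max3sat(clauses, n, runs):
    """Keep the best of `runs` random assignments; with runs ~ 8m log(1/t),
    some run satisfies >= 7m/8 clauses w.p. >= 1 - t."""
    return max((random_assignment(n) for _ in range(runs)),
               key=lambda a: count_satisfied(clauses, a))
```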
Variance
Given a random variable X, its variance Var[X] is defined as: Var(X) = E[(X - E[X])^2]


If X and Y are independent random variables, then Var(X + Y) = Var(X) + Var(Y)

What if X and Y are not independent?

Cov(X, Y) = E(XY) - E(X) E(Y) = E( (X - E[X]) (Y - E[Y]) )

Cov(X, Y) measures how closely X and Y are "correlated" (in a loose sense)
Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)    [for general r.v. X and Y]
What is Cov(X, Y) if X and Y are independent?
Inequality 1: Markov’s Inequality
[figure: a distribution Pr(X = x) with the right tail Pr(X >= a) shaded]

If X is a random variable which takes non-negative values, and a > 0, then

Pr[X >= a] <= E[X] / a

Example: n tosses of an unbiased coin. X = #heads


E[X] = n/2. Let a = 3n/4. By Markov's Inequality, Pr(X >= a) <= (n/2)/(3n/4) = 2/3. But what is it really?

Pr[X >= 3n/4] = ( C(n, 3n/4) + C(n, 3n/4 + 1) + ... + C(n, n) ) · 2^{-n}
             <= n · 2^{-n} · C(n, n/4)        [at most n terms, each at most C(n, 3n/4) = C(n, n/4)]
             <= n · 2^{-n} · (4e)^{n/4} = n · (e/4)^{n/4} < (e/3)^{n/4} for large n

Fact: If n >= k, then (n/k)^k <= C(n, k) <= (ne/k)^k

Summary: Markov’s inequality can be weak, but it only requires E[X] to be finite!
Inequality 2: Chebyshev’s Inequality
[figure: a distribution with both tails Pr(X <= E[X] - a) and Pr(X >= E[X] + a) shaded]

If X is a random variable and a > 0, then

Pr[ |X - E[X]| >= a ] <= Var(X) / a^2

Example: n tosses of an unbiased coin. X = #heads


E[X] = n/2, Var[X] = n/4 (how would you compute this?)
From the last slide, Pr(X >= 3n/4) <= c^{n/4} for some constant c < 1 and large enough n
Let a = n/4; then Pr(X >= 3n/4) <= Pr(|X - n/2| >= n/4). By Chebyshev, Pr(X >= 3n/4) <= (n/4)/(n/4)^2 = 4/n

Summary: Chebyshev’s inequality can also be weak, but only requires finite Var[X], E[X]
Inequality 3: Chernoff Bounds
[figure: a distribution with the two tails Pr(X <= (1-t)E[X]) and Pr(X >= (1+t)E[X]) shaded]

Let X1, .., Xn be independent 0/1 random variables, and X = X1 + .. + Xn. Then, for any t > 0,

Pr( X >= (1+t) E[X] ) <= ( e^t / (1+t)^{(1+t)} )^{E[X]}

Moreover, for t < 1,

Pr( X <= (1-t) E[X] ) <= e^{-t^2 E[X] / 2}

Example: n tosses of an unbiased coin. X = #heads = X1 + ... + Xn, where Xi = 1 if toss i is a head
E[X] = n/2. Pr[X >= 3n/4] = Pr[X >= (1 + 1/2) E[X]], so t = 1/2
Thus, from the Chernoff bound,
Pr(X >= 3n/4) <= ( e^{1/2} · (2/3)^{3/2} )^{n/2} <= (0.9)^{n/2}

Summary: A much stronger bound, but it needs independence!
Chernoff Bounds: Simplified Version
Let X1, .., Xn be independent 0/1 random variables, and X = X1 + .. + Xn. Then, for t < 2e - 1,

Pr( X > (1+t) E[X] ) <= e^{-t^2 E[X] / 4}
Randomized Algorithms

• Contention Resolution
• Some Facts about Random Variables
• Global Minimum Cut Algorithm
• Randomized Selection and Sorting
• Max 3-SAT
• Three Concentration Inequalities
• Hashing and Balls and Bins
Hashing and Balls-n-Bins
Problem: Given a large set S of elements x1, .., xn, store them using O(n) space s.t it is
easy to determine whether a query item q is in S or not

[figure: a table with positions 1..n; the entry at position 2 points to a linked list of all xi s.t. h(xi) = 2]

Popular Data Structure: A Hash table


Algorithm:
1. Pick a completely random function h : U -> {1, . . . , n}
2. Create a table of size n, initialize it to null
3. Store xi in the linked list at position h(xi) of the table
4. For a query q, look at the linked list at location h(q) of the table to see if q is there

What is the query time of the algorithm?


Hashing and Balls-n-Bins


Average Query Time: Suppose q is picked at random s.t. it is equally likely to hash to 1, .., n. What is the expected query time?

Expected Query Time = Σ_{i=1}^{n} Pr[q hashes to location i] · (length of list at T[i])
                    = (1/n) Σ_{i=1}^{n} (length of list at T[i]) = (1/n) · n = 1
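
A sketch of the chained hash table in Python; since a "completely random function" h cannot be stored explicitly, the sketch simulates one by sampling h(x) u.a.r. the first time x is seen (an idealization for illustration):

```python
import random

class ChainedHashTable:
    """Hash table with chaining, as in the slides."""
    def __init__(self, items):
        self.n = len(items)
        self.h = {}                           # lazily sampled random function
        self.table = [[] for _ in range(self.n)]
        for x in items:
            self.table[self._hash(x)].append(x)

    def _hash(self, x):
        if x not in self.h:                   # sample h(x) u.a.r. on first use
            self.h[x] = random.randrange(self.n)
        return self.h[x]

    def contains(self, q):
        # query time = length of the list at position h(q)
        return q in self.table[self._hash(q)]
```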
Hashing and Balls-n-Bins


Worst Case Query Time: For any q, what is the query time? (with high probability over the choice of the hash function)

Equivalent to the following Balls and Bins problem:


Suppose we toss n balls u.a.r into n bins. What is the max #balls in a bin with high probability?
With high probability (w.h.p) = With probability 1 - 1/poly(n)
Balls and Bins, again
Suppose we toss n balls u.a.r into n bins. What is the max load of a bin with high probability?

Some Facts:
1. The expected load of each bin is 1

2. What is the probability that every bin has load exactly 1?

Probability = (# permutations) / (# ways of tossing n balls into n bins) = n! / n^n

3. What is the expected #empty bins?

Pr[Bin i is empty] = (1 - 1/n)^n
E[# empty bins] = n (1 - 1/n)^n = Θ(n)    ( (1 - 1/n)^n lies between 1/4 and 1/e for n >= 2 )
Balls and Bins
Suppose we toss n balls u.a.r into n bins. What is the max load of a bin with high probability?

Let Xi = #balls in bin i

Pr(Xi >= t) <= C(n, t) · (1/n)^t <= (ne/t)^t · (1/n)^t = (e/t)^t    [Fact: if n >= k, (n/k)^k <= C(n, k) <= (ne/k)^k]

We would like (e/t)^t <= 1/n^2 (the w.h.p. condition). Let t = c log n / log log n for a constant c. Then, for large n,

log (t/e)^t = t log t - t = (c log n / log log n) · (log c + log log n - log log log n) - t >= (c/2) log n >= 2 log n, for c >= 4

Therefore, w.p. at most 1/n^2, there are at least t balls in bin i. What is Pr(All bins have <= t balls)?
Applying the Union Bound over all n bins, Pr(All bins have <= t balls) >= 1 - n · (1/n^2) = 1 - 1/n
Balls and Bins
Suppose we toss n balls u.a.r into n bins. What is the max load of a bin with high probability?

Fact: W.p. 1 - 1/n, the maximum load of every bin is at most O(log n / log log n)
Fact: The max loaded bin has at least log n / (3 log log n) balls with probability at least 1 - const/n^{1/3}

Let Xi = #balls in bin i


Pr(Xi >= t) >= C(n, t) · (1/n)^t · (1 - 1/n)^{n-t} >= (n/t)^t · (1/n)^t · (1/e) = 1/(e t^t) >= 1/(e n^{1/3}) for t = log n / (3 log log n)

Let Yi = 1 if bin i has load t or more, 0 otherwise; so Pr(Yi = 1) >= 1/(e n^{1/3})
Let Y = Y1 + Y2 + .. + Yn; then E(Y) >= n^{2/3}/e
Pr(Y = 0) = Pr(No bin has load t or more) <= Pr(|Y - E[Y]| >= E[Y])    Which concentration bound to use?
Using Chebyshev, Pr(|Y - E[Y]| >= E[Y]) <= Var(Y)/E(Y)^2
It remains to bound Var(Y):

Var[Y] = Var[Y1 + .. + Yn] = Σ_i Var(Yi) + Σ_{i≠j} ( E[Yi Yj] - E[Yi] E[Yj] )

Now if i ≠ j, Yi and Yj are negatively correlated, which means that E[Yi Yj] <= E[Yi] E[Yj]

Thus, Var(Y) <= Σ_{i=1}^{n} Var(Yi) <= n · 1 = n, and so Pr(Y = 0) <= Var(Y)/E(Y)^2 <= n e^2 / n^{4/3} = e^2 / n^{1/3}
The Power of Two Choices
Problem: Given a large set S of elements x1, .., xn, store them using O(n) space s.t it is
easy to determine whether a query item q is in S or not


Algorithm:
1. Pick two completely random functions h1 : U -> {1, . . . , n} and h2 : U -> {1, . . . , n}
2. Create a table of size n, initialize it to null
3. Store xi in the linked list at position h1(xi) or h2(xi), whichever list is shorter
4. For a query q, look at the linked lists at locations h1(q) and h2(q) of the table to see if q is there

Equivalent to the following Balls and Bins Problem: Toss n balls into n bins. For each
ball, pick two bins u.a.r and put the ball into the lighter of the two bins.

What is the worst case query time? Answer: O(log log n) (proof not in this class)
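
To see the two-choice effect empirically, here is a small simulation sketch (the function name and parameters are our own; the slides only state the O(log log n) result):

```python
import random

def max_load(n, choices):
    """Toss n balls into n bins; each ball goes to the least loaded of
    `choices` bins picked u.a.r. Returns the maximum load."""
    load = [0] * n
    for _ in range(n):
        candidates = [random.randrange(n) for _ in range(choices)]
        best = min(candidates, key=lambda b: load[b])
        load[best] += 1
    return max(load)

# For large n, max_load(n, 1) grows like Θ(log n / log log n),
# while max_load(n, 2) grows only like O(log log n).
```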
