CLRS Chapter 11 Solutions
Exercise 11.1-1
Suppose that a dynamic set S is represented by a direct-address table T of length m. Describe a procedure that finds the
maximum element of S. What is the worst-case performance of your procedure?
Solution:
We can do a linear search to find the maximum element in S as follows:
Pre-condition: table T is not empty; m ∈ Z+, m ≥ 1.
Post-condition: FCTVAL == maximum value of dynamic set stored in T.
FindMax (T, m)
{
    max = −∞
    for i = 1 to m
    {
        if T[i] != NIL && max < T[i]
            max = T[i]
    }
    return max
}
In the worst case the entire table must be searched, so the procedure takes O(m) time. Since the loop examines every slot,
this bound is tight.
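The linear scan above can be sketched in Python as follows. This is a minimal illustration, not part of the original solution: it assumes the table is a Python list in which `None` plays the role of NIL, and the name `find_max` is hypothetical.

```python
from typing import Optional, Sequence


def find_max(table: Sequence[Optional[int]]) -> Optional[int]:
    """Return the largest stored value, or None if every slot is empty."""
    best = None  # stands in for the initial value of max in the pseudocode
    for value in table:
        # Skip empty slots; otherwise keep the larger of the two values.
        if value is not None and (best is None or value > best):
            best = value
    return best


table = [None, 3, None, 7, 2, None]
print(find_max(table))  # 7
```

Every slot is visited exactly once, which matches the O(m) bound.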
Exercise 11.2-1
Suppose we use a hash function h to hash n distinct keys into an array T of length m. Assuming simple uniform hashing,
what is the expected number of collisions? More precisely, what is the expected cardinality of {{k, l} : k ≠ l and h(k) = h(l)}?
Solution:
For each pair of keys k, l, where k ≠ l, define the indicator random variable X_kl = I{h(k) = h(l)}. Since we assume simple
uniform hashing, Pr{X_kl = 1} = Pr{h(k) = h(l)} = 1/m, and so E[X_kl] = 1/m.
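Now define the random variable Y to be the total number of collisions, so that Y is the sum of X_kl over all unordered pairs
{k, l} with k ≠ l. The expected number of collisions is

$$
\begin{aligned}
E[Y] &= E\Big[\sum_{k \ne l} X_{kl}\Big] && \text{since } Y = \sum_{k \ne l} X_{kl} \\
     &= \sum_{k \ne l} E[X_{kl}] && \text{by linearity of expectation} \\
     &= \sum_{k \ne l} \frac{1}{m} && \text{since } E[X_{kl}] = \frac{1}{m} \\
     &= \binom{n}{2} \frac{1}{m} \\
     &= \frac{n(n-1)}{2} \cdot \frac{1}{m} \\
     &= \frac{n(n-1)}{2m}.
\end{aligned}
$$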
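The closed form n(n − 1)/(2m) can be checked for small cases by brute force. The sketch below is illustrative only: it assumes that simple uniform hashing makes all m^n assignments of keys to slots equally likely, and the function name `expected_collisions` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product


def expected_collisions(n: int, m: int) -> Fraction:
    """Average number of colliding unordered key pairs over all m**n
    equally likely assignments of n keys to m slots."""
    total = 0
    for slots in product(range(m), repeat=n):
        # Count the unordered pairs of keys that landed in the same slot.
        total += sum(1 for a, b in combinations(slots, 2) if a == b)
    return Fraction(total, m ** n)


n, m = 4, 3
print(expected_collisions(n, m), Fraction(n * (n - 1), 2 * m))  # 2 2
```

Exact rational arithmetic is used so that the comparison with n(n − 1)/(2m) does not depend on floating-point rounding.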
Exercise 11.2-2
Demonstrate what happens when we insert the keys 5, 28, 19, 15, 20, 33, 12, 17, 10 into a hash table with collisions resolved
by chaining. Let the table have 9 slots, and let the hash function be h (k) = k mod 9.
Solution: omitted.
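For readers who want to trace the insertions themselves, here is a minimal sketch. It assumes that each new key is placed at the head of its chain, as CLRS's CHAINED-HASH-INSERT does; Python lists stand in for linked lists, and the function name `chained_insert_all` is hypothetical.

```python
def chained_insert_all(keys, m):
    """Insert keys into a table of m chains using h(k) = k mod m."""
    table = [[] for _ in range(m)]
    for k in keys:
        table[k % m].insert(0, k)  # new element goes to the head of the chain
    return table


keys = [5, 28, 19, 15, 20, 33, 12, 17, 10]
for slot, chain in enumerate(chained_insert_all(keys, 9)):
    print(slot, chain)
```

Under the head-insertion assumption, slot 1 ends up holding 10, 19, 28 in that order and slot 6 holds 33, 15.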
Exercise 11.2-3
Professor Marley hypothesizes that he can obtain substantial performance gains by modifying the chaining scheme to keep
each list in sorted order. How does the professor's modification affect the running time for successful searches, unsuccessful
searches, insertions, and deletions?
Solution:
Successful searches: Θ(1 + α), which is identical to the original running time. The element we search for is equally likely
to be any of the elements in the hash table, and the proof of the running time for successful searches is similar to what
we did in the lecture.
Unsuccessful searches: about 1/2 of the original running time, but still Θ(1 + α), if we assume that the key we search for
is equally likely to fall between any two consecutive elements in the hash slot. Once we find a larger value, we can stop
searching, so on average only half of the list is examined. The proof is similar to what we did in the lecture.
Insertions: Θ(1 + α), compared to the original running time of Θ(1). This is because we need to find the right location
instead of the head to insert the element so that the list remains sorted. An insertion therefore behaves like an
unsuccessful search in this case.
Deletions: Θ(1 + α), same as successful searches, assuming the element to delete must first be located by searching for its key.
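The early exit for unsuccessful searches and the linear-time insertion can be sketched as follows. This is only an illustration under stated assumptions: Python lists stand in for the sorted linked lists, the hash function is taken to be k mod m, and the class name `SortedChainTable` is hypothetical.

```python
class SortedChainTable:
    """Chained hash table whose chains are kept in ascending order."""

    def __init__(self, m):
        self.m = m
        self.slots = [[] for _ in range(m)]

    def insert(self, key):
        chain = self.slots[key % self.m]
        pos = 0
        # Walk the chain to find the sorted position (no longer O(1)).
        while pos < len(chain) and chain[pos] < key:
            pos += 1
        chain.insert(pos, key)

    def search(self, key):
        """Return (found, number of elements examined)."""
        examined = 0
        for item in self.slots[key % self.m]:
            examined += 1
            if item == key:
                return True, examined
            if item > key:  # sorted order: key cannot appear further on
                return False, examined
        return False, examined


t = SortedChainTable(9)
for k in [5, 28, 19, 15, 20, 33, 12, 17, 10]:
    t.insert(k)
print(t.search(19))  # (True, 2)
print(t.search(1))   # (False, 1)
print(t.search(37))  # (False, 3)
```

The search for 1 stops after one element because the first key in its chain is already larger, while the search for 37 still has to scan the whole chain.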
Exercise 11.3-3
Consider a version of the division method in which h(k) = k mod m, where m = 2^p − 1 and k is a character string interpreted in
radix 2^p. Show that if we can derive string x from string y by permuting its characters, then x and y hash to the same value.
Give an example of an application in which this property would be undesirable in a hash function.
Solution:
First, we observe that we can generate any permutation by a sequence of interchanges of pairs of characters. One can prove
this property formally, but informally, consider that both heapsort and quicksort work by interchanging pairs of elements and
that they have to be able to produce any permutation of their input array. Thus, it suffices to show that if string x can be
derived from string y by interchanging a single pair of characters, then x and y hash to the same value.
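Let x_i be the ith character in x, and similarly for y_i. For strings of n characters, we can interpret x in radix 2^p as
$\sum_{i=0}^{n-1} x_i 2^{ip}$, and interpret y as $\sum_{i=0}^{n-1} y_i 2^{ip}$. So

$$
h(x) = \Big(\sum_{i=0}^{n-1} x_i 2^{ip}\Big) \bmod (2^p - 1), \qquad
h(y) = \Big(\sum_{i=0}^{n-1} y_i 2^{ip}\Big) \bmod (2^p - 1).
$$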
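Suppose that x and y are identical strings of n characters except that the characters in positions a and b are interchanged:

$$
x_a = y_b \quad \text{and} \quad y_a = x_b. \tag{1}
$$

Without loss of generality, let a > b. We have:

$$
h(x) - h(y) = \Big(\sum_{i=0}^{n-1} x_i 2^{ip}\Big) \bmod (2^p - 1) - \Big(\sum_{i=0}^{n-1} y_i 2^{ip}\Big) \bmod (2^p - 1). \tag{2}
$$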
Since $0 \le h(x), h(y) < 2^p - 1$, we have that $-(2^p - 1) < h(x) - h(y) < 2^p - 1$. If we show that
$(h(x) - h(y)) \bmod (2^p - 1) = 0$, then h(x) = h(y). To prove $(h(x) - h(y)) \bmod (2^p - 1) = 0$, we have:
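$$
\begin{aligned}
(h(x) - h(y)) \bmod (2^p - 1)
&= \Big(\Big(\sum_{i=0}^{n-1} x_i 2^{ip}\Big) \bmod (2^p - 1) - \Big(\sum_{i=0}^{n-1} y_i 2^{ip}\Big) \bmod (2^p - 1)\Big) \bmod (2^p - 1) && \text{by (2)} \\
&= \Big(\sum_{i=0}^{n-1} x_i 2^{ip} - \sum_{i=0}^{n-1} y_i 2^{ip}\Big) \bmod (2^p - 1) && \text{by relation in footnote 1} \\
&= \big((x_a 2^{ap} + x_b 2^{bp}) - (y_a 2^{ap} + y_b 2^{bp})\big) \bmod (2^p - 1) && \text{since } x_i = y_i \text{ for } i \ne a, b \\
&= \big((x_a 2^{ap} + x_b 2^{bp}) - (x_b 2^{ap} + x_a 2^{bp})\big) \bmod (2^p - 1) && \text{by (1)} \\
&= \big((x_a - x_b) 2^{bp} (2^{(a-b)p} - 1)\big) \bmod (2^p - 1) \\
&= 0 && \text{by footnote 2}
\end{aligned}
$$

Footnote 1: Consider the congruence relation (m_1 ∘ m_2) mod n = ((m_1 mod n) ∘ (m_2 mod n)) mod n, where ∘ is +, −, or *.
Footnote 2: Consider the equation $2^{(a-b)p} - 1 = \Big(\sum_{i=0}^{a-b-1} 2^{ip}\Big)(2^p - 1)$, which follows from the sum of
a geometric series. Hence $2^{(a-b)p} - 1$ is a multiple of $2^p - 1$.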
PSU CMPSC 465 Spring 2013
Because we deduced earlier that $(h(x) - h(y)) \bmod (2^p - 1) = ((x_a - x_b) 2^{bp} (2^{(a-b)p} - 1)) \bmod (2^p - 1)$ and have shown
here that $((x_a - x_b) 2^{bp} (2^{(a-b)p} - 1)) \bmod (2^p - 1) = 0$, we can conclude $(h(x) - h(y)) \bmod (2^p - 1) = 0$, and thus
h(x) = h(y). So we have proven that if we can derive string x from string y by permuting its characters, then x and y hash
to the same value.
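The property can also be checked numerically. The sketch below is an illustration rather than part of the proof: it assumes ASCII characters with p = 8 (radix 256, m = 255), and the function name `radix_hash` and the sample words are chosen here only for demonstration.

```python
def radix_hash(s: str, p: int = 8) -> int:
    """Interpret s as a number in radix 2**p and reduce it mod 2**p - 1."""
    value = 0
    for ch in s:
        value = value * (1 << p) + ord(ch)  # append one radix-2**p digit
    return value % ((1 << p) - 1)


words = ["STOP", "TOPS", "SPOT", "POTS"]
print([radix_hash(w) for w in words])  # [71, 71, 71, 71]
```

Because 2^p is congruent to 1 modulo 2^p − 1, the hash reduces to the sum of the character codes modulo 2^p − 1, which does not depend on their order.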
Examples of applications:
A dictionary whose words are stored as ASCII strings is one such example, when each string is interpreted in radix
2^8 = 256 and m = 255. The dictionary might, for instance, contain the words "STOP," "TOPS," "SPOT," and "POTS," all of
which hash to the same slot.
Exercise 11.3-4
Consider a hash table of size m = 1000 and a corresponding hash function h(k) = ⌊m (kA mod 1)⌋ for A = (√5 − 1)/2. Compute
the locations to which the keys 61, 62, 63, 64, and 65 are mapped.
Solution:
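One way to obtain the locations is to evaluate the hash function directly. The sketch below assumes the multiplication-method form h(k) = ⌊m (kA mod 1)⌋ with A = (√5 − 1)/2 and m = 1000, uses ordinary double-precision arithmetic, and the function name `mult_hash` is hypothetical.

```python
import math


def mult_hash(k: int, m: int = 1000) -> int:
    """Multiplication-method hash: floor(m * frac(k * A))."""
    A = (math.sqrt(5) - 1) / 2
    return math.floor(m * ((k * A) % 1))  # (k * A) % 1 is the fractional part


print({k: mult_hash(k) for k in range(61, 66)})
# {61: 700, 62: 318, 63: 936, 64: 554, 65: 172}
```

Consecutive keys land far apart because each step adds about 0.618 to the fractional part before it is scaled by m.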
Exercise 11.4-1
Consider inserting the keys 10, 22, 31, 4, 15, 28, 17, 88, 59 into a hash table of length m = 11 using open addressing with the
auxiliary hash function h'(k) = k. Illustrate the result of inserting these keys using linear probing, using quadratic probing with
c1 = 1 and c2 = 3, and using double hashing with h'(k) = k and h2(k) = 1 + (k mod (m − 1)).
Solution:
Linear Probing
With linear probing, we use the hash function h(k, i) = (h'(k) + i) mod m = (k + i) mod m. Consider hashing each of the
following keys:
1) Hashing 10:
   h(10, 0) = (10 + 0) mod 11 = 10. Thus we have T[10] = 10.
2) Hashing 22:
   h(22, 0) = (22 + 0) mod 11 = 0. Thus we have T[0] = 22.
3) Hashing 31:
   h(31, 0) = (31 + 0) mod 11 = 9. Thus we have T[9] = 31.
4) Hashing 4:
   h(4, 0) = (4 + 0) mod 11 = 4. Thus we have T[4] = 4.
5) Hashing 15:
   h(15, 0) = (15 + 0) mod 11 = 4, collision!
   h(15, 1) = (15 + 1) mod 11 = 5. Thus we have T[5] = 15.
6) Hashing 28:
   h(28, 0) = (28 + 0) mod 11 = 6. Thus we have T[6] = 28.
7) Hashing 17:
   h(17, 0) = (17 + 0) mod 11 = 6, collision!
   h(17, 1) = (17 + 1) mod 11 = 7. Thus we have T[7] = 17.
8) Hashing 88:
   h(88, 0) = (88 + 0) mod 11 = 0, collision!
   h(88, 1) = (88 + 1) mod 11 = 1. Thus we have T[1] = 88.
9) Hashing 59:
   h(59, 0) = (59 + 0) mod 11 = 4, collision!
   h(59, 1) = (59 + 1) mod 11 = 5, collision!
   h(59, 2) = (59 + 2) mod 11 = 6, collision!
   h(59, 3) = (59 + 3) mod 11 = 7, collision!
   h(59, 4) = (59 + 4) mod 11 = 8. Thus we have T[8] = 59.
Resulting table (linear probing):
Slot   Key
0      22
1      88
2      NIL
3      NIL
4      4
5      15
6      28
7      17
8      59
9      31
10     10
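The linear-probing trace can be reproduced with a few lines of code. This is a minimal sketch, assuming h'(k) = k and m = 11 as in the exercise; `None` marks an empty slot, and the function name `linear_probe_insert` is hypothetical.

```python
def linear_probe_insert(keys, m):
    """Insert keys with the probe sequence h(k, i) = (k + i) mod m."""
    table = [None] * m
    for k in keys:
        for i in range(m):
            slot = (k + i) % m
            if table[slot] is None:  # first free slot on the probe sequence
                table[slot] = k
                break
        else:
            raise RuntimeError("hash table overflow")
    return table


keys = [10, 22, 31, 4, 15, 28, 17, 88, 59]
print(linear_probe_insert(keys, 11))
# [22, 88, None, None, 4, 15, 28, 17, 59, 31, 10]
```

The run of occupied slots 4 through 10 shows the primary clustering that linear probing tends to produce.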
Quadratic Probing
With quadratic probing, and c1 = 1, c2 = 3, we use the hash function h(k, i) = (h'(k) + i + 3i^2) mod m = (k + i + 3i^2) mod m.
Consider hashing each of the following keys:
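1) Hashing 10:
   h(10, 0) = (10 + 0 + 0) mod 11 = 10. Thus we have T[10] = 10.
2) Hashing 22:
   h(22, 0) = (22 + 0 + 0) mod 11 = 0. Thus we have T[0] = 22.
3) Hashing 31:
   h(31, 0) = (31 + 0 + 0) mod 11 = 9. Thus we have T[9] = 31.
4) Hashing 4:
   h(4, 0) = (4 + 0 + 0) mod 11 = 4. Thus we have T[4] = 4.
5) Hashing 15:
   h(15, 0) = (15 + 0 + 0) mod 11 = 4, collision!
   h(15, 1) = (15 + 1 + 3) mod 11 = 8. Thus we have T[8] = 15.
6) Hashing 28:
   h(28, 0) = (28 + 0 + 0) mod 11 = 6. Thus we have T[6] = 28.
7) Hashing 17:
   h(17, 0) = (17 + 0 + 0) mod 11 = 6, collision!
   h(17, 1) = (17 + 1 + 3) mod 11 = 10, collision!
   h(17, 2) = (17 + 2 + 12) mod 11 = 9, collision!
   h(17, 3) = (17 + 3 + 27) mod 11 = 3. Thus we have T[3] = 17.
8) Hashing 88:
   h(88, 0) = (88 + 0 + 0) mod 11 = 0, collision!
   h(88, 1) = (88 + 1 + 3) mod 11 = 4, collision!
   h(88, 2) = (88 + 2 + 12) mod 11 = 3, collision!
   h(88, 3) = (88 + 3 + 27) mod 11 = 8, collision!
   h(88, 4) = (88 + 4 + 48) mod 11 = 8, collision!
   h(88, 5) = (88 + 5 + 75) mod 11 = 3, collision!
   h(88, 6) = (88 + 6 + 108) mod 11 = 4, collision!
   h(88, 7) = (88 + 7 + 147) mod 11 = 0, collision!
   h(88, 8) = (88 + 8 + 192) mod 11 = 2. Thus we have T[2] = 88.
9) Hashing 59:
   h(59, 0) = (59 + 0 + 0) mod 11 = 4, collision!
   h(59, 1) = (59 + 1 + 3) mod 11 = 8, collision!
   h(59, 2) = (59 + 2 + 12) mod 11 = 7. Thus we have T[7] = 59.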
Resulting table (quadratic probing):
Slot   Key
0      22
1      NIL
2      88
3      17
4      4
5      NIL
6      28
7      59
8      15
9      31
10     10
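A short sketch can confirm the quadratic-probing placement and show how many probes each key needs. It assumes h'(k) = k, c1 = 1, c2 = 3 and m = 11 as in the exercise; the function name `quadratic_probe_insert` and the probe-count dictionary are additions made here for illustration.

```python
def quadratic_probe_insert(keys, m, c1=1, c2=3):
    """Insert keys with h(k, i) = (k + c1*i + c2*i*i) mod m.

    Returns the table and a dict mapping each key to its probe count."""
    table = [None] * m
    probes = {}
    for k in keys:
        for i in range(m):
            slot = (k + c1 * i + c2 * i * i) % m
            if table[slot] is None:
                table[slot] = k
                probes[k] = i + 1  # probes used, counting the successful one
                break
        else:
            raise RuntimeError("no free slot found on the probe sequence")
    return table, probes


keys = [10, 22, 31, 4, 15, 28, 17, 88, 59]
table, probes = quadratic_probe_insert(keys, 11)
print(table)       # [22, None, 88, 17, 4, None, 28, 59, 15, 31, 10]
print(probes[88])  # 9
```

Key 88 needs nine probes because its quadratic sequence revisits slots 0, 4, 3 and 8 before reaching the free slot 2.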
Double Hashing
With double hashing, we use the hash function:
h(k, i) = (h'(k) + i * h2'(k)) mod m = (k + i(1 + (k mod (m − 1)))) mod m.
Consider hashing each of the following keys:
1) Hashing 10:
   h(10, 0) = (10 + 0) mod 11 = 10. Thus we have T[10] = 10.
2) Hashing 22:
   h(22, 0) = (22 + 0) mod 11 = 0. Thus we have T[0] = 22.
3) Hashing 31:
   h(31, 0) = (31 + 0) mod 11 = 9. Thus we have T[9] = 31.
4) Hashing 4:
   h(4, 0) = (4 + 0) mod 11 = 4. Thus we have T[4] = 4.
5) Hashing 15:
   h(15, 0) = (15 + 0) mod 11 = 4, collision!
   h(15, 1) = (15 + 1 * h2'(15)) mod 11 = 10, collision!
   h(15, 2) = (15 + 2 * h2'(15)) mod 11 = 5. Thus we have T[5] = 15.
6) Hashing 28:
   h(28, 0) = (28 + 0) mod 11 = 6. Thus we have T[6] = 28.
7) Hashing 17:
   h(17, 0) = (17 + 0) mod 11 = 6, collision!
   h(17, 1) = (17 + 1 * h2'(17)) mod 11 = 3. Thus we have T[3] = 17.
8) Hashing 88:
   h(88, 0) = (88 + 0) mod 11 = 0, collision!
   h(88, 1) = (88 + 1 * h2'(88)) mod 11 = 9, collision!
   h(88, 2) = (88 + 2 * h2'(88)) mod 11 = 7. Thus we have T[7] = 88.
9) Hashing 59:
   h(59, 0) = (59 + 0) mod 11 = 4, collision!
   h(59, 1) = (59 + 1 * h2'(59)) mod 11 = 3, collision!
   h(59, 2) = (59 + 2 * h2'(59)) mod 11 = 2. Thus we have T[2] = 59.
Resulting table (double hashing):
Slot   Key
0      22
1      NIL
2      59
3      17
4      4
5      15
6      28
7      88
8      NIL
9      31
10     10
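The double-hashing trace can be verified in the same way. The sketch assumes h'(k) = k, a step size of 1 + (k mod (m − 1)) and m = 11, as in the exercise; the function name `double_hash_insert` is hypothetical.

```python
def double_hash_insert(keys, m):
    """Insert keys with h(k, i) = (k + i * (1 + k mod (m - 1))) mod m."""
    table = [None] * m
    for k in keys:
        step = 1 + (k % (m - 1))  # second hash value, never zero
        for i in range(m):
            slot = (k + i * step) % m
            if table[slot] is None:
                table[slot] = k
                break
        else:
            raise RuntimeError("hash table overflow")
    return table


keys = [10, 22, 31, 4, 15, 28, 17, 88, 59]
print(double_hash_insert(keys, 11))
# [22, None, 59, 17, 4, 15, 28, 88, None, 31, 10]
```

Since m = 11 is prime and the step lies between 1 and 10, each key's probe sequence visits every slot, so an insertion fails only when the table is full.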