Coding Theory, Cryptography and Cryptographic Protocols - Exercises With Solutions
MASARYK UNIVERSITY
FACULTY OF INFORMATICS
BACHELOR THESIS
Zuzana Kuklová
I hereby declare that this thesis is my original work, which I have written on my own. All sources, references and literature used or excerpted during its preparation are properly cited and listed with a complete reference to the source.
Acknowledgement
I would like to thank prof. RNDr. Jozef Gruska, DrSc. and Mgr. Lukáš Boháč for their inspiring comments, which contributed substantially to the completion of this work.
I am grateful to my family for their understanding and support.
Abstract
The main goal of this work is to present detailed solutions to the exercises that were assigned as homework to students of the course Coding, Cryptography and Cryptographic Protocols, given by prof. RNDr. Jozef Gruska, DrSc. in 2006. In this way a handbook of solved exercises in coding theory and cryptography is created. Before each new set of exercises we include the main concepts and results from the corresponding lecture that are needed to solve them.
Keywords
Coding theory, code, linear code, cyclic code, cryptography, cryptosystem, cryptanalysis, secret key cryptography, public key cryptography, digital signature, subliminal channel, elliptic curve, factorization, prime recognition, identification, authentication, bit commitment, zero knowledge proof
Contents
Introduction
1 Basics of Coding Theory
    1.1 Definition of Code
    1.2 Equivalence of Codes
    1.3 Properties of Code
    1.4 Entropy
    1.5 Exercises
2 Linear Codes
    2.1 Definition of Linear Code
    2.2 Equivalence of Linear Codes
    2.3 Dual Code
    2.4 Encoding with Linear Codes
    2.5 Decoding of Linear Codes
    2.6 Hamming Code
    2.7 Properties of Linear Code
    2.8 Exercises
3 Cyclic Codes
    3.1 Definition of Cyclic Code
    3.2 Algebraic Characterization of Cyclic Codes
    3.3 Generator Matrix, Parity Check Matrix and Dual Code
    3.4 Encoding with Cyclic Codes
    3.5 Hamming Code
    3.6 Exercises
4 Secret Key Cryptography
    4.1 Cryptosystem
    4.2 Cryptanalysis
    4.3 Secret Key Cryptosystem
        4.3.1 Caesar Cryptosystem
        4.3.2 Polybius Cryptosystem
        4.3.3 Hill Cryptosystem
        4.3.4 Affine Cryptosystem
        4.3.5 Playfair Cryptosystem
        4.3.6 Vigenere and Autoclave Cryptosystems
        4.3.7 One-Time Pad Cryptosystem
    4.4 Perfect Secret Cryptosystem
    4.5 Exercises
5 Public Key Cryptography
    5.1 Diffie-Hellman Key Exchange
    5.2 Blom's Key Predistribution Protocol
    5.3 Cryptography and Computational Complexity
    5.4 RSA Cryptosystem
    5.5 Rabin-Miller's Prime Recognition
    5.6 Exercises
6 Other Public Key Cryptosystems
    6.1 Rabin Cryptosystem
    6.2 ElGamal Cryptosystem
    6.3 Exercises
7 Digital Signature
    7.1 Digital Signature Scheme
    7.2 Attacks on Digital Signature
    7.3 RSA Signatures
    7.4 ElGamal Signatures
    7.5 Digital Signature Algorithm
    7.6 Ong-Schnorr-Shamir Subliminal Channel Scheme
    7.7 Lamport Signature Scheme
    7.8 Exercises
8 Elliptic Curve Cryptography and Factorization
    8.1 Elliptic Curve
    8.2 Addition of Points
    8.3 Elliptic Curves over a Finite Field
    8.4 Discrete Logarithm Problem for Elliptic Curves
    8.5 Factorization
        8.5.1 Factorization with Elliptic Curves
        8.5.2 Pollard's Rho Method
    8.6 Exercises
9 User Identification, Message Authentication and Secret Sharing
    9.1 User Identification
    9.2 Message Authentication
    9.3 Secret Sharing Scheme
        9.3.1 Shamir's (n, t)-secret sharing scheme
    9.4 Exercises
10 Bit Commitment Protocols and Zero Knowledge Proofs
    10.1 Bit Commitment Protocols
    10.2 Oblivious Transfer Problem
    10.3 Zero Knowledge Proof Protocols
    10.4 3-Colorability of Graphs
    10.5 Exercises
Bibliography
Introduction
The main goal of this work is to present detailed solutions to the exercises that were assigned as homework to students of the course Coding, Cryptography and Cryptographic Protocols, given by prof. RNDr. Jozef Gruska, DrSc. in 2006. The authors of the exercises are Mgr. Lukáš Boháč, RNDr. Jan Bouda, Ph.D., Mgr. Ivan Fialík and Mgr. Josef Šprojcar.
In this way we create a handbook of solved exercises in coding theory and cryptography that could be useful to future students of the course. Before each set of exercises we include the main concepts and results from the corresponding lecture that are needed to solve them. The main source of the solutions presented here is the set of solutions submitted by the students of the course. These were adopted and/or modified to achieve a uniform presentation of the exercises and their solutions.
For some of the exercises we present several solutions, in cases where the submitted solutions used sufficiently different approaches. Some of the solutions are newly created. The authors of the solutions are cited. Solutions with no stated author were created or submitted by me.
Ciphers and codes have been a part of human history since the time of the Egyptian pharaohs. They arose from the need to protect secrets and messages against strangers and enemies. People tried to protect their own secrets as hard as they tried to discover the secrets of others. This competition led to the invention of better and better ciphers and codes that could not be broken so easily. And this is how cryptography has progressed until now: code makers invent new, more sophisticated and more secure ciphers and codes, and code breakers try to crack them. The struggle between the code makers and the code breakers stood in the background of various historical events; it decided battles, revolts and human lives.
Today, encryption, coding and authentication are an inseparable part of our daily life. Therefore, it is very important to know the history of ciphers, how they work and where their weaknesses are. The basics can be learned in the course Coding, Cryptography and Cryptographic Protocols, taught at the Faculty of Informatics every year by prof. RNDr. Jozef Gruska, DrSc.
The bibliography that I used as a source of information for my work, and which can be useful for anyone interested in more detailed information about the problems studied, is listed at the end of the work. It also lists some interesting web pages with more information about these problems, as well as some useful tools for solving the exercises.
Chapter 1
Basics of Coding Theory
Coding theory has developed methods of protecting information against noise. Without coding theory and error-correcting codes, there would be no deep-space pictures, no satellite TV, no CDs, no DVDs and much more.
1. h(x, y) = 0 ⇔ x = y
2. h(x, y) = h(y, x)
h(C), the minimum distance of a code C, is the smallest number of symbols that must be changed to turn one codeword into another. A code C can detect up to s errors if h(C) ≥ s + 1. A code C can correct up to t errors if h(C) ≥ 2t + 1.
An (n, M, d)-code C is a code such that n is the length of codewords, M is the number of codewords and d is the minimum distance of C. A good (n, M, d)-code has small n and large M and d.
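As a small illustration of these definitions, the following Python sketch computes h(C) by brute force and derives how many errors a code can detect and correct. The example code (the binary repetition code of length 5) and the function names are assumptions made for illustration only.

```python
from itertools import combinations

def hamming(u, v):
    """Number of positions in which two equal-length words differ."""
    return sum(a != b for a, b in zip(u, v))

def min_distance(code):
    """h(C): the smallest distance between two distinct codewords."""
    return min(hamming(u, v) for u, v in combinations(code, 2))

# Illustrative code (an assumption): the binary repetition code of length 5,
# which is a (5, 2, 5)-code.
C = ["00000", "11111"]
d = min_distance(C)
detects = d - 1           # largest s with h(C) >= s + 1
corrects = (d - 1) // 2   # largest t with h(C) >= 2t + 1
print(d, detects, corrects)
```

For this code the sketch reports minimum distance 5, so up to 4 errors can be detected and up to 2 corrected.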
The main coding problem is to optimize one of the parameters n, M, d for given values of the other two. A_q(n, d) is the largest M such that there is a q-ary (n, M, d)-code. It holds that
1. A_q(n, 1) = q^n
2. A_q(n, n) = q
Two q-ary codes are equivalent if one can be obtained from the other by a combination of
following operations:
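F_q^n is the set of all words of length n over the alphabet {0, 1, . . . , q − 1}. For any word u ∈ F_q^n and any integer r ≥ 0 the sphere of radius r and center u is defined as
\[ S(u, r) = \{ v \in F_q^n \mid h(u, v) \le r \}. \]
A sphere of radius r in F_q^n, 0 ≤ r ≤ n, contains
\[ \sum_{i=0}^{r} \binom{n}{i} (q-1)^i \]
words.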
The sphere packing bound: If C is a q-ary (n, M, 2t + 1)-code, then
\[ M \cdot \sum_{i=0}^{t} \binom{n}{i} (q-1)^i \le q^n. \tag{1.1} \]
A code which achieves the sphere packing bound (a code that satisfies the equality) is called
a perfect code.
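The bound (1.1) is easy to evaluate numerically. The sketch below is only an illustration: the helper names are hypothetical, and t is taken as ⌊(d − 1)/2⌋ so that the functions also accept even d, which is an assumption beyond the statement above.

```python
from math import comb

def sphere_size(n, r, q=2):
    """Number of words within distance r of a fixed word of length n."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(r + 1))

def satisfies_sphere_packing(n, M, d, q=2):
    """Check M * |sphere of radius t| <= q^n with t = (d - 1) // 2."""
    t = (d - 1) // 2
    return M * sphere_size(n, t, q) <= q ** n

def is_perfect(n, M, d, q=2):
    """A code is perfect when the bound holds with equality."""
    t = (d - 1) // 2
    return M * sphere_size(n, t, q) == q ** n

print(is_perfect(7, 16, 3))   # parameters of the binary Hamming code Ham(3, 2)
print(is_perfect(5, 2, 5))    # binary repetition code of length 5
print(is_perfect(5, 4, 3))    # satisfies the bound, but not with equality
```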
Singleton's bound: If C is a q-ary (n, M, d)-code, then
\[ M \le q^{n-d+1}. \tag{1.2} \]
and therefore
\[ A_q(n, d) \ge \frac{q^n}{\sum_{j=0}^{d-1} \binom{n}{j} (q-1)^j}. \tag{1.3} \]
1.4 Entropy
Let X be a random variable (source) which takes a value x with probability p(x). The entropy of X is defined by
\[ S(X) = -\sum_{x} p(x) \lg p(x) \tag{1.4} \]
and it is considered to be the information content of X. Shannon's noiseless coding theorem says that in order to transmit n values of X we need to use nS(X) bits. More precisely, we cannot do better than nS(X) bits, and a good encoding should come as close to this bound as possible.
1.5 Exercises
Exercise 1.1
Determine A_q(n, d) and write or describe the corresponding code that achieves the upper bound.
Solution 1.1.1
1. A_2(8, d)
(a) d = 1. A_2(8, 1) = 2^8. Code C contains all binary words of length eight.
(b) d = 2. A_2(8, 2) = A_2(7, 1) = 2^7. Code C contains all binary words of length seven with a parity bit added.
2. A_2(n, 4)
3. A_q(4, 3)
Exercise 1.2
Let q > 1. What is the relation (≤, ≥ or =) between
Solution 1.2.1
1. A_q(2n, d) ≥ A_q(n, d)
Let A_q(2n, d) = M_1 and A_q(n, d) = M_2. We need to show that M_1 ≥ M_2 holds for all q, n and d. The Singleton bound (1.2) gives
M_1 ≤ q^{2n−d+1},
M_2 ≤ q^{n−d+1},
so the upper bound on M_1 is q^n times larger than the upper bound on M_2. Upper bounds alone do not prove the inequality, so we argue directly: take a q-ary (n, M_2, d)-code and append n zeros to each codeword. The result is a q-ary (2n, M_2, d)-code, because appending the same symbols to all codewords changes neither their number nor their mutual distances. Hence A_q(2n, d) ≥ M_2 = A_q(n, d).
We can see that if we have two codes of different lengths with the same minimum distance, the longer code can contain at least as many codewords as the shorter one.
2. A_q(n, d) and A_q(n + 2, 2d) are incomparable, as we can see in the following examples:
if q = 2, n = 2 and d = 1, then A_2(2, 1) = 2^2 < A_2(4, 2) = A_2(3, 1) = 2^3;
if n = 2 and d = 2, then A_q(2, 2) = q = A_q(4, 4);
if q = 2, n = 4 and d = 2, then A_2(4, 2) = A_2(3, 1) = 2^3 > A_2(6, 4) = A_2(5, 3) = 4.
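For such tiny parameters the values of A_2(n, d) used in these examples can be checked by exhaustive search. The following sketch is a brute force under stated assumptions, not a practical method: max_code_size is a hypothetical helper that tries all sets of words with pairwise distance at least d (with d ≥ 1), which is feasible only for very small n.

```python
from itertools import product

def hamming(u, v):
    """Number of positions in which two equal-length words differ."""
    return sum(a != b for a, b in zip(u, v))

def max_code_size(n, d, q=2):
    """Largest number of q-ary words of length n with pairwise distance >= d."""
    words = list(product(range(q), repeat=n))
    best = 0

    def extend(size, candidates):
        nonlocal best
        best = max(best, size)
        for i, w in enumerate(candidates):
            # Even taking every remaining candidate cannot beat the best so far.
            if size + len(candidates) - i <= best:
                break
            rest = [v for v in candidates[i + 1:] if hamming(w, v) >= d]
            extend(size + 1, rest)

    extend(0, words)
    return best

for n, d in [(2, 1), (4, 2), (3, 1), (6, 4), (5, 3)]:
    print(n, d, max_code_size(n, d))
```

The search confirms the three comparisons above for q = 2.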
Exercise 1.3
Consider the binary erasure channel which has two inputs (0 or 1) and three outputs (0, 1
or ?). The symbol is correctly received with probability 1 − p and erased with probability p.
Erasure is indicated by receiving the symbol ’?’.
Calculate the probability that the received word is decoded incorrectly and the proba-
bility of error detection.
2. Consider a code C with the minimum distance h(C) = d. How many erasures can the
code C detect and correct?
3. Consider a binary channel that has both erasures and errors. Give the lower bound for
the minimum Hamming distance for a code capable of correcting all combinations of
e erasures and t errors.
Solution 1.3.1
1. The received word is decoded incorrectly only if it contains two or more question marks. The probability of an erroneous decoding is:
\[ 3p^2(1-p) + p^3 = 3p^2 - 2p^3 \]
We can detect every erasure because the question mark is not an element of the code alphabet, and we can correct one erasure in the codeword. So the probability that the received word is decoded correctly is:
\[ (1-p)^3 + 3p(1-p)^2 = 1 - 3p + 3p^2 - p^3 + 3p - 6p^2 + 3p^3 = 1 - 3p^2 + 2p^3 = 1 - (3p^2 - 2p^3) \]
2. Code C can detect every erasure, because in this case we receive a symbol that cannot be sent. Code C can correct up to d − 1 erasures in a codeword. When receiving a word with e ≤ d − 1 erased symbols, we delete the positions where we received a question mark from all codewords of C. This way we get a new code C′. The length of the codewords decreases from n to n − e and the minimum distance decreases to at least d − e ≥ 1. That means that the Hamming distance of any two distinct words x, y of the code C′ is still h(x, y) ≥ 1, so we can decode the received word correctly.
3. The minimum distance of a code C capable of correcting all combinations of e erasures and t errors must satisfy d(C) ≥ 2t + e + 1. When at most e symbols are erased, we transform the code C to the code C′ in the same way as described above, and d(C′) ≥ 2t + 1. By the condition h(C) ≥ 2t + 1 for error correction, the code C′ can correct up to t errors in the codeword.
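The erasure-correction argument can be sketched in a few lines of Python. The code below handles erasures only (no errors), and both the example code (the binary even-weight code of length 3, with minimum distance 2) and the function name are assumptions chosen for illustration.

```python
def decode_erasures(received, code):
    """Return the unique codeword that agrees with `received` on all
    non-erased positions, or None if the match is not unique."""
    matches = [c for c in code
               if all(r == "?" or r == s for r, s in zip(received, c))]
    return matches[0] if len(matches) == 1 else None

# Even-weight code of length 3: minimum distance d = 2, so d - 1 = 1 erasure
# can always be corrected.
C = ["000", "011", "101", "110"]
print(decode_erasures("0?1", C))   # one erasure: a unique codeword remains
print(decode_erasures("??1", C))   # two erasures: two codewords still match
```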
Exercise 1.4
You are given two six-sided dice. Design a binary Huffman code for encoding the sum of the two dice. Compare the efficiency of the proposed code with Shannon's entropy.
Solution 1.4.1
All possible sums of two dice and their probabilities are listed in Table 1.1. Figure 1.1 shows how to design a Huffman code for the given data. (For short, 1 is written there instead of 1/36, and so on.)
Table 1.2 lists the possible values and their codewords. We can see that it is a prefix code.
We calculate Shannon's entropy (1.4) as follows:
\[ S(X) = -\sum_{x} p(x) \cdot \lg p(x) \approx 3.2744 \]
By Shannon's theorem, we need 3.2744 bits on average per message. Now, we calculate the efficiency E of our code:
\[ E = \sum_{x} p(x)\,|code(x)| = 3 \cdot \frac{3+4+5+6+5+4}{36} + 4 \cdot \frac{2+3+2}{36} + 5 \cdot \frac{1+1}{36} \approx 3.3056 \]
Using our code, we need about 0.03 bits per message (sum of two dice) more.
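The two numbers can be reproduced with a short Python sketch. It builds a binary Huffman code with the standard library heap; the individual codeword lengths may differ from Table 1.2 because ties can be broken differently, but every Huffman code has the same average length. The function name and the tie-breaking rule are assumptions of this sketch.

```python
import heapq
from fractions import Fraction
from math import log2

# Distribution of the sum of two fair six-sided dice (as in Table 1.1).
p = {s: Fraction(6 - abs(s - 7), 36) for s in range(2, 13)}

def huffman_lengths(prob):
    """Codeword length of each symbol in a binary Huffman code."""
    # Heap items: (subtree probability, tie-breaker, symbols in the subtree).
    heap = [(w, i, [sym]) for i, (sym, w) in enumerate(prob.items())]
    heapq.heapify(heap)
    length = dict.fromkeys(prob, 0)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for sym in s1 + s2:
            length[sym] += 1      # each merge adds one bit to these symbols
        heapq.heappush(heap, (w1 + w2, tie, s1 + s2))
        tie += 1
    return length

lengths = huffman_lengths(p)
average = sum(p[s] * lengths[s] for s in p)              # exact fraction
entropy = -sum(float(w) * log2(float(w)) for w in p.values())
print(float(average), entropy)
```

The average length comes out as 119/36 ≈ 3.3056 bits, against an entropy of about 3.2744 bits.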
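Table 1.1: Sums of two dice and their probabilities
x     p(x)
2     1/36
3     2/36
4     3/36
5     4/36
6     5/36
7     6/36
8     5/36
9     4/36
10    3/36
11    2/36
12    1/36

Table 1.2: Sums of two dice and their codewords
Σ     code
2     00101
3     1000
4     000
5     011
6     110
7     111
8     101
9     010
10    1001
11    0011
12    00100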
Exercise 1.5
You have found a belt with the ornament displayed in Figure 1.2. It seems that the ornament is related to coding theory. Decode the hidden message.
Solution 1.5.1
NRZI (Non-Return-to-Zero, Inverted) signal encoding was used. A change of the level encodes 1, staying on the same level encodes 0. The bit string encodes the message CODE NRZI (in 8-bit ASCII code), as you can see in Figure 1.3.
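The decoding rule can be sketched in Python. Since the signal levels of the ornament are only given in the figure, the sketch below generates its own levels from the stated message and decodes them again; the starting level 0 and all function names are assumptions made for illustration.

```python
def nrzi_encode(bits, initial=0):
    """A 1 toggles the signal level, a 0 keeps it."""
    levels, level = [], initial
    for b in bits:
        if b:
            level ^= 1
        levels.append(level)
    return levels

def nrzi_decode(levels, initial=0):
    """A change of level decodes to 1, no change decodes to 0."""
    bits, previous = [], initial
    for level in levels:
        bits.append(1 if level != previous else 0)
        previous = level
    return bits

def bits_to_ascii(bits):
    """Interpret the bit list as consecutive 8-bit ASCII characters."""
    groups = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    return "".join(chr(int("".join(map(str, g)), 2)) for g in groups)

message = "CODE NRZI"
bits = [int(b) for ch in message for b in format(ord(ch), "08b")]
levels = nrzi_encode(bits)
print(bits_to_ascii(nrzi_decode(levels)))
```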
Exercise 1.6
A single character was encoded into the following long message. Decode.
012221102011200210110121222012001211122201
Solution 1.6.1
The message is 42 symbols long and only a single character should be hidden in it. Therefore it is very likely a graphic cipher. Our task is to arrange the message into a table and look for the hidden letter. The character is formed by the twos when we put the message into a table with six rows and seven columns. And here we can see the letter G:
0 1 2 2 2 1 1
0 2 0 1 1 2 0
0 2 1 0 1 1 0
1 2 1 2 2 2 0
1 2 0 0 1 2 1
1 1 2 2 2 0 1
. . 2 2 2 . .
. 2 . . . 2 .
. 2 . . . . .
. 2 . 2 2 2 .
. 2 . . . 2 .
. . 2 2 2 . .
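The arrangement into a table can be reproduced with a few lines of Python. This is only a sketch of the approach described in the solution; the row width 7 is taken from the solution and the variable names are arbitrary.

```python
message = "012221102011200210110121222012001211122201"

# Cut the message into six rows of seven symbols each.
rows = [message[i:i + 7] for i in range(0, len(message), 7)]

# Keep only the twos and blank out everything else.
picture = ["".join("2" if s == "2" else " " for s in row) for row in rows]
print("\n".join(picture))
```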
Chapter 2
Linear Codes
Linear codes are important because they have a very concise description, very nice properties, very easy encoding and, in principle, quite easy decoding.
Linear codes are special sets of words of length n over an alphabet {0, . . . , q − 1}, where q is a prime power.
A subset C ⊆ V(n, q) is called a linear code if
1. for all u, v ∈ C: u + v ∈ C;
2. for all u ∈ C and a ∈ GF(q): au ∈ C;
where GF(q) is the Galois field, which for a prime q is the set {0, . . . , q − 1} with the operations + and · taken modulo q.
We can also say that a subset C ⊆ V(n, q) is a linear code if one of the following conditions is satisfied:
Two linear codes over GF (q) are equivalent if one can be obtained from the other by the
following operations:
1. permutation of the positions of the code;
2. multiplication of symbols appearing in a fixed position by a nonzero scalar.
Two k × n matrices generate equivalent linear [n, k]-codes over GF(q) if one matrix can be obtained from the other by a sequence of the following operations:
1. permutation of the rows;
2. multiplication of a row by a nonzero scalar;
3. addition of one row to another;
4. permutation of columns;
5. multiplication of a column by a nonzero scalar.
Suppose that C is a linear [n, k]-code over GF(q) and a ∈ V(n, q). The set
a + C = {a + x | x ∈ C}
is called a coset of C.
The decoder will fail to detect transmission errors if the received word y is a codeword different from the sent codeword x. Let C be a binary [n, k]-code and A_i be the number of codewords of C of weight i. If each bit is flipped independently with probability p, the probability P_undetected(C) that an incorrect codeword is received is
\[ P_{undetected}(C) = \sum_{i=1}^{n} A_i p^i (1-p)^{n-i}. \]
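A minimal Python sketch of this formula follows. It assumes a binary linear code given as a list of bit strings and an independent bit-flip probability p; the zero codeword (weight 0) is skipped because it corresponds to error-free transmission. The function name and the example code are assumptions.

```python
from collections import Counter

def undetected_error_probability(code, p):
    """Sum of A_i * p^i * (1 - p)^(n - i) over the nonzero weights i."""
    n = len(code[0])
    weights = Counter(word.count("1") for word in code)   # A_i for each i
    return sum(a * p**i * (1 - p)**(n - i)
               for i, a in weights.items() if i > 0)

# Example (an assumption): the even-weight code of length 3 has A_2 = 3,
# so the probability is 3 * p^2 * (1 - p).
C = ["000", "011", "101", "110"]
print(undetected_error_probability(C, 0.1))
```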
If H is a parity check matrix of a linear [n, k]-code C, then S(y) is called the syndrome of y, for each y ∈ V(n, q). The syndrome can be calculated as follows:
S(y) = yH^T. (2.1)
Two words have the same syndrome if and only if they are in the same coset.
Syndrome decoding: When a word y is received, compute S(y), locate the coset leader l with the same syndrome and decode y as y − l.
An important family of simple linear codes are Hamming codes. Let r be an integer and H be an r × (2^r − 1) matrix whose columns are the distinct nonzero words from V(r, 2). The code having H as its parity check matrix is called the binary Hamming code and is denoted by Ham(r, 2).
The Hamming code Ham(r, 2) is a linear [2^r − 1, 2^r − 1 − r]-code, it has minimum distance 3 and it is a perfect code. Coset leaders are the words of weight less than or equal to 1. The syndrome of the word z with a one at the ith position and zeroes elsewhere is the transpose of the ith column of the matrix H.
Decoding of Hamming codes, for the case that the columns of H are arranged in the order of the increasing binary numbers they represent: when a word y is received, compute S(y). If S(y) = 0, then y is assumed to be the codeword sent. If S(y) ≠ 0, then, assuming a single error, S(y) gives the binary position of the error.
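The decoding procedure can be sketched in Python as follows. The sketch assumes that the ith column of H is the binary representation of i with the most significant bit in the top row; the function names and the example codeword are illustrative assumptions.

```python
def hamming_parity_check(r):
    """r x (2^r - 1) matrix whose j-th column is j written in binary."""
    n = 2**r - 1
    return [[(j >> (r - 1 - i)) & 1 for j in range(1, n + 1)]
            for i in range(r)]

def syndrome(y, H):
    """S(y) = y H^T over GF(2)."""
    return [sum(a * b for a, b in zip(y, row)) % 2 for row in H]

def decode(y, H):
    """Correct at most one error: the syndrome read as a binary number
    is the (1-based) position of the flipped bit, 0 means no error."""
    position = int("".join(map(str, syndrome(y, H))), 2)
    corrected = list(y)
    if position:
        corrected[position - 1] ^= 1
    return corrected

H = hamming_parity_check(3)
codeword = [1, 1, 1, 0, 0, 0, 0]          # has syndrome 000
received = [1, 1, 1, 0, 1, 0, 0]          # bit 5 flipped
print(syndrome(received, H), decode(received, H))
```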
n − k ≥ d − 1.
2.8 Exercises
Exercise 2.1
Decide which of the following codes is linear. Find a generator matrix in standard form for
linear codes.
2. 6-ary code C2 = {201, 202, 231, 402, 403, 432, 003, 004, 033, 204, 205, 234,
405, 400, 435, 000, 001, 030, 404, 005, 200, 401, 002, 203, 433, 034, 235, 430,
031, 232, 035, 230, 431, 032, 233, 434}
3. Ternary code C3 = {000, 201, 111, 021, 012, 120, 102, 222, 210}
Solution 2.1.1
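1. The 5-ary code C1 = {21234, 42413, 13142, 34321, 00000} is a linear code over GF(5) because for each u, v ∈ C1: u + v ∈ C1 and for each a ∈ GF(5), u ∈ C1: au ∈ C1. Writing the codewords as rows of a matrix and reducing it gives the generator matrix G:
\[ \begin{pmatrix} 2 & 1 & 2 & 3 & 4 \\ 4 & 2 & 4 & 1 & 3 \\ 1 & 3 & 1 & 4 & 2 \\ 3 & 4 & 3 & 2 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \sim \begin{pmatrix} 1 & 3 & 1 & 4 & 2 \end{pmatrix} = G \]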
3. Ternary code C3 = {000, 201, 111, 021, 012, 120, 102, 222, 210} is linear code over GF (3)
because for each u, v ∈ C3 : u + v ∈ C3 and for each a ∈ GF (3), u ∈ C3 : au ∈ C3 . The
Exercise 2.2
Let C be a binary code of length 6 such that for every x1 x2 x3 ∈ {0, 1}^3: x1 x2 x3 x4 x5 x6 ∈ C if and only if x4 = x1 + x2, x5 = x2 + x3 and x6 = x1 + x2 + x3. Show that C is a linear code. Find a generator matrix and a parity check matrix for C.
Consider the binary code C of length 6 such that for every x1 x2 x3 ∈ {0, 1}^3: x1 x2 x3 x4 x5 x6 ∈ C ⇐⇒ x4 = x1 + x2, x5 = x2 + x3, x6 = x1 + x2 + x3. The following table shows all the codewords of the code C:
x1 x2 x3 x1 x2 x3 x4 x5 x6
0 0 0 000000
0 0 1 001011
0 1 0 010111
0 1 1 011100
1 0 0 100101
1 0 1 101110
1 1 0 110010
1 1 1 111001
z4 = x4 + y4 = x1 + x2 + y1 + y2 = z1 + z2
z5 = x5 + y5 = x2 + x3 + y2 + y3 = z2 + z3
z6 = x6 + y6 = x1 + x2 + x3 + y1 + y2 + y3 = z1 + z2 + z3
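Code C consists of 8 codewords, thus its dimension must be 3. Writing the codewords as rows of a matrix and reducing it gives the generator matrix for code C:
\[ \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 0 & 1 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 \end{pmatrix} = G. \]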
The parity check matrix for code C is:
\[ H = \begin{pmatrix} 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 0 & 1 \end{pmatrix}. \]
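The two matrices can be checked mechanically. The following Python sketch regenerates the code from G, compares it with the defining equations of the exercise and verifies that every codeword has zero syndrome with respect to H. The helper names are assumptions of this sketch.

```python
from itertools import product

G = [[1, 0, 0, 1, 0, 1],
     [0, 1, 0, 1, 1, 1],
     [0, 0, 1, 0, 1, 1]]
H = [[1, 1, 0, 1, 0, 0],
     [0, 1, 1, 0, 1, 0],
     [1, 1, 1, 0, 0, 1]]

def encode(m, G):
    """Codeword mG over GF(2)."""
    return tuple(sum(mi * g for mi, g in zip(m, col)) % 2 for col in zip(*G))

def syndrome(y, H):
    """S(y) = y H^T over GF(2)."""
    return tuple(sum(a * b for a, b in zip(y, row)) % 2 for row in H)

generated = {encode(m, G) for m in product((0, 1), repeat=3)}
by_definition = {(x1, x2, x3, (x1 + x2) % 2, (x2 + x3) % 2, (x1 + x2 + x3) % 2)
                 for x1, x2, x3 in product((0, 1), repeat=3)}

print(generated == by_definition)
print(all(syndrome(c, H) == (0, 0, 0) for c in generated))
```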
Exercise 2.3
Find examples of a linear self-dual code of length 3 and 4. If such code does not exist, prove
it.
Solution 2.3.1
There is no self-dual code C of length 3 because C must be a [3, k]-code where k ∈ {1, 2, 3}.
Code C ⊥ must be a [3, 3 − k]-code. But there is no k such that k = 3 − k.
The code C = {0000, 1010, 0101, 1111} is a self-dual code. The generator matrix for code C is:
\[ G = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix}. \]
The parity check matrix H for code C is equal to matrix G. Because the generator matrix G⊥
for the dual code C ⊥ is the parity check matrix for code C, we have G = H = G⊥ . So we can
see that the code C is self-dual.
Exercise 2.4
Find a generator matrix and a parity check matrix for ISBN code.
Solution 2.4.1
The ISBN code is not a linear code unless we allow all positions of the code to have a value from Z_11; strictly, only the last digit can be X.
The ISBN code is an 11-ary code of length 10. We use it to encode messages of length 9, so its dimension must be 9. Basically, encoding is the process of calculating the 10th position of the given message so that the following congruence is fulfilled:
\[ \sum_{i=1}^{10} i \cdot x_i \equiv 0 \pmod{11} \]
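We can see that the generator matrix for the ISBN code is:
\[ G = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 2 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 4 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 5 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 6 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 7 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 8 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 9 \end{pmatrix}. \]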
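Multiplying a message by this matrix means that the tenth symbol is the weighted sum of the first nine symbols modulo 11. The following Python sketch illustrates this; the function names are assumptions, and the two sample numbers are used only as test data.

```python
def isbn_check_symbol(first_nine):
    """Tenth symbol x_10 = sum of i * x_i for i = 1..9, modulo 11."""
    total = sum(i * int(ch) for i, ch in enumerate(first_nine, start=1)) % 11
    return "X" if total == 10 else str(total)

def is_valid_isbn(isbn):
    """Check that the sum of i * x_i for i = 1..10 is divisible by 11."""
    values = [10 if ch == "X" else int(ch) for ch in isbn]
    return (len(values) == 10
            and sum(i * v for i, v in enumerate(values, start=1)) % 11 == 0)

print(isbn_check_symbol("030640615"), is_valid_isbn("0306406152"))
print(isbn_check_symbol("080442957"), is_valid_isbn("080442957X"))
```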
Exercise 2.5
Solution 2.5.1
According to the Corollary "If C is a linear code, then C has minimum weight d if d is the largest number such that every d − 1 columns of any parity check matrix of C are independent", we can see that the minimum distance d of H_r is 3. Because the columns of the parity check matrix of H_r consist of all distinct nonzero words from V(r, 2), every two columns are independent. For the words 01...1, 10...0 and 1...1 of length r, the sum of the first and the second word is the third word. That means that not every 3 columns are independent, and therefore the largest d is 3.
The parity check matrix H for H_r is an r × (2^r − 1) matrix, hence the generator matrix for H_r is a (2^r − 1 − r) × (2^r − 1) matrix. That means that H_r is a [2^r − 1, 2^r − 1 − r]-code. Since the dimension of the code H_r is 2^r − 1 − r, the number of codewords is 2^(2^r − 1 − r). We can say that H_r is a (2^r − 1, 2^(2^r − 1 − r), 3)-code.
We know that a code is perfect if it achieves the sphere packing bound (1.1). That means that the following equality must be satisfied:
\[ 2^{2^r-1-r} \sum_{i=0}^{1} \binom{2^r-1}{i} (2-1)^i = 2^{2^r-1} \]
And we have:
\[ 2^{2^r-1-r} \left( \binom{2^r-1}{0} + \binom{2^r-1}{1} (2-1) \right) = 2^{2^r-1-r} (1 + 2^r - 1) = 2^{2^r-1-r} \cdot 2^r = 2^{2^r-1} \]
Exercise 2.6
Let C = {00000, 10001, 01010, 11011, 00100, 10101, 01110, 11111} be a binary linear code. List all the cosets of C. Compute a parity check matrix for C. Use syndrome decoding to decode the words 00111 and 01011.
Solution 2.6.1
Code C = {00000, 10001, 01010, 11011, 00100, 10101, 01110, 11111} is a binary linear code because the sum of any two or more words from C falls into C. Writing the codewords as rows of a matrix and reducing it gives the generator matrix G of code C:
\[ \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 & 1 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \end{pmatrix} = G. \]
The dimension of the code is 3 and the parity check matrix H of code C is
\[ H = \begin{pmatrix} 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 1 \end{pmatrix}. \]
There are no other cosets because there are only 2^5 binary words of length 5 and all of them are listed above.
We determine the syndrome S(y) of a word y as shown in (2.1). The syndromes of the coset leaders are the following:
• Let y = 01011 then z = S(y) = 01. The sent word was y−I(z) = 01011−00001 = 01010.
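Syndrome decoding for this code can be sketched in Python. The sketch picks, for each syndrome, a word of minimum weight as the coset leader; when several words of minimum weight exist, it takes the lexicographically smallest one, which is an assumption of this sketch and may differ from the leaders chosen in the solution.

```python
from itertools import product

H = [[0, 1, 0, 1, 0],
     [1, 0, 0, 0, 1]]
C = {"00000", "10001", "01010", "11011", "00100", "10101", "01110", "11111"}

def syndrome(y, H):
    """S(y) = y H^T over GF(2), for a word given as a bit string."""
    return tuple(sum(int(a) * b for a, b in zip(y, row)) % 2 for row in H)

# Visit all words by increasing weight (ties broken lexicographically) and
# keep the first word seen for each syndrome as the coset leader.
words = sorted(("".join(w) for w in product("01", repeat=5)),
               key=lambda w: (w.count("1"), w))
leaders = {}
for w in words:
    leaders.setdefault(syndrome(w, H), w)

def decode(y):
    """Subtract (over GF(2): add) the coset leader with the same syndrome."""
    leader = leaders[syndrome(y, H)]
    return "".join(str(int(a) ^ int(b)) for a, b in zip(y, leader))

print(syndrome("01011", H), decode("01011"))
print(syndrome("00111", H), decode("00111"))
```

With this choice of leaders the word 01011 decodes to 01010, in agreement with the solution; for 00111 the result depends on which weight-2 leader is chosen.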
Exercise 2.7
Let C be a binary linear code. Show that either all the codewords of C have even weight or
exactly half of the codewords have even weight.
Solution 2.7.1
Let u and v be two binary words from V(r, 2). If w(u) and w(v) are both odd or both even, the weight of their sum w(u + v) is even. If w(u) is even and w(v) is odd (or vice versa), the weight of their sum w(u + v) is odd.
That means that if there is no word u ∈ C with odd weight, all words from C have even weight.
If there is a word u ∈ C with odd weight, the sum x + u must fall into C for each x ∈ C because C is a linear code. Now, we can define a relation α over the codewords of C so that (x, y) ∈ α if x + y = u. Since C is a binary code, α is a symmetric relation, thus (x, y) ∈ α ⇒ (y, x) ∈ α. Because w(u) is odd, one of w(x) and w(y) must be odd and the other must be even. We can easily see that (x, y) ∈ α only if x ≠ y: if x = y, then x + y = x + x = 0, which is a contradiction because w(0) is not odd.
Because every word x ∈ C is in relation α with exactly one word, namely x + u, and two words are in relation α only if one has even weight and the other has odd weight, we can see that exactly one half of the codewords has odd weight and the other half has even weight.
• If exactly one of the vectors, say o, has odd weight, then the remaining k − 1 vectors generate a subspace E with 2^(k−1) elements, all of even weight, from (1). The coset E + o has 2^(k−1) elements of odd weight, from (2). The union of these two sets gives us the whole subspace, and exactly half of the vectors have odd weight and the other half have even weight.
• If there are more vectors of odd weight in the generator matrix, we can get the previous case by choosing one odd vector and adding it to all the remaining odd vectors. From (3) it follows that there is then only one vector with odd weight in the generator matrix.
Exercise 2.8
Let C be a binary linear code of length n. Let c_i denote the number of words of weight i in C. Suppose that c_n = 1. Show that c_i = c_{n−i} for i ∈ {0, 1, . . . , n}.
Chapter 3
Cyclic Codes
Cyclic codes are of interest and importance because they possess a rich algebraic structure that can be utilized in a variety of ways. They have extremely concise specifications and can be efficiently implemented using shift registers. Many practically important codes are cyclic. In order to specify a binary cyclic code with 2^k codewords of length n, it is sufficient to write down only one codeword of length n.
A code C is cyclic if
1. C is a linear code;
Compared with linear codes, cyclic codes are quite rare. For any field F and any integer n ≥ 3 there are always the following trivial cyclic codes of length n over F:
In some cases, there are no cyclic codes other than the four trivial cyclic codes.
a_0 a_1 . . . a_{n−1}
a_0 + a_1 x + a_2 x^2 + · · · + a_{n−1} x^{n−1}.
where R_n is the ring R_n = F_q[x]/(x^n − 1) and F_q[x] denotes the set of all polynomials over GF(q).
For any f (x) ∈ Rn the set
• g(x) is a factor of x^n − 1.
If for a cyclic code C it holds that
C = ⟨g(x)⟩,
then g(x) is called the generator polynomial of the code C.
The task of finding all cyclic codes of a given length n is equivalent to the task of finding all factors of the polynomial x^n − 1.
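g(x) = g_0 + g_1 x + · · · + g_r x^r.
\[ G = \begin{pmatrix} g_0 & g_1 & g_2 & \cdots & g_r & 0 & 0 & 0 & \cdots & 0 \\ 0 & g_0 & g_1 & g_2 & \cdots & g_r & 0 & 0 & \cdots & 0 \\ 0 & 0 & g_0 & g_1 & g_2 & \cdots & g_r & 0 & \cdots & 0 \\ \vdots & & & & & & & & & \vdots \\ 0 & 0 & \cdots & 0 & 0 & \cdots & 0 & g_0 & \cdots & g_r \end{pmatrix}. \]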
Let C be a cyclic [n, k]-code with the generator polynomial g(x) of degree n − k. The polynomial g(x) is a factor of x^n − 1. Hence
x^n − 1 = g(x)h(x)
for some polynomial h(x) of degree k. The polynomial h(x) is called the check polynomial of the code C.
Let C be a cyclic code in R_n with a generator polynomial g(x) and a check polynomial h(x). Then any polynomial c(x) ∈ R_n is a codeword of C if and only if c(x)h(x) ≡ 0 (mod x^n − 1).
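Suppose C is a cyclic [n, k]-code with the check polynomial h(x) = h_0 + h_1 x + · · · + h_k x^k, then
1. a parity check matrix for code C is
\[ H = \begin{pmatrix} h_k & h_{k-1} & \cdots & h_0 & 0 & \cdots & 0 \\ 0 & h_k & \cdots & h_1 & h_0 & \cdots & 0 \\ \vdots & & & & & & \vdots \\ 0 & 0 & \cdots & 0 & h_k & \cdots & h_0 \end{pmatrix}; \]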
h̄(x) = h_k + h_{k−1} x + · · · + h_0 x^k
Encoding using a cyclic code can be done by a multiplication of two polynomials: a message
polynomial and the generator polynomial of the cyclic code.
Let $C$ be a cyclic $[n, k]$-code over any field $F$ with the generator polynomial $g(x) = g_0 +
g_1 x + \cdots + g_r x^r$ of degree $r = n - k$. If a message vector $m$ is represented by a polynomial
$m(x)$ of degree at most $k - 1$ and $m$ is encoded as
$c = mG$,
then the codeword polynomial is
$c(x) = m(x)g(x)$.
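The relation between $c = mG$ and $c(x) = m(x)g(x)$ can be checked on a small example. The following is a minimal sketch, assuming the binary cyclic $[7, 4]$-code generated by $g(x) = 1 + x + x^3$ (a factor of $x^7 - 1$ over $GF(2)$); the message and all function names are chosen only for illustration.

```python
def poly_mul_gf2(a, b):
    """Multiply two polynomials over GF(2); coefficients are listed lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] ^= ai & bj
    return out

def generator_matrix(g, n):
    """Rows are g(x), x*g(x), ..., x^(k-1)*g(x), each padded with zeros to length n."""
    k = n - (len(g) - 1)
    return [[0] * i + g + [0] * (n - len(g) - i) for i in range(k)]

def encode_with_matrix(m, G):
    """Compute c = mG over GF(2)."""
    return [sum(mi * row[j] for mi, row in zip(m, G)) % 2 for j in range(len(G[0]))]

n = 7
g = [1, 1, 0, 1]   # g(x) = 1 + x + x^3 (illustrative choice)
m = [1, 0, 1, 1]   # m(x) = 1 + x^2 + x^3 (illustrative message)

c_poly = poly_mul_gf2(m, g)                # c(x) = m(x) g(x)
c_poly += [0] * (n - len(c_poly))          # pad to length n
c_matrix = encode_with_matrix(m, generator_matrix(g, n))  # c = mG
print(c_poly, c_matrix)
```

Both ways of encoding give the same codeword, because the rows of $G$ are the coefficient vectors of $g(x), xg(x), \dots, x^{k-1}g(x)$.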
Let $r$ be a positive integer and $H$ be an $r \times (2^r - 1)$ matrix whose columns are distinct nonzero
vectors from $V(r, 2)$. Then the code having $H$ as its parity check matrix is called binary
Hamming code and denoted as $Ham(r, 2)$.
The binary Hamming code $Ham(r, 2)$ is equivalent to a cyclic code.
If $p(x)$ is an irreducible polynomial of degree $r$ such that $x$ is a primitive element of the
field $F[x]/p(x)$, then $p(x)$ is called a primitive polynomial.
If $p(x)$ is a primitive polynomial over $GF(2)$ of degree $r$, then the cyclic code $\langle p(x) \rangle$ is the
Hamming code $Ham(r, 2)$.
3.6 Exercises
Exercise 3.1
Let us consider the following definition of equivalence of two binary codes: Two binary
codes are equivalent if and only if they can be transformed to each other by permutation of
positions and addition of a constant vector.
Is this definition correct? Prove or disprove.
Exercise 3.2
Exercise 3.3
Use the Plotkin bound and properties of $A(n, d)$ to prove that $A_2(2d, d) \le 4d$.
Plotkin bound: Let $C$ be a binary code with minimum distance $d$ and length $n$. If $2d > n$
then
$A(n, d) \le 2 \left\lfloor \frac{d}{2d - n} \right\rfloor$.
Solution 3.3.1
First we need to show that $A_q(n, d) \le qA_q(n - 1, d)$. Let $C$ be a $q$-ary code of length $n$ and
minimum distance $d$ with $M = A_q(n, d)$ codewords. Divide the codewords of $C$ into $q$ classes
according to their last symbol. At least one of these classes contains at least $M/q$ codewords.
Any two different codewords of this class agree in the last position, so they differ in at least $d$
of the first $n - 1$ positions. Deleting the last symbol from all codewords of this class therefore
gives a code of length $n - 1$ with minimum distance at least $d$ and at least $M/q$ codewords.
Hence $A_q(n - 1, d) \ge A_q(n, d)/q$, which is the same as $A_q(n, d) \le
qA_q(n - 1, d)$.
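Let us assume that $q = 2$ and $n = 2d$. Then we have $A_2(2d, d) \le 2A_2(2d - 1, d)$. Because
$2d > 2d - 1$ we can use the Plotkin bound and we get
$A_2(2d - 1, d) \le 2 \left\lfloor \frac{d}{2d - (2d - 1)} \right\rfloor = 2d$.
Together this gives $A_2(2d, d) \le 2 \cdot 2d = 4d$.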
Exercise 3.4
Determine whether the following codes are cyclic. Explain your reasoning.
Solution 3.4.1
1. The ternary code $C_1$ is a cyclic code. Let $c_1 = 0000$, $c_2 = 1212$ and $c_3 = 2121$, then:
$c_1 + c_2 = c_2$, $c_1 + c_3 = c_3$, $c_2 + c_3 = c_1$, $c_2 + c_2 = c_3$, $c_3 + c_3 = c_2$
$c_1 \to c_1$, $c_2 \to c_3 \to c_2$
$C_1 = \langle 1 - x + x^2 - x^3 \rangle$
2. The ternary code $C_2$ is not cyclic. For example, the words 11100 and 00111 both fall into $C_2$,
because $w(11100) = w(00111) \equiv 0 \pmod 3$. If $C_2$ were linear, their sum would also fall into $C_2$,
but $w(11100 + 00111) = w(11211) \equiv 2 \pmod 3$. That is a contradiction, so the code
$C_2$ is not linear and therefore not cyclic.
3. Ternary code C3 is a cyclic code. We know, that a code is cyclic, if it is a linear code and
if any cyclic shift of a codeword is also a codeword.
Firstly, let x = x1 x2 x3 x4 x5 and y = y1 y2 y3 y4 y5 be words from C3 . Then x1 + x2 +
x3 + x4 + x5 ≡ 0 (mod 3) and y1 + y2 + y3 + y4 + y5 ≡ 0 (mod 3), that means that
(x1 +x2 +x3 +x4 +x5 )+(y1 +y2 +y3 +y4 +y5 ) ≡ 0 (mod 3) and because of commutativity
and associativity of addition we get (x1 +y1 )+(x2 +y2 )+(x3 +y3 )+(x4 +y4 )+(x5 +y5 ) ≡
0 (mod 3). And so the word x + y = (x1 + y1 )(x2 + y2 )(x3 + y3 )(x4 + y4 )(x5 + y5 ) falls
into C3 .
Let $x = x_1 x_2 x_3 x_4 x_5$ be a word from code $C_3$ and $a \in F_3$ be a scalar. We have $x_1 + x_2 + x_3 +
x_4 + x_5 \equiv 0 \pmod 3$. After multiplying this congruence by $a$ we get $a(x_1 + x_2 + x_3 + x_4 + x_5) \equiv
a \cdot 0 \pmod 3$, and because of distributivity we get $ax_1 + ax_2 + ax_3 + ax_4 + ax_5 \equiv 0 \pmod 3$,
hence the word $ax = (ax_1)(ax_2)(ax_3)(ax_4)(ax_5) \in C_3$.
Secondly, let x = x1 x2 x3 x4 x5 be a word from code C3 , then x1 + x2 + x3 + x4 + x5 ≡ 0
(mod 3). And because of commutativity of addition we get x5 + x1 + x2 + x3 + x4 ≡ 0
(mod 3), that means that the word x5 x1 x2 x3 x4 falls into C3 too.
4. The 7-ary code C4 is not cyclic. We can easily see that the word 20001 is a codeword
from C4 because 2 + 0 + 0 + 0 + 5 ≡ 0 (mod 7). If code C4 is cyclic, then any cyclic
shift of a codeword should be a codeword too. But the word 12000 is not a codeword
because 1 + 4 + 0 + 0 + 0 ≡ 5 (mod 7). Hence the code C4 cannot be a cyclic code.
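For small codes the reasoning above can be confirmed by brute force. The sketch below assumes the codes are as they appear in the solution: $C_1 = \{0000, 1212, 2121\}$, $C_2$ the ternary words of length 5 whose weight is divisible by 3, and $C_3$ the ternary words of length 5 whose symbols sum to 0 modulo 3. These definitions are inferred from the solution text, and the function name is illustrative. For a prime $q$, closure under addition already implies closure under scalar multiplication.

```python
from itertools import product

def is_cyclic(code, q):
    """Check closure under addition mod q (q prime) and under cyclic shifts."""
    code = set(code)
    closed_add = all(
        tuple((a + b) % q for a, b in zip(u, v)) in code
        for u in code for v in code
    )
    closed_shift = all(u[-1:] + u[:-1] in code for u in code)
    return closed_add and closed_shift

words5 = list(product(range(3), repeat=5))

C1 = [(0, 0, 0, 0), (1, 2, 1, 2), (2, 1, 2, 1)]
C2 = [w for w in words5 if sum(1 for s in w if s != 0) % 3 == 0]  # weight divisible by 3
C3 = [w for w in words5 if sum(w) % 3 == 0]                       # symbol sum divisible by 3

print(is_cyclic(C1, 3), is_cyclic(C2, 3), is_cyclic(C3, 3))
```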
Exercise 3.5
Let C1 , C2 be q-ary cyclic codes of length n with generator polynomials f1 (x), f2 (x). Show
that the code C3 = C1 ∩ C2 is also cyclic. Find its generator polynomial.
Solution 3.5.1
Let $x$ and $y$ be words from $C_3$. That means that $x, y \in C_1$ and $x, y \in C_2$. Because both $C_1$ and
$C_2$ are cyclic, all cyclic shifts of $x$ and $y$, their sum and their scalar multiples fall into $C_1$ as well as into $C_2$. Hence they
fall into $C_3$ too, and that means that $C_3$ is a cyclic code.
Let $f_1(x)$ and $f_2(x)$ be the generator polynomials of $C_1$ and $C_2$, then $C_1 = \langle f_1(x) \rangle$ and
$C_2 = \langle f_2(x) \rangle$. We know that $\langle f(x) \rangle = \{r(x)f(x) \mid r(x) \in R_n\}$, hence $\langle f_1(x) \rangle = \{r(x)f_1(x) \mid r(x) \in
R_n\}$ and $\langle f_2(x) \rangle = \{s(x)f_2(x) \mid s(x) \in R_n\}$ (multiplication is taken modulo $x^n - 1$). Because
$f_1(x)$ and $f_2(x)$ are factors of $x^n - 1$, a polynomial belongs to $C_3$ if and only if it is a multiple of both
$f_1(x)$ and $f_2(x)$, that is, a multiple of $\mathrm{lcm}(f_1(x), f_2(x))$. The least common multiple of two factors of
$x^n - 1$ is again a factor of $x^n - 1$, so the generator polynomial of $C_3$ is $\mathrm{lcm}(f_1(x), f_2(x))$.
Exercise 3.6
2. How many different binary cyclic [65, 36] codes are there?
Solution 3.6.1
With the help of website http://www.quickmath.com/ [6] and its factorizing tool we can
get the following results:
1. There are only four trivial binary cyclic codes of length 19. Firstly, it is written in the
materials, secondly, $x^{19} - 1 = (x + 1)(x^{18} + x^{17} + x^{16} + x^{15} + x^{14} + x^{13} + x^{12} + x^{11} +
x^{10} + x^9 + x^8 + x^7 + x^6 + x^5 + x^4 + x^3 + x^2 + x + 1)$ so the corresponding cyclic codes are:
2. The factors of polynomial $x^{65} - 1$ are $(x+1)(x^4+x^3+x^2+x+1)(x^{12}+x^8+x^7+x^6+x^5+
x^4+1)(x^{12}+x^{10}+x^7+x^6+x^5+x^2+1)(x^{12}+x^{10}+x^9+x^8+x^6+x^4+x^3+x^2+1)(x^{12}+x^{11}+
x^9+x^7+x^6+x^5+x^3+x+1)(x^{12}+x^{11}+x^{10}+x^9+x^8+x^7+x^6+x^5+x^4+x^3+x^2+x+1)$.
A binary cyclic $[65, 36]$-code is generated by a polynomial of degree $65 - 36 = 29$ that
is a factor of $x^{65} - 1$. Since $29 = 1 + 4 + 12 + 12$ we have to choose two different
polynomials of degree 12, so there are $\binom{5}{2} = 10$ possibilities how we can choose them.
That means that there are 10 different binary cyclic $[65, 36]$-codes.
3. The factors of polynomial x65 − 1 are written above and we can see that we get all the
trivial cyclic codes, one [65, 61]-code, one [65, 60]-code, five [65, 53]-codes, five [65, 52]-
codes, five [65, 49]-codes, five [65, 48]-codes, ten [65, 41]-codes, ten [65, 40]-codes, ten
[65, 37]-codes, ten [65, 36]-codes, ten [65, 29]-codes, ten [65, 28]-codes, ten [65, 25]-codes,
ten [65, 24]-codes, five [65, 17]-codes, five [65, 16]-codes, five [65, 13]-codes, five [65, 12]-
codes, one [65, 5]-code and one [65, 4]-code. But there is no [65, 20]-code at all.
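The counts listed above can be reproduced by enumerating all products of the irreducible factors. This is a small sketch that uses only the degrees of the seven factors given in the solution (1, 4 and five times 12); the variable names are illustrative.

```python
from collections import Counter
from itertools import combinations

n = 65
factor_degrees = [1, 4, 12, 12, 12, 12, 12]  # degrees of the irreducible factors of x^65 - 1

# Every subset of factors gives one generator polynomial; its degree is n - k.
codes_by_dimension = Counter()
for size in range(len(factor_degrees) + 1):
    for subset in combinations(range(len(factor_degrees)), size):
        degree = sum(factor_degrees[i] for i in subset)
        codes_by_dimension[n - degree] += 1

for k in sorted(codes_by_dimension, reverse=True):
    print(f"[65, {k}]-codes: {codes_by_dimension[k]}")
```

A dimension that does not show up in the output, such as $k = 20$, has no binary cyclic code of length 65.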
Exercise 3.7
Consider the polynomial $g(x) = x^3 + x + 1$. Show that there is a cyclic code $C$ of length 8 over
$F_3$ such that $g(x)$ is its generator polynomial. Find the generator polynomial of the code $C^\perp$.
Solution 3.7.1
All cyclic codes of length 8 over the field $F_3$ are generated by the factors of the polynomial $x^8 - 1$.
If the polynomial $g(x) = x^3 + x + 1$ is a factor of $x^8 - 1$ then it generates an $[8, 5]$-cyclic code.
First, we need to factorize the polynomial $x^8 - 1$ over $F_3$: $x^8 - 1 = (x + 1)(x - 1)(x^2 + x - 1)(x^4 -
x^3 - x - 1)$, where the last factor splits further as $(x^2 + 1)(x^2 - x - 1)$. Among the products of these
factors we find the following generator polynomials of $[8, 5]$-codes:
• $(x + 1)(x^2 + x - 1) = x^3 - x^2 - 1$
• $(x - 1)(x^2 + x - 1) = x^3 + x + 1 = g(x)$
Now it is obvious that $g(x)$ generates a cyclic code of length 8 over $F_3$. Since $x^8 - 1 =
g(x)h(x)$, the check polynomial is $h(x) = (x + 1)(x^4 - x^3 - x - 1) = x^5 - x^3 - x^2 + x - 1$. Hence
the code $C^\perp$ is generated by the reciprocal polynomial $\bar{h}(x) = -x^5 + x^4 - x^3 - x^2 + 1$.
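The polynomial arithmetic in this solution is easy to verify mechanically. The sketch below multiplies $g(x)$ and $h(x)$ over $F_3$ and reverses the coefficients of $h(x)$ to obtain the reciprocal polynomial; coefficients are stored lowest degree first, and the helper name is illustrative.

```python
def poly_mul_mod(a, b, q):
    """Multiply two polynomials with coefficients reduced modulo q (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % q
    return out

q = 3
g = [1, 1, 0, 1]                                  # g(x) = 1 + x + x^3
h = [c % q for c in [-1, 1, -1, -1, 0, 1]]        # h(x) = -1 + x - x^2 - x^3 + x^5
x8_minus_1 = [(-1) % q] + [0] * 7 + [1]           # x^8 - 1

gh = poly_mul_mod(g, h, q)
h_reciprocal = h[::-1]                            # coefficients of x^5 * h(1/x)

print(gh == x8_minus_1)
print(h_reciprocal)  # read as 1 - x^2 - x^3 + x^4 - x^5 with 2 = -1 in F_3
```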
Exercise 3.8
Decide correctness of the following statement. Prove your decision. Let C be a code.
$(C^\perp)^\perp = C$
Solution 3.8.2
This statement is true for linear and cyclic codes.
First, let $C$ be a $q$-ary linear $[n, k]$-code with the generator matrix $G$ and parity check
matrix $H$. Then the dual code $D = C^\perp$ is a linear $[n, n - k]$-code with the generator matrix
$H = G'$ and the parity check matrix $H'$. Let $v \in V(n, q)$; then $v \in D \Leftrightarrow vG^T = 0$. The dual code
$E = D^\perp$ is an $[n, k]$-code with generator matrix $H' = G''$. A word $w \in E \Leftrightarrow wG'^T = 0 \Leftrightarrow wH^T =
0$.
We can easily see that if $wH^T = 0$ then $w \in C$. Because $w$ is a word from code $E$, we get
$C = E = D^\perp = (C^\perp)^\perp$.
Second, if $C$ is a cyclic $[n, k]$-code with the generator polynomial $g(x)$ of degree $n - k$ then
the check polynomial is $h(x) = h_0 + h_1 x + \cdots + h_k x^k$ and we know that $g(x)h(x) = x^n - 1$. We
also know that the dual code $D = C^\perp$ is generated by the polynomial $\bar{h}(x) = h_k + h_{k-1} x +
\cdots + h_1 x^{k-1} + h_0 x^k$. According to the proof of polynomial representation of dual codes we
know that $\bar{h}(x) = x^k h(x^{-1})$ and $\bar{h}(x)\bar{g}(x) = 1 - x^n$. That means that $\bar{g}(x) = f(x)$ is the
check polynomial (of degree $n - k$) of code $D$. The dual code $E = D^\perp$ is generated by the
polynomial $\bar{f}(x)$. Since $f(x) = f_0 + f_1 x + \cdots + f_{n-k} x^{n-k}$, we have $\bar{f}(x) =
f_{n-k} + f_{n-k-1} x + \cdots + f_1 x^{n-k-1} + f_0 x^{n-k}$. Since $g(x) = g_0 + g_1 x + \cdots + g_{n-k} x^{n-k}$ and
$\bar{g}(x) = g_{n-k} + g_{n-k-1} x + \cdots + g_1 x^{n-k-1} + g_0 x^{n-k} = f(x) = f_0 + f_1 x + \cdots + f_{n-k} x^{n-k}$,
we get $f_0 = g_{n-k}$, $f_1 = g_{n-k-1}$, ..., $f_{n-k} = g_0$. And now we can easily see that $\bar{f}(x) =
f_{n-k} + f_{n-k-1} x + \cdots + f_1 x^{n-k-1} + f_0 x^{n-k} = g_0 + g_1 x + \cdots + g_{n-k-1} x^{n-k-1} + g_{n-k} x^{n-k} = g(x)$
and that means that $C = \langle g(x) \rangle = \langle \bar{f}(x) \rangle = E = D^\perp = (C^\perp)^\perp$, which is what we had to show.
Chapter 4
Secret key cryptosystems are very old. They were primarily used in pre-computer era – secret
key cryptosystems are too weak nowadays and too easy to break, especially with computers.
However, they can illustrate several ideas of cryptography and cryptoanalysis.
4.1 Cryptosystem
Every cryptosystem consists of a plaintext space P (set of plaintexts over an alphabet Σ), a
cryptotext space C (set of cryptotexts over an alphabet ∆) and a key space K (set of possible
keys).
Each key $k$ determines an encryption algorithm $e_k$ and a decryption algorithm $d_k$ such
that for any plaintext $w$, $e_k(w)$ is the corresponding cryptotext and it holds
$w \in d_k(e_k(w))$ or $w = d_k(e_k(w))$
As encryption algorithms we can also use randomized algorithms.
The philosophy of modern cryptoanalysis is embodied in Kerckhoffs's principle, formulated
in 1883: The security of a cryptosystem must not depend on keeping secret the
encryption algorithm. The security should depend only on keeping secret the key.
The requirements for good cryptosystem according to Sir F. R. Bacon are:
1. Given $e_k$ and a plaintext $w$, it should be easy to compute $c = e_k(w)$.
2. Given $d_k$ and a cryptotext $c$, it should be easy to compute $w = d_k(c)$.
3. A cryptotext $e_k(w)$ should not be much longer than the plaintext $w$.
4. It should be infeasible to determine $w$ from $e_k(w)$ without knowing $d_k$.
5. The so called avalanche effect should hold: A small change in the plaintext, or in the
key, should lead to a big change in the cryptotext.
6. The cryptosystem should not be closed under composition.
7. The set of keys should be very large.
4.2 Cryptoanalysis
The aim of cryptoanalysis is to get as much information about the plaintext or the key as
possible.
Main types of cryptoanalytic attacks are: cryptotexts-only attack, known-plaintexts attack,
chosen-plaintexts attack, known-encryption-algorithm attack and chosen-cryptotext attack.
4. S ECRET K EY C RYPTOGRAPHY
A cryptosystem is called secret key cryptosystem if some secret piece of information (the
key) has to be agreed first between any two parties that want to communicate through the
cryptosystem.
There are some basic types of secret key cryptosystems:
• substitution based cryptosystems – they substitute the characters of plaintext by other
characters;
– monoalphabetic cryptosystems – they use a fixed substitution, one character is
always replaced with the same group of symbols;
– polyalphabetic cryptosystems – the substitution keeps changing during the en-
cryption;
• transposition based cryptosystems – they only transpose the characters of plaintext,
for example permission/impression.
The cryptosystems can be also divided into block cryptosystems (cryptosystems that are
used to encrypt simultaneously blocks of plaintext) and into stream cryptosystems (cryp-
tosystems that encrypt plaintext letter by letter, the encryption may vary during the encryp-
tion process).
Stream cryptosystems are more appropriate in some applications (telecommunication),
are usually simpler to implement, are faster and have no error propagation. In stream
cryptosystems, each block of plaintext is encrypted using a different key.
In block cryptosystems, the same key is used to encrypt an arbitrarily long plaintext block
by block.
• If x and y are in the same row (column), then they are replaced by the pair of symbols
to the right (below) of them.
• If x and y are neither in the same row nor in the same column, then the smallest rectan-
gle containing x and y is taken and symbols x and y are replaced by the pair of symbols
in the remaining corners of the rectangle.
4.5 Exercises
Exercise 4.1
Decode the following cryptotexts:
1. TEVSECMKOCKB
2. TSRLNCHHIAFCIEISIEEPR
3.
4. (Playfair cipher, password: PLAYFAIR)
BKLBPGQXKGFQTNQOKU
Solution 4.1.1
1. This is a Caesar cryptosystem and every letter of the message is shifted by 10 positions
ahead. To decode the message we need to shift every letter of the cryptotext 16
positions ahead or 10 positions backwards. Using this algorithm we get the message
"Julius Caesar".
2. Decoding this cryptotext is a bit harder. When we put the message into a table, we can
see the hidden message in the columns. The message is "This is rail fence cipher" as
you can see in Table 4.1.
T S R L N C H
H I A F C I E
I S I E E P R
3. To decode this cryptotext we need the tables shown in Figure 4.1.
Figure 4.1: Tables for encoding and decoding messages using Pigpen cipher
4. Decoding this cryptotext is very easy because we know that it is encoded with the Playfair
cipher and we also know the key. We just need to design the Playfair square (see Table
4.2). The hidden message is "Charles Wheatstone x".
P L A Y F
I R B C D
E G H K M
N O Q S T
U V W X Z
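The three text-based decodings above can be replayed with a few lines of code. This is a minimal sketch, assuming the alphabet A-Z with A = 0 for the Caesar shift, a 3-row table read column by column for the rail fence text, and a Playfair square built from the password with J merged into I. All function names are illustrative.

```python
import string

A = string.ascii_uppercase

def caesar_decrypt(text, shift):
    """Shift every letter back by `shift` positions."""
    return "".join(A[(A.index(c) - shift) % 26] for c in text)

def read_columns(text, rows):
    """Write the text row by row into `rows` rows and read it column by column."""
    width = len(text) // rows
    grid = [text[r * width:(r + 1) * width] for r in range(rows)]
    return "".join(grid[r][c] for c in range(width) for r in range(rows))

def playfair_square(key):
    """Return the 25 letters of the square row by row (J is merged into I)."""
    square = []
    for c in key + A:
        c = "I" if c == "J" else c
        if c not in square:
            square.append(c)
    return square

def playfair_decrypt(text, key):
    sq = playfair_square(key)
    out = []
    for i in range(0, len(text), 2):
        ra, ca = divmod(sq.index(text[i]), 5)
        rb, cb = divmod(sq.index(text[i + 1]), 5)
        if ra == rb:        # same row: take the symbols to the left
            out += [sq[ra * 5 + (ca - 1) % 5], sq[rb * 5 + (cb - 1) % 5]]
        elif ca == cb:      # same column: take the symbols above
            out += [sq[((ra - 1) % 5) * 5 + ca], sq[((rb - 1) % 5) * 5 + cb]]
        else:               # rectangle: swap the columns
            out += [sq[ra * 5 + cb], sq[rb * 5 + ca]]
    return "".join(out)

print(caesar_decrypt("TEVSECMKOCKB", 10))
print(read_columns("TSRLNCHHIAFCIEISIEEPR", 3))
print(playfair_decrypt("BKLBPGQXKGFQTNQOKU", "PLAYFAIR"))
```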
Exercise 4.2
Consider the following variation of the one time pad cryptosystem. Let P = K = {00, 01, 10}l .
Encryption and decryption work in the same way as in the one time pad. Decide whether
this cipher is perfectly secure. Explain your reasoning.
Solution 4.2.1
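A cipher is perfectly secure, if $|P| = |K| = |C|$. But we can see that $C = \{00, 01, 10, 11\}$ and
therefore $|P| = |K| \ne |C|$. For example, the cryptotext 11 can only arise from the plaintext 01 with
the key 10 or from the plaintext 10 with the key 01, so whoever sees 11 knows that the plaintext
is not 00. Now, it is obvious that this cipher is not perfectly secure.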
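The leak can be made visible by listing, for every cryptotext, the plaintexts that could have produced it. The sketch below assumes a single block ($l = 1$) and bitwise XOR as the one time pad operation; the names are illustrative.

```python
P = K = ["00", "01", "10"]

def xor_bits(a, b):
    """Bitwise XOR of two equally long bit strings."""
    return "".join(str(int(x) ^ int(y)) for x, y in zip(a, b))

# For each cryptotext, collect all plaintexts that can be encrypted to it.
possible_plaintexts = {}
for w in P:
    for k in K:
        possible_plaintexts.setdefault(xor_bits(w, k), set()).add(w)

for c in sorted(possible_plaintexts):
    print(c, sorted(possible_plaintexts[c]))
```

In a perfectly secure cipher every cryptotext would be compatible with every plaintext; here only the cryptotext 00 is.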
Exercise 4.3
You have found an old cryptotext and you know that the plaintext is related to cryptography.
You suppose Vigenere cryptosystem was used so you looked for repeated strings in the
cryptotext. You found that the string TICRMQUIRTJR occurs twice in the cryptotext. The
first occurrence begins at position 10 and the second one at position 241. You guess this
cryptotext sequence is the encryption of the plaintext word CRYPTOGRAPHY. If you are right,
what would be the key?
Solution 4.3.1
Now we can see that the length of the keyword is not 3, because there are more than
three letters in the keyword. The length of the keyword is not 11 either, because the 10th
and 21st positions of the keyword are different. Hence the length of the keyword must be
7, and we can see that the 10th and 17th positions are equal and so on. Now we must find
the beginning of the repeated keyword. Because its length is seven, it starts at the 1st, 8th, 15th,
22nd, ... position of the cryptotext, and here we get the keyword "Correct".
There is also another possibility: the length of the keyword is more than 11 symbols. In
this case we are unable to say anything about the keyword.
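The key stream and the keyword can also be recovered programmatically. This sketch assumes the usual Vigenere convention A = 0, ..., Z = 25 with cryptotext = plaintext + key (mod 26), and it only tests keyword lengths that divide the distance 241 - 10 = 231 between the two occurrences. Variable names are illustrative.

```python
A = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

cipher = "TICRMQUIRTJR"
plain = "CRYPTOGRAPHY"
start = 10                 # position of the first occurrence (1-based)
distance = 241 - 10        # distance between the two occurrences

# Key letters used at positions 10, 11, ..., 21 of the text.
keystream = "".join(
    A[(A.index(c) - A.index(p)) % 26] for c, p in zip(cipher, plain)
)

# Keyword lengths that divide the distance and are consistent with the key stream.
candidates = [
    d for d in range(2, len(keystream))
    if distance % d == 0
    and all(keystream[i] == keystream[i + d] for i in range(len(keystream) - d))
]

# Align the key stream so that the keyword starts at position 1 of the text.
length = candidates[0]
keyword = [""] * length
for i in range(length):
    keyword[(start + i - 1) % length] = keystream[i]
keyword = "".join(keyword)

print(keystream, candidates, keyword)
```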
Exercise 4.4
Alice used Vigenere cryptosystem for encryption but she has become afraid that it can be eas-
ily broken. She is now considering using double encryption, that means sender and receiver
agree on two keywords key1 and key2 and sender encrypts message m by first encrypting it
with Vigenere cipher using the key key1 and then encrypting the resulting cryptotext with
Vigenere cipher using the key key2 . Show that the proposed encryption has actually the same
effect as a single Vigenere encryption using a keyword key3 and describe how to find this
keyword. What can you say about security of the double encryption?
Solution 4.4.1
Let $e_k$ be the encryption algorithm and let $d_k$ be the decryption algorithm described in the
materials (lecture 4, site 6, 7). Let $w$ be the plaintext and $key_1$ and $key_2$ be the two given
keywords. We take $w[i]$ as the $i$th letter of the message $w$, $key_1[i']$ as the $i'$th character of
keyword $key_1$ where $i' \equiv i \pmod{|key_1|}$.
The algorithm for encrypting the message w is following:
where $key_3[i'''] \equiv key_1[i'] + key_2[i''] \pmod{26}$ and $i \equiv i' \equiv i'' \equiv i''' \pmod{|key_3|}$.
Hence we can see that double Vigenere encryption using keywords $key_1$ and $key_2$ has
the same effect as a single Vigenere encryption using the keyword $key_3$. The length $n$ of
the keyword $key_3$ is $\mathrm{lcm}\{|key_1|, |key_2|\}$, and $key_3[i] = (key_1[i] + key_2[i]) \bmod 26$ where
$i \in \{1, \dots, n\}$ and the positions in $key_1$ and $key_2$ are taken cyclically. Using double Vigenere
encryption can be more secure than single Vigenere
encryption only because of the length of $key_3$: $|key_3| \ge \max\{|key_1|, |key_2|\}$.
But encryption using $key_3$, which is dependent on keys $key_1$ and $key_2$, is less secure than
encryption using a randomly chosen key $key_r$ with the same length as $key_3$.
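The equivalence of double and single Vigenere encryption can be tested directly. The following sketch assumes the convention A = 0, ..., Z = 25 and uses two made-up keywords and a made-up message; none of them comes from the exercise.

```python
from math import gcd

A = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def vigenere_encrypt(text, key):
    """Add the cyclically repeated key to the text, letter by letter, modulo 26."""
    return "".join(
        A[(A.index(c) + A.index(key[i % len(key)])) % 26]
        for i, c in enumerate(text)
    )

def combined_key(key1, key2):
    """Build key3 of length lcm(|key1|, |key2|) as the letterwise sum of both keys."""
    n = len(key1) * len(key2) // gcd(len(key1), len(key2))
    return "".join(
        A[(A.index(key1[i % len(key1)]) + A.index(key2[i % len(key2)])) % 26]
        for i in range(n)
    )

key1, key2 = "KEY", "WORD"            # hypothetical keywords
message = "ATTACKATDAWNONMONDAY"      # hypothetical plaintext

key3 = combined_key(key1, key2)
double = vigenere_encrypt(vigenere_encrypt(message, key1), key2)
single = vigenere_encrypt(message, key3)
print(key3, double == single)
```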
Exercise 4.5
Enigma was a family of portable electromechanical rotor machines used to encrypt and
decrypt secret messages during World War II. Consider the following Enigma machine whose
key consists of
• initial position of three different exchangeable rotors, each rotor has 26 different posi-
tions;
• plugboard setting allowing six swaps of two different characters from 26-letter alpha-
bet.
2. Key length is $\lceil \log_2(1.1133930437350656 \cdot 10^{16}) \rceil = 54$, i.e. 54 bits are necessary to encode
the key.
3. Using exhaustive key search we need to check $\approx \frac{1}{2} \cdot 1.1 \cdot 10^{16}$ keys on average. The
complexity is therefore big.
Exercise 4.6
Find the key and decode the following ciphertext produced by Vigenere cryptosystem.
AjcrvqtvixmrgvlslkraykefqrlzsD4Mragocaskhym"Wuhtgmteoo",i
patvnihjwebijwjkgzuclthhrkpxlzs26cmxletusfsgdipdpcrijjwsjsf.
Tjkwwpqcchwdvjifwsunsjaugggfrfxijavqv,jwouqrytjgpsedjirv
wtkxafukpidevvijkrfer.LhgUgzjszjqsxycwhdotmhgnvqtgxhymIfiioe
esqyqrwapfaskqfvrwcvghlghympsmrrefwz;kwmfsvcpdlvvxvanvgv,lzs
ciqhcqxijsbuipdlkillticjwzafvstwfvusnef.Dikarvamlsjcrvabvaw,
atkotjgjvlshetcxagbrtwwcwtmlq:hymwagpcpgxtzkijnqnsfysipevtq
uiwlvvxpsipvipl,ojblwptkrlwfdqkztjczwtsvvmfsvcpdwrzvxze
ectlswe’agsbkpsxsgljqsrkpi,kghyixlhgumyfocasxfkeijvwublw
tarmfyoelowyjcrvdweofmtpgzwjurqrwdmpsodsuoigfuggjwhimgwixgh
hdozvxwxvkrxgfdixaop.
Crglvvzeucguwgjmniwlhgtieghvteeprcrwd.Wwblwmcelafsniwwqwkthwr
nqxzapgbljogirwl,vjiogcumruaugsxlvvMragocaskkzlijapfggmzu
axgrgvlwwlkzehapgp.Lzsimasscneehdrvidvgtwagbkpelcqwpvts
twrfeevivstkmvoatfw,tmhkpelrgsyajsu,ryktcuaalvkpiKcjtiatarf,
xzencqhhoempsnfnmyzhscptsvqfwjsdwzwd.Vjijwafbihapgpesrvqx
houumtdswwvspgtwgfhfzisdvjivwqigtlefvipl,kzblguvimnabxblw
orgvslciigueuuxgah.
Zv1944,xzeNwjloownianvtsvmqvlefezvvshzlofgatfwoahtp,gslnghlzs
Lpv(ulqeo).Lzsimasscnmllzvjsp,cqpxsabzvkssykxuzkzbl40
houkxagbj.Qxjerneuwrkpivehcydldcckk.Ahvijuceviutkpklzsgtyys,cu
hwlsiumfefkrlzsuimdymgckzsvb,xzeqrijshfzggunfxmjbkpikwkvgzab
fvigfvji40hggzbmgnu,geuzdfamliqpvwkicbmfgkpevatwmvwnv
esetweixaopqjhdixemjipi.Qgkhfnxzeugtdmutwrfeevmgfoim,yflkmi
lzsumjsunvtdmuj,vslpckv-oagv.
Solution 4.6.1
To decrypt the text we use the Kasiski method and frequency analysis. For example, the se-
quence LZSIMASSCN starts at 719th and at 1005th position; the sequence MRAGOCASK starts
at 30th and at 679th position; the sequence TWRFEEV starts at 755th position and at 1250th
position. The distances between their first and second occurrences are 286, 649 and 495. Be-
cause gcd(286, 649, 495) = 11 the length of the keyword is most likely eleven. Now, it is easy
to find that the keyword is ”ACCESSORIES”. And here is the hidden message:
A handy feature that was used on the M4 Enigma was the "Schreibmax", a
little printer which could print the 26 letters on a small paper ribbon.
This excluded the need for a second operator, reading the lamps and
writing the letters down. The Schreibmax was placed on top of the Enigma
machine and was connected to the lamp panel; to install the printer, the
lamp cover and all light bulbs had to be removed. Besides its handiness,
it improved operational security: the signal officer no longer had to
see the plaintext, as the printer might have been installed in the
captain’s cabin of a submarine, so that the signals officer did the
typing and keyhandling but never gained knowledge of secret received
plaintext information.
Another accessory was the remote lamp panel. If the machine was equipped
with an extra panel, the wooden case of the Enigma was wider and could
store the extra panel. There was a lamp panel version that could be
connected afterwards, but that required, just as with the Schreibmax,
the lamp panel and light bulbs to be removed. The remote panel made it
possible for a person to read the decrypted text, without giving the
operator access to it.
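The Kasiski step of this solution amounts to a greatest common divisor of the distances between repeated sequences. The sketch below takes the positions quoted in the solution, computes the likely keyword length and, as a sanity check, decrypts the first eleven letters of the cryptotext with the keyword found. It assumes that only letters are encrypted and that A = 0; the function name is illustrative.

```python
from functools import reduce
from math import gcd

A = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Repeated sequences and the positions of their two occurrences (from the solution).
repeats = {
    "LZSIMASSCN": (719, 1005),
    "MRAGOCASK": (30, 679),
    "TWRFEEV": (755, 1250),
}
distances = [second - first for first, second in repeats.values()]
key_length = reduce(gcd, distances)

def vigenere_decrypt(text, key):
    """Subtract the cyclically repeated key from the text modulo 26 (letters only)."""
    return "".join(
        A[(A.index(c) - A.index(key[i % len(key)])) % 26]
        for i, c in enumerate(text)
    )

keyword = "ACCESSORIES"
first_letters = vigenere_decrypt("AJCRVQTVIXM", keyword)
print(distances, key_length, first_letters)
```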
Chapter 5
The main disadvantage of the classical cryptography is the need to send a long key through
an absolutely secure channel before sending the message itself. In secret key (symmetric key)
cryptography both sender and receiver share the same secret key. In public key cryptography
there are two different keys – a public (encryption) key and a secret (decryption) key. The
basic idea is that if it is infeasible from the knowledge of encryption algorithm $e_k$ to construct
the decryption algorithm $d_k$ then $e_k$ can be made public.
The main problem of secret key cryptography is secure distribution of the key before trans-
mission. The problem was solved in 1976 by Diffie and Hellman. They designed a protocol
for secure key establishment over public channels.
If two parties, Alice and Bob, want to create a common secret key, then they first agree
on a large prime $p$ and a primitive root $q \pmod p$ and then they perform, through a public
channel, the following activities:
• Alice chooses randomly a large integer $1 \le x < p - 1$ and computes $X = q^x \bmod p$.
• Bob also chooses randomly a large integer $1 \le y < p - 1$ and computes $Y = q^y \bmod p$.
• Alice and Bob exchange $X$ and $Y$ through a public channel and keep $x$ and $y$ secret.
• Alice computes $Y^x \bmod p$ and Bob computes $X^y \bmod p$. Then each of them has the key
$K = Y^x \bmod p = X^y \bmod p = q^{xy} \bmod p$.
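The steps of the protocol can be sketched in a few lines. The parameters below ($p = 23$, $q = 5$) are toy values chosen only for illustration; a real run would use a large prime, and the variable names are not part of the protocol description.

```python
import secrets

p, q = 23, 5   # toy parameters: 5 is a primitive root modulo 23

# Secret exponents in the range 1 .. p - 2.
x = secrets.randbelow(p - 2) + 1   # Alice's secret
y = secrets.randbelow(p - 2) + 1   # Bob's secret

X = pow(q, x, p)   # sent by Alice over the public channel
Y = pow(q, y, p)   # sent by Bob over the public channel

key_alice = pow(Y, x, p)
key_bob = pow(X, y, p)
print(key_alice == key_bob)
```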
Blom's protocol allows trusted authority (Trent) to distribute secret keys to $\frac{1}{2}n(n - 1)$ pairs
of $n$ users. Let a large prime $p > n$ be publicly known. The protocol goes as follows:
• Each user $U$ in the network is assigned by Trent a unique public number $r_U < p$.
• For each user $U$ Trent calculates two numbers $a_U = (a + br_U) \bmod p$ and $b_U = (b +
cr_U) \bmod p$ and sends them via his secure channel to $U$.
5. P UBLIC K EY C RYPTOGRAPHY
• If Alice ($A$) wants to send a message to Bob ($B$), then Alice computes her key $K_{AB} =
g_A(r_B)$ and Bob computes his key $K_{BA} = g_B(r_A)$.
• It is easy to see that $K_{AB} = K_{BA}$ and therefore Alice and Bob can now use their keys
to communicate using some secret key cryptosystem.
Modern cryptography uses such encryption methods that no enemy can have enough
computational power and time to do decryption. Modern cryptography is based on positive and
negative results of complexity theory: on the fact that for some algorithmic problems no
efficient algorithm seems to exist, and for some small modifications of these problems there is a
simple, fast and good enough (randomized) algorithm.
Discrete logarithm problem Given $x, y, n$, the task to compute $a$ such that $y \equiv x^a \pmod n$
is believed to be infeasible.
Discrete square root problem Given $y, n$, the task to compute $x$ such that $y \equiv x^2 \pmod n$
is believed to be infeasible in general, but easy if $n$ is a prime.
The most important public key cryptosystem is the RSA cryptosystem. It was invented in
1978 by Rivest, Shamir and Adleman. The basic idea is that prime multiplication is very
easy but integer factorization seems to be infeasible. To design an RSA cryptosystem we need
to choose two large primes $p$ and $q$ and compute $n = pq$. Then choose a large $d$ such that
$\gcd(d, \varphi(n)) = 1$, where $\varphi$ is Euler's function, and compute $e = d^{-1} \bmod \varphi(n)$.
The public key is modulus $n$ and encryption integer $e$. The private key is $p$, $q$ and
decryption integer $d$. The plaintext is first encoded as a word over alphabet $\{0, 1, \dots, 9\}$, then
divided into blocks of length $i - 1$ where $10^{i-1} < n < 10^i$. Each block is then taken as an
integer and encrypted. The encryption of a plaintext $w$ is the cryptotext $c = w^e \bmod n$. The
decryption of a cryptotext $c$ is the plaintext $w = c^d \bmod n$.
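A toy instance shows the key generation, encryption and decryption described above. The primes and the exponent $d$ below are small illustrative values and offer no security; the modular inverse via `pow(d, -1, phi)` needs Python 3.8 or newer.

```python
from math import gcd

p, q = 61, 53                 # toy primes (illustration only)
n = p * q
phi = (p - 1) * (q - 1)       # Euler's function of n

d = 2753                      # decryption exponent with gcd(d, phi) = 1
assert gcd(d, phi) == 1
e = pow(d, -1, phi)           # encryption exponent e = d^(-1) mod phi(n)

w = 65                        # a plaintext block, taken as an integer smaller than n
c = pow(w, e, n)              # encryption: c = w^e mod n
w_decrypted = pow(c, d, n)    # decryption: w = c^d mod n
print(n, e, c, w_decrypted)
```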
One of the key problems for the development of RSA cryptosystem is the prime
recognition. Rabin-Miller's prime recognition algorithm is based on the following result of number
theory.
Let $n \in N$. For $1 \le x \le n$, $C(x)$ denotes the following condition: "Either $x^{n-1} \not\equiv 1
\pmod n$ or there is an $m = \frac{n-1}{2^i}$
for some $i$, such that $\gcd(n, x^m - 1) \ne 1$". If $C(x)$ holds for
some $1 \le x \le n$, then $n$ is not a prime. If $n$ is not a prime, then $C(x)$ holds for at least half of
$x$ between 1 and $n$.
The algorithm goes as follows:
• If $C(x_i)$ holds for some $x_i$ then $n$ is not a prime for sure. Otherwise $n$ is prime with the
probability of error $2^{-m}$.
5.6 Exercises
Exercise 5.1
1. Compute $7^{120007} \bmod 143$ by hand. (Use Chinese Remainder Theorem and Fermat's
Little Theorem.)
Solution 5.1.1
Let $x$ be a number such that $x \equiv 7^{120007} \pmod{143}$. Then we have two congruences:
Exercise 5.2
Alice and Bob computed a secret key $k$ using Diffie-Hellman protocol with $p = 467$, $q = 4$,
$x = 400$ and $y = 134$. Later they computed another secret key $k'$ with the same $p$, $q$, $y$ and
with $x' = 167$. They became very surprised after finding that $k = k'$. Determine the value of
both keys and explain why the keys are identical.
Solution 5.2.1
The secret key $k$ is computed as $k = q^{xy} \bmod p$. For the values $p = 467$, $q = 4$, $x = 400$ and
$y = 134$ we have $k = 161$. When we get the same value of the secret key after choosing another
$x$, say $x' = 167$, it means that $q^{xy} \bmod p = k = k' = q^{x'y} \bmod p$.
Euler's Totient Theorem says that $n^{\varphi(m)} \equiv 1 \pmod m$ whenever $\gcd(n, m) = 1$. Its corollary is that
$n^{i\varphi(m)+t} \equiv n^t \pmod m$. According to this theorem and corollary we have, for a common exponent $t$,
$k = q^{xy} \bmod p = q^{i\varphi(p)+t} \bmod p = q^t \bmod p = q^{j\varphi(p)+t} \bmod p = q^{x'y} \bmod p = k'$.
Because $p$ is a prime, $\varphi(p) = p - 1$ and we
can see that $xy \equiv x'y \pmod{p - 1}$: $xy = 400 \cdot 134 \equiv 10 \pmod{466}$ and $x'y = 167 \cdot 134 \equiv 10
\pmod{466}$.
Now it is easy to see that $k = 4^{i\varphi(467)+10} \bmod 467 = 4^{10} \bmod 467 = 161 = k'$.
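The numbers in this solution can be checked with modular exponentiation. The sketch uses only the values given in the exercise.

```python
p, q = 467, 4
x, y, x_prime = 400, 134, 167

k = pow(q, x * y, p)                # first key
k_prime = pow(q, x_prime * y, p)    # second key

# Both exponents are congruent modulo phi(p) = p - 1.
exponent_1 = (x * y) % (p - 1)
exponent_2 = (x_prime * y) % (p - 1)
print(k, k_prime, exponent_1, exponent_2)
```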
Exercise 5.3
Solution 5.3.1
Table 5.1: The design of the run of modified Diffie-Hellman key exchange protocol
secret key, but it is easy for her to compute $q^{-1} \bmod p$ such that $q^{-1}q \equiv 1 \pmod p$. She also
knows that $qk = (qx)(qy) \bmod p$. Here she gets $k = q^{-1} \cdot qk = q^{-1}(qx)(qy) \bmod p = qxy \bmod p$.
Exercise 5.4
Consider RSA cryptosystem. Is it possible to decrypt a ciphertext by repeated encryption of
the ciphertext?
Exercise 5.5
Consider the following modification of RSA cryptosystem.
Public key is a pair $(n = pq, e)$ which is defined in the same way as in the standard RSA.
Private key is a quintuple $(p, q, d_p, d_q, q_{inv})$ where $d_p = e^{-1} \bmod (p - 1)$, $d_q = e^{-1} \bmod (q - 1)$
and $q_{inv} = q^{-1} \bmod p$. Message $m$ is encrypted by computing $c = m^e \bmod n$. Decryption is
realized by computing numbers $m_p = c^{d_p} \bmod p$, $m_q = c^{d_q} \bmod q$ and $h = q_{inv}(m_p - m_q) \bmod
p$ from which the original message $m = m_q + hq$ can be reconstructed. Show correctness of
the described cryptosystem.
Solution 5.5.1
In plain RSA it holds that $m \equiv c^d \pmod n$. We can see that $d_p = d \bmod (p - 1)$ and $d_q =
d \bmod (q - 1)$. Since
$c^d \equiv c^{d \bmod (p-1)} \equiv c^{d_p} \pmod p$ and
$c^d \equiv c^{d \bmod (q-1)} \equiv c^{d_q} \pmod q$
we can see that
$m \equiv m^{ed} \equiv c^d \equiv c^{d_p} \equiv m_p \pmod p$ and
$m \equiv m^{ed} \equiv c^d \equiv c^{d_q} \equiv m_q \pmod q$.
Now, we need to show that message $m' = m_q + hq$ is the original message. To do that we
need to verify whether $m' \equiv m_p \pmod p$ and $m' \equiv m_q \pmod q$. We can easily see that
$m' = m_q + hq \equiv m_q \pmod q$.
To show that $m' \equiv m_p \pmod p$ we will start with $m_p - m_q \equiv m_p - m_q \pmod p$. Because
$q q_{inv} \equiv 1 \pmod p$, we get
$q_{inv}(m_p - m_q)q \equiv m_p - m_q \pmod p$, where $h = q_{inv}(m_p - m_q) \bmod p$,
and therefore
$m' = m_q + hq \equiv m_p \pmod p$.
Because $0 \le m, m' < pq$, $m' \equiv m \pmod p$ and $m' \equiv m \pmod q$, it must hold by the Chinese
Remainder Theorem that $m = m'$ and hence the modified cryptosystem is correct.
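The correctness argument can be tested exhaustively on a toy key. The sketch below assumes the small parameters $p = 61$, $q = 53$, $e = 17$ (illustrative only) and follows the decryption steps of the exercise; the modular inverses via `pow(a, -1, m)` need Python 3.8 or newer.

```python
p, q, e = 61, 53, 17          # toy parameters (illustration only)
n = p * q

d_p = pow(e, -1, p - 1)       # d_p = e^(-1) mod (p - 1)
d_q = pow(e, -1, q - 1)       # d_q = e^(-1) mod (q - 1)
q_inv = pow(q, -1, p)         # q_inv = q^(-1) mod p

def decrypt_crt(c):
    """Decrypt c with the private key (p, q, d_p, d_q, q_inv)."""
    m_p = pow(c, d_p, p)
    m_q = pow(c, d_q, q)
    h = (q_inv * (m_p - m_q)) % p
    return m_q + h * q

# Encrypt and decrypt every possible message 0 <= m < n.
all_correct = all(decrypt_crt(pow(m, e, n)) == m for m in range(n))
print(all_correct)
```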
Exercise 5.6
We want to set up RSA cryptosystem in a network of n users.
2. Now consider we want to reduce this number by generating a smaller pool of prime
numbers and making combinations of two of these primes (for each user we pick up a
new pair). How is security of RSA cryptosystem affected?
Solution 5.6.1
1. We need two primes for each user, so we need to generate 2n prime numbers.
2. When there are only $k$ prime numbers, we can calculate $\gcd(n_i, n_j)$ where $n_i, n_j$ are
the public moduli of some users $U_i$ and $U_j$ of the network. When $n_i \ne n_j$ and $\gcd(n_i, n_j) = x > 1$,
then $x$ is a prime shared by both moduli and we can
compute such $y$ and $y'$ that $xy = n_i$ and $xy' = n_j$. Because $e_i$ and $e_j$ are known, we can
also compute $d_i = e_i^{-1} \bmod (x - 1)(y - 1)$ and $d_j = e_j^{-1} \bmod (x - 1)(y' - 1)$.
Now, we know the private keys of users $U_i$ and $U_j$ of the network and we can read
messages addressed to them. This cryptosystem is not secure.
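The attack needs nothing more than a greatest common divisor. In the sketch below two users share the prime 61; all numbers, including the public exponent and the test message, are toy values invented for illustration.

```python
from math import gcd

# Public moduli of two users whose key pairs share one prime (toy values).
n_i = 61 * 53
n_j = 61 * 59
e_i = 17                      # public exponent of user U_i

x = gcd(n_i, n_j)             # the shared prime
y = n_i // x                  # the other prime factor of n_i

# With both primes known, the private exponent follows immediately.
d_i = pow(e_i, -1, (x - 1) * (y - 1))

message = 42
ciphertext = pow(message, e_i, n_i)
recovered = pow(ciphertext, d_i, n_i)
print(x, y, recovered)
```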
Chapter 6
Rabin cryptosystem is based on the discrete square root problem. To design Rabin
cryptosystem, we need to find two primes $p$ and $q$ of the form $4k + 3$ (i.e. $p \equiv q \equiv 3 \pmod 4$). The
public key is the number $n = pq$, primes $p$ and $q$ are kept secret. We can see that $n$ is a Blum
integer, which is important because of its nice properties.
The encryption of plaintext $w < n$ is the cryptotext $c = w^2 \bmod n$. The decryption of
cryptotext $c$ is done when all square roots of $c$ modulo $n$ are found. Because $n$ is a Blum
integer, the square roots of $c$ modulo $p$ are $\pm c^{(p+1)/4} \bmod p$ and the square roots of $c$ modulo $q$
are $\pm c^{(q+1)/4} \bmod q$; combining them by the Chinese Remainder Theorem gives the four square
roots of $c$ modulo $n$, and $w$ is one of them. In
case the plaintext $w$ is a meaningful text, it should be easy to determine the value of $w$.
However, if $w$ is a random string, for example a secret key, it is impossible to determine
which of the four square roots is $w$.
to calculate $\frac{b}{a^x} \bmod p = ba^{-x} \bmod p = w$. The security of ElGamal cryptosystem is based
on the discrete logarithm problem. As we can see, the cryptosystem is not secure under a
chosen cryptotext attack: for an encryption $c = (a, b)$ of message $m$ we can easily construct
an encryption $c' = (a, 2b)$ of message $2m$.
6.3 Exercises
Exercise 6.1
Show that with a chosen-ciphertext attack on RSA cryptosystem one can decrypt an arbitrary
ciphertext with only one query.
6. O THER P UBLIC K EY C RYPTOSYSTEMS
Our task is to decrypt an arbitrary ciphertext $c$ using the chosen-ciphertext attack. Say the
ciphertext $c$ was sent to Alice, whose public key is $(n, e)$.
We need to choose a random integer $r \in Z_n^*$ and compute $c' = r^e c \bmod n$ where $e$ is
Alice's public exponent. We send $c'$ to Alice and she sends back $d(c') = c'^d \bmod n$. Because $c'^d \bmod
n = r^{ed}c^d \bmod n = rw \bmod n$ we have the message we are looking for multiplied by $r$. To
get our message $w$, we need to calculate $r^{-1}c'^d \bmod n = r^{-1}rw \bmod n = w$.
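The single-query attack can be simulated with a toy key. In this sketch the function `alice_decrypt` stands in for Alice answering the chosen-ciphertext query; the key, the message and the blinding value $r$ are illustrative assumptions.

```python
from math import gcd

# Alice's toy RSA key (illustration only).
n, e, d = 3233, 17, 2753

def alice_decrypt(ciphertext):
    """Hypothetical decryption oracle: Alice decrypts whatever she is sent."""
    return pow(ciphertext, d, n)

w = 1234                       # the message the attacker wants to learn
c = pow(w, e, n)               # the intercepted ciphertext

r = 7                          # attacker's value, must be invertible modulo n
assert gcd(r, n) == 1
c_blinded = (pow(r, e, n) * c) % n      # c' = r^e * c mod n, looks unrelated to c
answer = alice_decrypt(c_blinded)       # Alice returns r * w mod n
recovered = (pow(r, -1, n) * answer) % n
print(recovered)
```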
Exercise 6.2
What is the probability that two students of IV054 have the same birthday (74 students at-
tend IV054 course at this moment)?
Solution 6.2.1
Let $p(n)$ be the probability that two out of $n$ students of IV054 have the same birthday.
Let $\bar{p}(n) = 1 - p(n)$ be the probability that no two students have the same birthday. Then
the probability $p(n)$ is computed as $p(n) = 1 - \bar{p}(n)$. The probability $\bar{p}(n)$ is computed as
follows:
$\bar{p}(n) = \frac{365(365 - 1) \cdots (365 - n + 1)}{365^n} = \frac{365!}{365^n (365 - n)!}$
For $n = 74$ we have $\bar{p}(74) = \frac{365!}{365^{74}(365 - 74)!} \approx 0.00035$ and $p(74) = 1 - \bar{p}(74) \approx 1 -
0.00035 \approx 0.99965$.
The probability that two students of IV054 have the same birthday is greater than 99.9%,
although it cannot be 100% unless there are at least 366 people attending the course.
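The value for 74 students is quickly confirmed numerically. The sketch multiplies the factors $(365 - i)/365$ one by one, which avoids the huge factorials; the function name is illustrative.

```python
def prob_no_shared_birthday(n, days=365):
    """Probability that n people all have different birthdays."""
    prob = 1.0
    for i in range(n):
        prob *= (days - i) / days
    return prob

p_bar_74 = prob_no_shared_birthday(74)
p_74 = 1 - p_bar_74
print(round(p_bar_74, 5), round(p_74, 5))
```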
Exercise 6.3
Prove or disprove the following implication. Let g, h be generators of the group (Z∗p , ·) where
p is an odd prime. Suppose g 2u ≡ h2v (mod p). Then g u ≡ hv (mod p).
Solution 6.3.1
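The implication does not hold. Choose the prime p = 7; the group
$(Z_7^*, \cdot)$ has two generators, g = 3 and h = 5. Choosing u = 1 and v = 2 we get
$g^{2u} = 3^2 = 9 \equiv 2 \pmod 7$ and $h^{2v} = 5^4 = 625 \equiv 2 \pmod 7$, but $g^u = 3$ while $h^v = 5^2 = 25 \equiv 4 \pmod 7$, so $g^u \not\equiv h^v \pmod 7$.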
Exercise 6.4
Let p be a large prime, g a generator of the group $(Z_p^*, \cdot)$ and $y = g^x \bmod p$. Show that it is
possible to find the least significant bit of x by computing $y^{(p-1)/2} \bmod p$.
Finding the least significant bit of x is equivalent to finding the parity of x. Because g is a generator
of the group $(Z_p^*, \cdot)$, we have $Z_p^* = \{g^i \mid 1 \le i \le p - 1\}$. Since $\varphi(p) = p - 1$, the expression $y^{(p-1)/2} \bmod p$ can be rewritten as
$g^{x\varphi(p)/2} \bmod p$. Now we need to discuss two cases:
x = 2k: If x is even then the least significant bit is 0. By Euler's totient theorem, $y^{(p-1)/2} = g^{x\varphi(p)/2} = g^{2k\varphi(p)/2} = g^{k\varphi(p)} \equiv 1 \pmod p$.
x = 2k + 1: If x is odd then the least significant bit is 1. We can see that $y^{(p-1)/2} = g^{x\varphi(p)/2} = g^{(2k+1)\varphi(p)/2} = g^{k\varphi(p)} g^{\varphi(p)/2} \not\equiv 1 \pmod p$, because g is a generator of the group $Z_p^*$ and therefore has order $\varphi(p)$.
Hence x is even if and only if $y^{(p-1)/2} \equiv 1 \pmod p$.
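The parity test can be tried exhaustively on a small group. The sketch below assumes the hypothetical toy parameters p = 23 and g = 5 (checked in the code to be a generator); they are not part of the exercise.

```python
def is_generator(g, p):
    """True if the powers of g cover all of Z_p^* (p prime, small)."""
    return len({pow(g, i, p) for i in range(1, p)}) == p - 1

def lsb_of_dlog(y, p):
    """Least significant bit of x, where y = g^x mod p for a generator g."""
    return 0 if pow(y, (p - 1) // 2, p) == 1 else 1

p, g = 23, 5                     # toy parameters (assumption)
assert is_generator(g, p)

# Compare the predicted bit with the true parity for every exponent.
bits_ok = all(lsb_of_dlog(pow(g, x, p), p) == (x & 1) for x in range(1, p))
```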
Exercise 6.5
Suppose that D is any algorithm that computes one plaintext (of the four possible plaintexts)
corresponding to a valid cryptotext y. We choose a random $x \in Z_n^*$ and compute $y = x^2 \bmod n$.
Then we compute $x' = D(y)$ and we get $x^2 \equiv x'^2 \pmod n$. The probability that
$x \not\equiv \pm x' \pmod n$ is 0.5. In this case we have $x^2 - x'^2 = (x - x')(x + x') \equiv 0 \pmod n$ but neither
factor is congruent to 0 modulo n. Therefore $\gcd(x - x', n) = p$ or $q$ and the factorization of n is
obtained. After two attempts (on average) n is factored. Therefore any decryption algorithm
can be used to factor n efficiently.
After factoring n we can decipher any ciphertext using the normal decryption procedure of the Rabin
cryptosystem.
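The reduction can be sketched in code. The oracle below is a stand-in for the algorithm D; it is built from hypothetical toy primes p = 7 and q = 11 (both congruent to 3 modulo 4, an assumption that makes square roots easy to compute). The attacker function only sees n and the oracle.

```python
import math
import random

def make_oracle(p, q):
    """Build a decryption oracle D for n = p*q, with p, q = 3 (mod 4)."""
    n = p * q
    def D(y):
        rp = pow(y, (p + 1) // 4, p)          # a square root of y modulo p
        rq = pow(y, (q + 1) // 4, q)          # a square root of y modulo q
        # Combine the two roots with the Chinese remainder theorem.
        return (rp * q * pow(q, -1, p) + rq * p * pow(p, -1, q)) % n
    return D

def factor_with_oracle(n, D, rng):
    """Factor n using only the oracle D, as in the argument above."""
    while True:
        x = rng.randrange(2, n)
        g = math.gcd(x, n)
        if g != 1:
            return g                           # lucky guess: x shares a factor
        x1 = D(x * x % n)
        g = math.gcd(x - x1, n)
        if 1 < g < n:                          # happens when x != +-x1 (mod n)
            return g

factor = factor_with_oracle(77, make_oracle(7, 11), random.Random(1))
```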
Exercise 6.6
Let p be a 1024-bit prime. Let g have order q in $(Z_p^*, \cdot)$, where q is a 160-bit prime. Consider
the following modification of the ElGamal cryptosystem. The private key x is a randomly chosen
element of {1, . . . , q − 1}. The public key is $y = g^x \bmod p$. A message m is encrypted by computing the
pair
\[ c = (g^r \bmod p,\; y^r g^m \bmod p), \]
3. Computing discrete logarithms is hard in $(Z_p^*, \cdot)$. In general, the receiver is not able
to recover m from $g^m \bmod p$. Assume the sender only sends messages from the set
{0, . . . , 100}. Show that the receiver can recover m.
5. Suppose the receiver is conducting an auction in which two bidders encrypt their bids
using the scheme described above. Suppose also that both bidders can bid at most
$100. The bidder who goes second can eavesdrop messages between the receiver and
the first bidder. Show that he can almost always bid $1 more than the first bidder (even
without knowing the value of his bid).
Solution 6.6.1
1. Because q is a prime and it is the order of some element of $Z_p^*$, q must be a factor of
p − 1, because the order of every group element divides the size of the group. To get
an element of order q we randomly choose a number x ≠ 1 from 1 to p − 1, and when it
holds $x^q \equiv 1 \pmod p$ then x can serve as g. There are more elements of order q, not only g.
Because q is a prime we do not need to check whether x has a smaller order: q has no
nontrivial factors.
2. For a cryptotext (a, b) the receiver computes
\[ \frac{b}{a^x} = \frac{y^r g^m}{g^{rx}} = \frac{g^{rx} g^m}{g^{rx}} = g^m \pmod p. \]
Because a is an element of a group, there must be an inverse $a^{-1}$ in the group;
multiplication by this inverse is used instead of division by a.
3. Because the set M = {0, . . . , 100} is finite, we can calculate the values of $g^a$ for each
a ∈ M in a finite time. We save the results in a table as pairs $(g^a, a)$. Then we can
easily search the table for $g^m$ and find the corresponding m.
4. Let $c_0$ be the cryptotext for message $m_0 = m_1 + m_2 \bmod q$ and let $c_i$ be the cryptotext
for message $m_i$, then $c_0 = c_1 \cdot c_2$, where the pairs are multiplied componentwise:
\[ c_1 \cdot c_2 = (g^{r_1} g^{r_2},\; y^{r_1} g^{m_1} y^{r_2} g^{m_2}) = (g^{s},\; y^{s} g^{m_1 + m_2}) \bmod p, \quad s = r_1 + r_2. \]
This is correct because $r_1$, $r_2$ are random numbers and so is s, and $g^{m_1 + m_2} \equiv g^{(m_1 + m_2) \bmod q} \pmod p$:
writing $m_1 + m_2 = kq + l$ with $(m_1 + m_2) \bmod q = l$, we get $g^{kq + l} = g^{kq} g^l = g^l$
because q is the order of g.
5. The second bidder needs to calculate c1 for message m1 = 1. When he eavesdrops the
encrypted bid bi he just calculates c1 · bi and sends it to the auctioneer. He bids $1 more
than the first bidder except for the case that the first bidder bids $100. In this case, the
second bid is not valid.
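Parts 3 to 5 of the solution can be sketched together. The parameters p = 2039, q = 1019, g = 4 are hypothetical toy values (q divides p − 1 and g has order q); the function names are illustrative and the fixed seed only makes the run repeatable.

```python
import random

p, q, g = 2039, 1019, 4        # toy parameters (assumption): q | p-1, ord(g) = q

def keygen(rng):
    x = rng.randrange(1, q)                    # private key
    return x, pow(g, x, p)                     # (x, y = g^x mod p)

def encrypt(y, m, rng):
    r = rng.randrange(1, q)
    return pow(g, r, p), (pow(y, r, p) * pow(g, m, p)) % p

# Part 3: table of g^m for the 101 admissible messages.
TABLE = {pow(g, m, p): m for m in range(101)}

def decrypt(x, c):
    a, b = c
    gm = (b * pow(a, -x, p)) % p               # b / a^x = g^m
    return TABLE.get(gm)                       # None if m is outside 0..100

def outbid(y, c, rng):
    """Part 5: turn an encryption of m into an encryption of m + 1."""
    a1, b1 = encrypt(y, 1, rng)
    return (c[0] * a1) % p, (c[1] * b1) % p    # part 4: componentwise product

rng = random.Random(2006)
x, y = keygen(rng)
first_bid = encrypt(y, 57, rng)
second_bid = outbid(y, first_bid, rng)
```

The second bidder never learns the value 57; he only multiplies ciphertexts.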
Chapter 7
Digital Signature
Digital signatures are one of the most important applications of modern cryptography. Digital
signatures are such that each user is able to verify signatures of other users, but that gives
him no information about how to sign a message on behalf of other users.
An important difference from a handwritten signature is that the digital signature of a message
is intimately connected with the message and differs for different messages, whereas the
handwritten signature is appended to the message and always looks the same.
Technically, a digital signature is produced by a signing algorithm and it is verified by
a verification algorithm. Any public key cryptosystem in which the plaintext
space and cryptotext space are the same can be used.
The signature of message w, denoted sig(w), is $d_U(w)$, so that everyone can verify that the
message was sent by user U. If the signature is important only for user V, then the signature
is computed as $e_V(d_U(w))$. Now only user V can verify that the message was signed by user
U.
Digital signature allows anyone to verify signature of sender S without providing any infor-
mation about generating signatures of S.
A digital signature scheme (M, S, Ks , Kv ) is given by a set of messages to be signed (M ),
a set of possible signatures (S), a set of private keys for signing (Ks ) and a set of public keys
for verification (Kv ).
It is required that for each key k from $K_s$ there exists a single and easy to compute
signing mapping $sig_k : \{0, 1\}^* \times M \to S$, and for each key k from $K_v$ there exists a single
and easy to compute verification mapping $ver_k : M \times S \to \{true, false\}$ such that the
following conditions are satisfied:
Correctness: For a message m ∈ M and public key $k \in K_v$, it holds $ver_k(m, s) = true$ if
there is an $r \in \{0, 1\}^*$ such that $s = sig_l(r, m)$ for a private key $l \in K_s$ corresponding
to the public key k.
Total break: The adversary manages to recover secret key from the public key.
Universal forgery: The adversary can derive from the public key an algorithm which allows
him to forge signature of any message.
Selective forgery: The adversary can derive from the public key a method to forge signatures
of selected messages (where the selection was made prior to the knowledge of the
public key).
Existential forgery: The adversary is able to create from the public key a valid signature of
some message m (but has no control for which m).
Let us have an RSA cryptosystem with encryption and decryption exponents e and d. The
signature of message w is a pair $s = (w, \sigma)$ where $\sigma = w^d \bmod n$. The signature s is valid
if $\sigma^e \equiv w \pmod n$.
There are some known attacks on this scheme. The forger can take any w and use the public key e
to compute $w^e \bmod n$; a signature of this message is $s = (w^e, w)$. Everybody who verifies the
signature finds out that the signature is valid. The forger has no control over the content of
the message $w^e$; it is an example of existential forgery.
Another attacker can produce some new valid signatures without the knowledge of the secret
key: when he obtains signatures $s_1 = (w_1, \sigma_1)$ and $s_2 = (w_2, \sigma_2)$, he can compute
valid signatures of messages $w_1 w_2$ and $w_1^{-1}$. The signatures are $s_{12} = (w_1 w_2, \sigma_1 \sigma_2)$ and
$s' = (w_1^{-1}, \sigma_1^{-1})$.
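Both attacks can be reproduced on a toy key. The sketch below assumes the small modulus n = 9797 = 97 · 101 with e = 131; the secret exponent is computed only to play the role of the honest signer, and the messages 20 and 33 are arbitrary illustrative choices.

```python
n, e = 9797, 131                      # public key, n = 97 * 101
d = pow(e, -1, 96 * 100)              # signer's secret exponent

def sign(w):
    return pow(w, d, n)

def verify(w, sigma):
    return pow(sigma, e, n) == w % n

# Existential forgery: choose the signature first, derive the message from it.
sigma = 1234
existential = (pow(sigma, e, n), sigma)          # (message, signature)

# Multiplicative forgeries from two legitimately signed messages.
w1, w2 = 20, 33
s1, s2 = sign(w1), sign(w2)
product_forgery = ((w1 * w2) % n, (s1 * s2) % n)
inverse_forgery = (pow(w1, -1, n), pow(s1, -1, n))
```

None of the three forged pairs required the secret exponent d.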
The public key for the ElGamal signature scheme is K = (p, q, y) where p is a prime, q is a
primitive element of $Z_p^*$ and $y = q^x \bmod p$. The integer $1 \le x < p$ is the secret key and is used
for signing messages.
To create the signature s of a message m we need to choose a random integer $r \in Z_{p-1}^*$.
The signature is $s = sig(m, r) = (a, b)$ where $a = q^r \bmod p$ and $b = (m - ax) r^{-1} \bmod (p - 1)$.
The signature s = (a, b) of message m is valid if $y^a a^b \equiv q^m \pmod p$.
There are ways of producing (using the ElGamal signature scheme) valid forged signatures,
but they do not allow the forger to create a signature of a message of his choice (see Exercise
7.2). There are also several ways of breaking ElGamal signatures if the scheme is
not used carefully enough. If the random integer r of some signature is known, the forger
can compute the secret key x and then forge signatures at will. Another misuse of the ElGamal
signature scheme is to use the same r to sign two messages. In such a case the secret key x
can be computed.
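Signing and verification can be sketched as follows. The prime p = 1367 and base q = 5 are small illustrative values; the secret key x = 123 and the nonce r = 5 are hypothetical, and a fixed nonce is used only to keep the example repeatable (a real signer must pick a fresh random r each time).

```python
import math

p, q = 1367, 5                 # prime modulus and base (toy values)
x = 123                        # hypothetical secret key
y = pow(q, x, p)               # public key component

def sign(m, r):
    assert math.gcd(r, p - 1) == 1           # r must be invertible mod p-1
    a = pow(q, r, p)
    b = ((m - a * x) * pow(r, -1, p - 1)) % (p - 1)
    return a, b

def verify(m, sig):
    a, b = sig
    # Check y^a * a^b = q^m (mod p).
    return (pow(y, a, p) * pow(a, b, p)) % p == pow(q, m, p)

sig = sign(1000, r=5)
```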
To sign a message w we need to choose a random integer k such that 0 < k < q and
gcd(k, q) = 1. The signature of message w is $s = sig(w, k) = (a, b)$, where $a = (r^k \bmod p) \bmod q$
and $b = k^{-1}(w + xa) \bmod q$, where $k k^{-1} \equiv 1 \pmod q$. The signature s = (a, b) is
valid if $(r^{u_1} y^{u_2} \bmod p) \bmod q = a$, where $u_1 = wz \bmod q$, $u_2 = az \bmod q$ and $z = b^{-1} \bmod q$.
A subliminal channel is a covert communication channel among a set of users such that
everybody can see the messages they exchange, but nobody without the shared secret information
can read the hidden content. The secret
message is hidden in the sent message and its signature.
To set up a subliminal channel, the users need to choose a large n and an integer k such
that gcd(k, n) = 1. The public key is a pair (n, h), where $h = k^{-2} \bmod n = (k^{-1})^2 \bmod n$.
The secret key shared by all subliminal channel users is k.
When some user wants to send a secret message w, he needs to choose another, harmless
message $w'$. The messages have to be such that gcd(w, n) = 1 and gcd($w'$, n) = 1. The
signature is $s = (S_1, S_2)$, where
\[ S_1 = \frac{1}{2}\left(\frac{w'}{w} + w\right) \bmod n \quad\text{and}\quad S_2 = \frac{k}{2}\left(\frac{w'}{w} - w\right) \bmod n. \]
For everybody else, s is just a signature of message $w'$; for the subliminal channel
users, the message $w'$ and its signature s are just the numbers needed to obtain the hidden message
w.
The signature of message $w'$ is valid if $S_1^2 - h S_2^2 \bmod n = w'$. The hidden message w can
be obtained by computing $w' (S_1 + k^{-1} S_2)^{-1} \bmod n$.
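A small numeric sketch of the channel follows. The modulus n = 9797, the shared key k = 5 and the two messages are hypothetical toy values; n must be odd so that 2 is invertible modulo n.

```python
import math

n = 9797                               # toy modulus (assumption), odd
k = 5                                  # secret shared by the channel users
h = pow(pow(k, -1, n), 2, n)           # public value h = k^(-2) mod n
inv2 = pow(2, -1, n)

def sign(w_cover, w_secret):
    """Sign the harmless message w_cover while hiding w_secret."""
    assert math.gcd(w_cover, n) == 1 and math.gcd(w_secret, n) == 1
    t = (w_cover * pow(w_secret, -1, n)) % n        # w'/w
    s1 = (inv2 * (t + w_secret)) % n
    s2 = (k * inv2 * (t - w_secret)) % n
    return s1, s2

def verify(w_cover, sig):
    s1, s2 = sig
    return (s1 * s1 - h * s2 * s2) % n == w_cover % n

def recover(w_cover, sig):
    """Extract the hidden message; requires the shared secret k."""
    s1, s2 = sig
    return (w_cover * pow((s1 + pow(k, -1, n) * s2) % n, -1, n)) % n

sig = sign(1234, 4321)
```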
Lamport signature scheme shows how to construct a signature scheme for only one use from
any one way function. This signature cannot be forged because we are unable to invert the
one way function. On the other hand, Lamport signature scheme can be used to sign only
one message.
Let k be a positive integer and let $P = \{0, 1\}^k$ be the set of messages. Let $f : Y \to Z$ be a
one way function where Y is a set of partial signatures, $Y = \{y_{ij} \mid 1 \le i \le k,\ j = 0, 1\}$, where
each $y_{ij}$ is chosen randomly, and $Z = \{z_{ij} \mid z_{ij} = f(y_{ij})\}$. The key K consists of f, Y and Z: Y is
the secret key; f and Z are public.
The signature of message $x = x_1 \ldots x_k \in P$ is $s = sig(x_1 \ldots x_k) = (y_{1x_1}, \ldots, y_{kx_k})$. The signature
s is denoted as $(a_1, \ldots, a_k)$. The signature s of message x is valid if $f(a_i) = z_{ix_i}$ for each
i ∈ {1, . . . , k}.
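The scheme translates almost directly into code. In this sketch SHA-256 plays the role of the one way function f, which is an assumption made for illustration; the message length K = 8 and all names are hypothetical.

```python
import hashlib
import secrets

K = 8                                   # message length in bits (toy value)

def f(y: bytes) -> bytes:
    """Stand-in one way function (assumption: SHA-256)."""
    return hashlib.sha256(y).digest()

def keygen(k=K):
    # Y[i][j] is the secret y_ij, Z[i][j] = f(y_ij) is public.
    Y = [[secrets.token_bytes(32) for _ in (0, 1)] for _ in range(k)]
    Z = [[f(y) for y in pair] for pair in Y]
    return Y, Z

def sign(Y, x):
    # Reveal y_{i, x_i} for every bit x_i of the message.
    return [Y[i][bit] for i, bit in enumerate(x)]

def verify(Z, x, sig):
    return len(sig) == len(x) and all(
        f(a) == Z[i][bit] for i, (a, bit) in enumerate(zip(sig, x)))

Y, Z = keygen()
x = [1, 0, 1, 1, 0, 0, 1, 0]
sig = sign(Y, x)
```

Each signature reveals half of the secret key, which is why the key must not be reused.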
7.8 Exercises
Exercise 7.1
Consider the DSA signature scheme. Show that it is possible to recover the secret key in the
following situations.
1. A signer has precomputed one pair k, a with $a = (r^k \bmod p) \bmod q$ and always uses
this pair to sign his messages.
2. A signer creates a signature (a, 0) for some message w.
Solution 7.1.1
1. When we eavesdrop two messages $w_1$ and $w_2$ and their signatures $(a, b_1)$ and $(a, b_2)$
sent by the signer, we can calculate the value of k, one of the secret keys. Suppose that
$b_1 > b_2$ and $k k^{-1} \equiv 1 \pmod q$, then we have:
\[ \begin{aligned}
b_1 &= k^{-1}(w_1 + xa) \bmod q \\
b_2 &= k^{-1}(w_2 + xa) \bmod q \\
b_1 - b_2 &= k^{-1}(w_1 + xa) - k^{-1}(w_2 + xa) \bmod q \\
k(b_1 - b_2) &\equiv w_1 + xa - w_2 - xa \equiv w_1 - w_2 \pmod q \\
k &= (w_1 - w_2)(b_1 - b_2)^{-1} \bmod q
\end{aligned} \]
where $(b_1 - b_2)(b_1 - b_2)^{-1} \equiv 1 \pmod q$. Because q is a prime and $0 < b_1 - b_2 < b_1 < q$
we know that $\gcd(b_1 - b_2, q) = 1$. Hence we can calculate $(b_1 - b_2)^{-1}$ using the Extended
Euclidean algorithm and Bezout's identity.
With the knowledge of the value of k we can recover x:
\[ \begin{aligned}
b_1 &= k^{-1}(w_1 + xa) \bmod q \\
k b_1 &= w_1 + xa \bmod q \\
k b_1 - w_1 &\equiv xa \pmod q \\
x &= (k b_1 - w_1) a^{-1} \bmod q
\end{aligned} \]
where $a a^{-1} \equiv 1 \pmod q$. The value of $a^{-1}$ can be calculated the same way as described
above. Now we know the value of the secret key x and we can send messages
pretending to be someone else.
2. When we send a message w with signature (a, 0), anybody can compute the value of
our secret key x:
\[ \begin{aligned}
0 = b &= k^{-1}(w + xa) \bmod q \\
k \cdot 0 = 0 &= w + xa \bmod q \\
xa &\equiv -w \pmod q \\
x &= -w a^{-1} \bmod q
\end{aligned} \]
where $a a^{-1} \equiv 1 \pmod q$. Whoever did this calculation can now send messages with
our signature.
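The first attack can be replayed numerically. The sketch assumes hypothetical toy DSA parameters (p = 2039, q = 1019, r = 4 of order q), a hypothetical secret key x = 321 and the reused value k = 7; the attacker function uses only the two messages and signatures.

```python
p, q, r = 2039, 1019, 4        # toy parameters (assumption): q | p-1, ord(r) = q
x = 321                        # signer's secret key (hypothetical)
k = 7                          # the precomputed, reused value
a = pow(r, k, p) % q           # a = (r^k mod p) mod q

def sign(w):
    """Careless signer: always uses the same pair (k, a)."""
    return a, (pow(k, -1, q) * (w + x * a)) % q

def recover_key(w1, sig1, w2, sig2):
    """Recover (k, x) from two signatures that share the same a."""
    a1, b1 = sig1
    a2, b2 = sig2
    assert a1 == a2
    k_rec = ((w1 - w2) * pow((b1 - b2) % q, -1, q)) % q
    x_rec = ((k_rec * b1 - w1) * pow(a1, -1, q)) % q
    return k_rec, x_rec

w1, w2 = 100, 200
recovered = recover_key(w1, sign(w1), w2, sign(w2))
```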
Exercise 7.2
Consider the ElGamal digital signature scheme. A valid signature pair (a, b) for a random
message w can be constructed as follows:
$a = q^i y^j \bmod p,$
$b = -a j^{-1} \bmod (p - 1),$
$w = -a i j^{-1} \bmod (p - 1),$
1. Let p = 1367, q = 5 be the generator of Z∗p and y = 307 be the public key. Using
the construction described above find a signature and the corresponding message for
parameters i, j of your choice. Show derivation steps for equations for a, b and w.
Verify this signature.
Solution 7.2.1
1. We can choose for example i = 10 and j = 3. First we need to calculate $j^{-1}$ such that
$j j^{-1} \equiv 1 \pmod{p - 1}$. From the Extended Euclidean algorithm we have 1366 = 3 · 455 + 1,
that is, 1 = 1366 − 3 · 455. Hence 1 ≡ 3 · (−455) (mod 1366) and
$j^{-1} \equiv -455 \equiv 911 \pmod{1366}$.
Now, we can calculate a, b and w.
The signature (a, b) of message w is valid if $y^a a^b \equiv q^w \pmod p$. And we can see that
\[ \begin{aligned}
y^a a^b &= y^a (q^i y^j)^{-a j^{-1} \bmod (p-1)} \\
&= y^a q^{-a i j^{-1} \bmod (p-1)} y^{-a j j^{-1} \bmod (p-1)} \\
&= y^{a - a} q^{-a i j^{-1} \bmod (p-1)} \\
&= q^w .
\end{aligned} \]
2. This forgery attack can be prevented by signing only the hash of the message. The
forger can determine the signature only for random messages; he can never sign
a message of his choice, provided the security of the scheme was not broken.
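The construction can be evaluated mechanically for the parameters of the exercise (p = 1367, q = 5, y = 307) and the choice i = 10, j = 3 made in the solution. This is only a sketch for checking the arithmetic; the variable names are illustrative.

```python
p, q, y = 1367, 5, 307         # public parameters from the exercise
i, j = 10, 3                   # forger's choice, gcd(j, p-1) = 1

j_inv = pow(j, -1, p - 1)                       # inverse of j modulo p-1
a = (pow(q, i, p) * pow(y, j, p)) % p           # a = q^i * y^j mod p
b = (-a * j_inv) % (p - 1)                      # b = -a * j^(-1) mod (p-1)
w = (-a * i * j_inv) % (p - 1)                  # w = -a * i * j^(-1) mod (p-1)

# Verification equation of the ElGamal signature scheme.
valid = (pow(y, a, p) * pow(a, b, p)) % p == pow(q, w, p)
```

The secret key never appears: the forger obtains a valid triple (w, a, b) from public data only, but has no control over w.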
Exercise 7.3
Consider the RSA signature scheme with the public key (n = 9797, e = 131). Decide whether
the following signatures are valid.
Solution 7.3.1
The signature of message w using the RSA signature scheme is $sig(w) = w^d \bmod n$. We can verify
the signature by calculating $sig(w)^e \bmod n$. The signature is valid if $sig(w)^e \equiv w \pmod n$.
The pair (n, e) is the public key.
Exercise 7.4
Consider the following signature scheme. Alice chooses two large secret primes p, q and
computes their product n. She also chooses an element g ∈ {0, . . . , n − 1} such that g generates
a subgroup of order r in $(Z_n^*, \cdot)$, where r is a large prime.
Alice's public key is the pair (n, g), her private key is the number r.
To sign a message m, Alice finds x such that $xm \equiv 1 \pmod r$. Then she computes the
signature $s = g^x \bmod n$. Suppose Bob has received a pair (m, s) from Alice.
3. Show that if r is a factor of exactly one of these numbers then one can factor n using
only the public key.
Solution 7.4.1
2. The size of the group $(Z_n^*, \cdot)$ is $|Z_n^*| = \varphi(n) = (p - 1)(q - 1)$. Let a be an element of $Z_n^*$ and
let k be the order of a; then k divides $|Z_n^*|$, thus k divides (p − 1)(q − 1). In our case k = r
and r is a prime. That means that r divides (p − 1) or r divides (q − 1). In other words,
r is a factor of at least one of the numbers (p − 1), (q − 1).
3. Suppose r divides q − 1 and does not divide p − 1. Because $g^r \bmod n = 1$ it must also
hold that $g^r \bmod p = 1$. Since r is a prime, the order of g in $Z_p^*$ is r or 1.
Because r does not divide p − 1 (the size of the group $Z_p^*$), the order of g must be 1. Hence
g mod p = 1 and (g − 1) mod p = 0, so g − 1 and n have a
common divisor. Therefore, to factor n we merely have to compute the greatest common
divisor of the numbers g − 1 and n, and we get gcd(g − 1, n) = p.
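A tiny instance makes the argument concrete. The values p = 7, q = 23, r = 11 and g = 71 are hypothetical: r divides q − 1 = 22 but not p − 1 = 6, and g was chosen so that it has order r modulo n.

```python
import math

p, q, r = 7, 23, 11            # toy secret values (assumption)
n = p * q                      # public modulus 161
g = 71                         # public element, g = 1 (mod 7), g = 2 (mod 23)

# g generates a subgroup of prime order r in Z_n^*.
assert pow(g, r, n) == 1 and g % n != 1

# The attack uses the public key (n, g) only.
factor = math.gcd(g - 1, n)
```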
Exercise 7.5
Prove that the Ong-Schnorr-Shamir subliminal channel scheme is correct.
Solution 7.5.1
The Ong-Schnorr-Shamir subliminal channel scheme is correct. Because w and $w'$ are coprime
to n, we have $\gcd(n, w) = \gcd(n, w') = 1$. Therefore we can easily calculate $w^{-1} \bmod n$ and
$w'^{-1} \bmod n$ using the Extended Euclidean algorithm and Bezout's identity.
We can see that the verification is correct:
\[ \begin{aligned}
S_1 + k^{-1} S_2 \bmod n &= \frac{1}{2}\left(\frac{w'}{w} + w\right) + k^{-1} \frac{k}{2}\left(\frac{w'}{w} - w\right) \bmod n \\
&= \frac{1}{2}\left(\frac{w'}{w} + w + \frac{w'}{w} - w\right) \bmod n \\
&= \frac{2(w' w^{-1})}{2} \bmod n = w' w^{-1} \bmod n
\end{aligned} \]
And now, we can see that
Exercise 7.6
Assume that in the Lamport signature scheme two k-tuples, x and $x'$, were signed by Bob.
Let $f = d(x, x')$ be the Hamming distance of x and $x'$. How many new messages is an
adversary able to sign in such a case?
Chapter 8
Cryptography based on manipulating points of so-called elliptic curves is gaining momentum
and tends to replace public key cryptography based on the infeasibility of
factoring integers or of computing discrete logarithms.
The main advantage of elliptic curve cryptography is that to achieve a certain level of
security, shorter keys are required than in the case of classical cryptography. Using shorter keys
can result in savings in hardware implementations. The second advantage of elliptic curve
cryptography is that many attacks available for cryptography based on factorization and
discrete logarithm do not work for elliptic curve cryptography.
\[ E : y^2 = x^3 + ax + b, \]
where a, b are (for our purposes) either rational numbers or integers (mod n), we extend all
the points of the graph by a point at infinity, denoted as O, that can be regarded as sitting at
the top and the bottom of the y-axis at the same time. We consider only those elliptic curves that
have no multiple roots (i.e. $4a^3 + 27b^2 \ne 0$).
On an elliptic curve, addition of points can be defined in such a way that the points form an Abelian
group. If the line through two different points $P_1$ and $P_2$ of an elliptic curve E intersects
E in a point Q = (x, y), then we define $P_1 + P_2 = P_3 = (x, -y)$. If the line through two
different points $P_1$ and $P_2$ is parallel to the y-axis, then we define $P_1 + P_2 = O$. If $P_1 = P_2$
and the tangent to E in $P_1$ intersects E in a point Q = (x, y), then we define $P_1 + P_1 = P_3 =
(x, -y)$. It can be verified that the addition of points forms an Abelian group with O as
the identity element.
Addition of points $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$ of an elliptic curve $E : y^2 = x^3 + ax + b$
can be computed using the formula $P_1 + P_2 = P_3 = (x_3, y_3)$, where $x_3 = \lambda^2 - x_1 - x_2$ and
$y_3 = \lambda(x_1 - x_3) - y_1$ and $\lambda$ can be calculated as follows:
\[ \lambda = \begin{cases}
(y_2 - y_1)(x_2 - x_1)^{-1} & \text{if } P_1 \ne P_2 \\
(3x_1^2 + a)(2y_1)^{-1} & \text{if } P_1 = P_2
\end{cases} \]
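The addition formulas can be sketched for a curve over $Z_p$ with p prime, where the inverses are modular inverses. The representation of O as `None` and the sample curve $y^2 = x^3 + 2x + 1$ over $Z_7$ are choices made for this illustration.

```python
def ec_add(P, Q, a, p):
    """Add two points of y^2 = x^3 + ax + b over Z_p; None stands for O."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                            # vertical line: P + (-P) = O
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p     # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p      # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return x3, y3

# Sample curve (assumption): a = 2, b = 1 over Z_7.
double = ec_add((0, 1), (0, 1), 2, 7)
triple = ec_add(double, (0, 1), 2, 7)
```

The parameter b does not appear in the formulas; it only determines which points lie on the curve.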
8. E LLIPTIC C URVE C RYPTOGRAPHY AND FACTORIZATION
The points on an elliptic curve $E : y^2 = x^3 + ax + b \pmod n$ are the pairs (x, y) mod n
that satisfy the above equation, along with the point at infinity O. The addition of points on
an elliptic curve over a finite field is done the same way as described above. The number of
points on an elliptic curve over a finite field is limited by Hasse's theorem.
Hasse's theorem: If an elliptic curve E (mod n) has N points, then $|N - n - 1| < 2\sqrt{n}$.
Let E be an elliptic curve and A, B its points such that B = kA for some k. The task of finding
k is called the discrete logarithm problem for the elliptic curve E. No efficient algorithm for
solving the discrete logarithm problem for elliptic curves is known, and neither are any good general
attacks. Elliptic curve cryptography is based on these facts.
Every cryptosystem (protocol) based on discrete logarithm problem can be converted
into a cryptosystem (protocol) based on elliptic curves. The conversion goes as follows:
• To the point of an elliptic curve that results from the modified cryptosystem assign a
message (cryptotext).
8.5 Factorization
8.6 Exercises
Exercise 8.1
Factorize the following numbers (Do not use just brute force; describe computation steps.)
1. $2^{32} - 1$
2. $2^{64} - 1$
3. $3^{32} - 1$
Using the quadratic sieve method and properties of Fermat’s numbers, we get:
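The factorizations can be checked with a short script. The sketch below does not implement the quadratic sieve; it swaps in the elementary identity $x^2 - 1 = (x - 1)(x + 1)$, applied repeatedly, and finishes the small cofactors by trial division. Function names are illustrative.

```python
import math

def trial_division(m):
    """Prime factors of m (with multiplicity) by trial division."""
    factors = []
    d = 2
    while d * d <= m:
        while m % d == 0:
            factors.append(d)
            m //= d
        d += 1
    if m > 1:
        factors.append(m)
    return factors

def factor_power_minus_one(base, exp):
    """Factor base**exp - 1, exp a power of two, via x^2 - 1 = (x-1)(x+1)."""
    # base^exp - 1 = (base - 1)(base + 1)(base^2 + 1)...(base^(exp/2) + 1)
    parts = [base - 1]
    e = 1
    while e < exp:
        parts.append(base ** e + 1)
        e *= 2
    result = []
    for part in parts:
        result.extend(trial_division(part))
    return sorted(result)

f1 = factor_power_minus_one(2, 32)
f2 = factor_power_minus_one(2, 64)
f3 = factor_power_minus_one(3, 32)
```

For powers of two the parts are exactly the Fermat numbers $2^{2^i} + 1$, which is where their properties enter.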
Exercise 8.2
Let n > 1. If for every prime factor q of n − 1 there is an integer a such that
$a^{n-1} \equiv 1 \pmod n$ and $a^{(n-1)/q} \not\equiv 1 \pmod n$, then n is a prime.
In the first step of the Lucas test we find out whether a and n are coprime, in the second step
we test the order of a. If its order is equal to n − 1 then the size of the set $Z_n^*$ is n − 1 and
therefore n is a prime.
The number $2^{16}$ has only one prime factor, 2. And we can see that $3^{2^{16}} \equiv 1 \pmod{2^{16} + 1}$
and $3^{2^{16}/2} = 3^{2^{15}} \equiv -1 \pmod{2^{16} + 1}$. The number $2^{16} + 1$ passes the test and therefore it is a
prime.
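The two congruences are easy to confirm with modular exponentiation. The sketch below checks the Lucas conditions for one candidate base a; the function name and the argument `prime_factors` (the prime factors of n − 1, supplied by the caller) are illustrative.

```python
def lucas_test(n, a, prime_factors):
    """Check the Lucas conditions for base a; prime_factors are those of n-1."""
    if pow(a, n - 1, n) != 1:
        return False
    return all(pow(a, (n - 1) // q, n) != 1 for q in prime_factors)

n = 2 ** 16 + 1
passes = lucas_test(n, 3, [2])
half_power = pow(3, 2 ** 15, n)      # equals n - 1, i.e. -1 modulo n
```

A base that fails the second condition (for example a = 2 here) proves nothing by itself; the theorem only needs one suitable a for each prime factor q.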
Exercise 8.3
1. Compute A = n + 1 − ϕ(n).
2. Compute the roots of the equation $x^2 - Ax + n = 0$ and give explicit expressions for computing
p and q.
Solution 8.3.1
\[ \begin{aligned}
x^2 - Ax + n &= 0 \\
x^2 - (p + q)x + pq &= 0 \\
(x - p)(x - q) &= 0
\end{aligned} \]
3. When we know the values of n and ϕ(n), we can find the factors of n. For n = 15 049
and ϕ(n) = 14 800 we get:
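The computation for these values can be sketched as follows; it solves the quadratic equation $x^2 - Ax + n = 0$ with $A = n + 1 - \varphi(n)$ using integer arithmetic. The function name is illustrative.

```python
import math

def factor_from_phi(n, phi):
    """Recover p and q from n = p*q and phi = (p-1)(q-1)."""
    A = n + 1 - phi                  # A = p + q
    disc = A * A - 4 * n             # discriminant (p - q)^2
    s = math.isqrt(disc)
    assert s * s == disc, "phi does not match n"
    return (A - s) // 2, (A + s) // 2

p, q = factor_from_phi(15049, 14800)
```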
Exercise 8.4
Consider the finite field K = GF(7) = $Z_7$. An elliptic curve $E_{a,b}$ over K is defined by
3. For each point $P \in E_{2,1}$, compute −P and check that it lies on the curve as well.
Solution 8.4.1
1. To find all points of the curve $E_{2,1} : y^2 = x^3 + 2x + 1$ we need to find the squares of all
elements of $Z_7$:
e              0  1  2  3  4  5  6
e^2 mod 7      0  1  4  2  2  4  1
x              0  1  2  3  4  5  6
y^2 mod 7      1  4  6  6  3  3  5
For x = 0 we get points $P_1 = (0, 1)$ and $P_2 = (0, 6)$, for x = 1 we get another two
points $P_3 = (1, 2)$ and $P_4 = (1, 5)$. And there is one more point of $E_{2,1}$, the point
$P_0 = O$: $E_{2,1} = \{O, (0, 1), (0, 6), (1, 2), (1, 5)\}$. Because the curve has no multiple roots
($4a^3 + 27b^2 = 4 \cdot 2^3 + 27 = 59 \equiv 3 \not\equiv 0 \pmod 7$), its points form a group.
2. We should verify Hasse's theorem, and we can see that it holds:
\[ |N - p - 1| < 2\sqrt{p} \]
\[ |5 - 7 - 1| = 3 < 2 \cdot 2 < 2\sqrt{7} \]
3. For each point $P = (x, y) \in E_{2,1}$ we can find the point $-P = (x, -y \bmod 7)$:
P                 −P
$P_0 = O$         $-P_0 = O = P_0$
$P_1 = (0, 1)$    $-P_1 = (0, -1 \bmod 7) = (0, 6) = P_2$
$P_2 = (0, 6)$    $-P_2 = (0, -6 \bmod 7) = (0, 1) = P_1$
$P_3 = (1, 2)$    $-P_3 = (1, -2 \bmod 7) = (1, 5) = P_4$
$P_4 = (1, 5)$    $-P_4 = (1, -5 \bmod 7) = (1, 2) = P_3$
And we can see that all of these points lie on the curve $E_{2,1}$.
4. The group $(E_{2,1}, +)$ is isomorphic to the group $(Z_5, +)$. Let $f : E_{2,1} \to Z_5$ be the function
such that
$f(O) = 0$
$f((0, 1)) = 1$
$f((1, 5)) = 2$
$f((1, 2)) = 3$
$f((0, 6)) = 4$
We can see that f is a homomorphism ($f(a +_{E_{2,1}} b) = f(a) +_{Z_5} f(b)$) and we
can easily find the inverse homomorphism g:
$g(0) = O$
$g(1) = (0, 1)$
$g(2) = (1, 5)$
$g(3) = (1, 2)$
$g(4) = (0, 6)$
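Parts 1 to 3 of the solution can be cross-checked by brute force, since the field has only seven elements. In this sketch the point at infinity is represented by `None`, which is a choice made for the illustration.

```python
import math

p, a, b = 7, 2, 1              # the curve E_{2,1} over Z_7

# Part 1: all affine points (x, y) with y^2 = x^3 + 2x + 1 (mod 7), plus O.
affine = [(x, y) for x in range(p) for y in range(p)
          if (y * y - (x ** 3 + a * x + b)) % p == 0]
points = [None] + affine       # None stands for the point at infinity O
N = len(points)

# Part 2: Hasse's bound |N - p - 1| < 2*sqrt(p).
hasse_ok = abs(N - p - 1) < 2 * math.sqrt(p)

# Part 3: the negative of (x, y) is (x, -y mod p).
negatives = {P: (P[0], -P[1] % p) for P in affine}
```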
Exercise 8.5
Use the rho method with $f(x) = x^2 - 1$ and $x_0 = 5$ to find a factor of n = 7 031.
Solution 8.5.1
Using Pollard's rho method with function $f(x) = x^2 - 1$, $x_0 = 5$ and $x_{i+1} = f(x_i) \bmod n$ we proceed
as follows for n = 7031:
$x_1 = x_0^2 - 1 = 24$
$\gcd(x_1 - x_0, n) = \gcd(19, 7031) = 1$
$x_2 = x_1^2 - 1 = 575$
$\gcd(x_2 - x_0, n) = \gcd(570, 7031) = 1$, $\gcd(x_2 - x_1, n) = \gcd(551, 7031) = 1$
$x_3 = x_2^2 - 1 = 167$
$\gcd(x_3 - x_0, n) = \gcd(162, 7031) = 1$, $\gcd(x_3 - x_1, n) = \gcd(143, 7031) = 1$,
$\gcd(x_3 - x_2, n) = \gcd(6623, 7031) = 1$
$x_4 = x_3^2 - 1 = 6795$
$\gcd(x_4 - x_0, n) = \gcd(6790, 7031) = 1$, $\gcd(x_4 - x_1, n) = \gcd(6771, 7031) = 1$,
$\gcd(x_4 - x_2, n) = \gcd(6220, 7031) = 1$, $\gcd(x_4 - x_3, n) = \gcd(6628, 7031) = 1$
$x_5 = x_4^2 - 1 = 6478$
$\gcd(x_5 - x_0, n) = \gcd(6473, 7031) = 1$, $\gcd(x_5 - x_1, n) = \gcd(6454, 7031) = 1$,
$\gcd(x_5 - x_2, n) = \gcd(5903, 7031) = 1$, $\gcd(x_5 - x_3, n) = \gcd(6311, 7031) = 1$,
$\gcd(x_5 - x_4, n) = \gcd(6714, 7031) = 1$
$x_6 = x_5^2 - 1 = 3475$
$\gcd(x_6 - x_0, n) = \gcd(3470, 7031) = 1$, $\gcd(x_6 - x_1, n) = \gcd(3451, 7031) = 1$,
$\gcd(x_6 - x_2, n) = \gcd(2900, 7031) = 1$, $\gcd(x_6 - x_3, n) = \gcd(3308, 7031) = 1$,
$\gcd(x_6 - x_4, n) = \gcd(3711, 7031) = 1$, $\gcd(x_6 - x_5, n) = \gcd(4028, 7031) = 1$
$x_7 = x_6^2 - 1 = 3397$
$\gcd(x_7 - x_0, n) = \gcd(3392, 7031) = 1$, $\gcd(x_7 - x_1, n) = \gcd(3373, 7031) = 1$,
$\gcd(x_7 - x_2, n) = \gcd(2822, 7031) = 1$, $\gcd(x_7 - x_3, n) = \gcd(3230, 7031) = 1$,
$\gcd(x_7 - x_4, n) = \gcd(3633, 7031) = 1$, $\gcd(x_7 - x_5, n) = \gcd(3950, 7031) = 79$,
$\gcd(x_7 - x_6, n) = \gcd(6953, 7031) = 1$
Here all values $x_i$ and all differences are reduced modulo n.
To factor n = 7031 we can also use the second Pollard's rho method.
79 is a prime and therefore a prime factor of 7031; the other prime factor is 7031/79 = 89.
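The table above can be regenerated with a few lines of code. The sketch implements the same all-pairs variant used in the solution (every new $x_i$ is compared with all earlier $x_j$), not the cycle-finding variant; the function name and the step limit are illustrative.

```python
import math

def pollard_rho_all_pairs(n, x0, f, max_steps=100):
    """Return (factor, i, j) for the first pair with 1 < gcd(x_i - x_j, n) < n."""
    xs = [x0]
    for i in range(1, max_steps + 1):
        xs.append(f(xs[-1]) % n)               # x_i = f(x_{i-1}) mod n
        for j in range(i):
            d = math.gcd(xs[i] - xs[j], n)
            if 1 < d < n:
                return d, i, j
    return None

result = pollard_rho_all_pairs(7031, 5, lambda x: x * x - 1)
```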
Exercise 8.6
2. Prove that for each odd composite number n there are always at least two numbers
$a \in Z_n^*$ such that $a^{n-1} \equiv 1 \pmod n$.
3. Are there any numbers n for which the test (from (1)) fails for any $a \in Z_n^*$? Prove that
such numbers do not exist or give an example of such a number.
1. The method is called Fermat's primality test. It is based on Fermat's little theorem:
Let n be a prime and a any positive integer such that gcd(a, n) = 1; then $a^{n-1} \equiv 1 \pmod n$.
It says that an odd positive integer n is composite if there exists a positive integer a
such that gcd(a, n) = 1 and $a^{n-1} \not\equiv 1 \pmod n$.
3. The test fails for Carmichael numbers. Carmichael numbers are not primes, but they
satisfy $a^{n-1} \equiv 1 \pmod n$ for all values of a such that gcd(a, n) = 1. The smallest
Carmichael number is 561.
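The claim about 561 can be confirmed by listing all bases that expose a number as composite. The helper name `fermat_witnesses` is illustrative; 15 is included only as an ordinary composite number for contrast.

```python
import math

def fermat_witnesses(n):
    """Bases a coprime to n with a^(n-1) != 1 (mod n); they prove n composite."""
    return [a for a in range(1, n)
            if math.gcd(a, n) == 1 and pow(a, n - 1, n) != 1]

witnesses_561 = fermat_witnesses(561)      # 561 = 3 * 11 * 17
witnesses_15 = fermat_witnesses(15)
```

For a Carmichael number the list is empty, so the test can only succeed by accidentally picking a base that shares a factor with n.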
Exercise 8.7
In 2002 three Indian scientists published the first deterministic polynomial algorithm deciding
the primality problem. The method uses the following theorem.
Let n > 1, a be integers such that gcd(a, n) = 1. Then n is a prime if and only if
$(x + a)^n = x^n + a$ in $Z_n[x]$.
Since n is a prime,
\[ \binom{n}{k} = \frac{n!}{k!(n - k)!} = n \frac{(n - 1)!}{k!(n - k)!}. \]
Because n is a prime, for all 0 < l < n it holds gcd(l, n) = 1. Therefore gcd(k!(n − k)!, n) = 1
and n divides $\binom{n}{k}$ for 0 < k < n. Hence $(x + a)^n = x^n + a^n$ in $Z_n[x]$. Using Fermat's little
theorem, $a^n \equiv a \pmod n$, so $(x + a)^n = x^n + a$ in $Z_n[x]$.
Chapter 9
Most applications of cryptography ask for authentic data rather than secret data. A practically
very important problem is how to protect data and communication against an active
attacker.
User identification is a process at which one party (called a prover) convinces another party
(called verifier) of prover’s identity and that the prover has actually participated in the iden-
tification process. The purpose of any identification process is to preclude impersonation
(pretending to be another person).
User identification has to satisfy following conditions:
• The verifier has to accept prover’s identity if both parties are honest.
• The verifier cannot later, after successful identification, pose as a prover and identify
himself to another verifier (as the prover).
• A dishonest party that claims to be the other party has only a negligible chance to identify
himself successfully.
• If one party (verifier) gets a message from the other party (prover), then the verifier is
able to verify that the sender is indeed the prover.
• Nobody can pretend to be Alice when communicating with Bob
without Bob having a large chance to find that out.
Identification system can be based on any public key cryptosystem. The identification
goes as follows: Alice chooses a random r and sends eB (r) to Bob (eB is the encryption
algorithm for Bob). Alice identifies a communicating person as Bob, if he can send her back
r. Bob identifies a communicating person as Alice, if she can send him r.
Identification scheme can be also based on any one way function f and key k. Both Alice
and Bob share a key k and a one way function f . The identification goes as follows: Bob
sends Alice a random number or string r. Alice sends Bob P = f (k, r). If Bob gets P , then
he verifies whether P = f (k, r). If yes, he starts to believe that the communicating person is
Alice. The process can be repeated to increase the probability of correct identification.
9. U SER I DENTIFICATION , M ESSAGE A UTHENTICATION AND S ECRET S HARING
The goal of data authentication protocols is to handle the case that data are sent through
insecure channels. By creating a so-called Message Authentication Code (MAC) and sending
this MAC together with the message through an insecure channel, the receiver can verify
that the data were not changed in the channel. The price to pay is that the communicating
parties need to share a secret random key that needs to be transmitted through a very secure
channel.
The basic difference between MACs and digital signatures is that MACs are symmetric.
Anyone who is able to verify the MAC of a message is also able to generate the same MAC
and vice versa. A scheme (M, T, K) for data authentication is given by a set of possible
messages (M), a set of possible MACs (T) and a set of possible keys (K). It is required
that to each key k from K there is a single and easy to compute authentication mapping
$auth_k : \{0, 1\}^* \times M \to T$ and a single easy to compute verification mapping $ver_k : M \times T \to
\{true, false\}$. An authentication scheme should also satisfy the condition of correctness: for
each m from M and k from K it holds $ver_k(m, c) = true$ if there exists an r from $\{0, 1\}^*$
such that $c = auth_k(r, m)$; and the condition of security: for any m from M and k from
K it is computationally infeasible (without the knowledge of k) to find c from T such that
$ver_k(m, c) = true$.
Secret sharing schemes distribute a secret among several users in such a way that only pre-
defined sets of users can recover the secret.
Let t ≤ n be positive integers. An (n, t)-threshold scheme is a method of sharing a secret
S among a set P of n participants, $P = \{P_i \mid 1 \le i \le n\}$, in such a way that any t or
more participants can compute the value S, but no group of t − 1 or fewer participants can
compute S. The secret S is chosen by a dealer $D \notin P$. It is assumed that the dealer distributes the
secret to participants secretly and in such a way that no participant knows the shares of other
participants.
Initiation phase: Dealer D chooses a prime p > n and n distinct $x_i$, 1 ≤ i ≤ n, and D gives the
value $x_i$ to the user $P_i$. The values $x_i$ are public.
Share distribution phase: Suppose D wants to share a secret $S \in Z_p$ among the users. D
randomly chooses t − 1 elements $a_1, \ldots, a_{t-1}$ from $Z_p$. For 1 ≤ i ≤ n, D computes the
shares $y_i = f(x_i)$, where $f(x) = S + \sum_{j=1}^{t-1} a_j x^j \bmod p$. D gives the computed share $y_i$
to the participant $P_i$.
Secret cumulation phase: Let participants $P_{i_1}, \ldots, P_{i_t}$ want to determine the secret S. Since f(x)
has degree t − 1, f(x) has the form $f(x) = a_0 + a_1 x + \cdots + a_{t-1} x^{t-1}$, and the coefficients $a_m$
can be determined from the t equations $f(x_{i_j}) = y_{i_j}$, where all arithmetic is done modulo
p. It can be shown that the equations obtained this way are linearly independent and the
system has only one solution. In such a case we get $S = a_0$.
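The three phases can be sketched in a few lines. Instead of solving the linear system explicitly, the sketch uses Lagrange interpolation at x = 0, which yields the same $a_0$; the prime p = 101, the public values $x_i = i$, the secret 42 and the fixed seed are hypothetical choices for the illustration.

```python
import random
from itertools import combinations

def make_shares(secret, t, n, p, rng):
    """Share distribution: evaluate a random degree t-1 polynomial at 1..n."""
    coeffs = [secret] + [rng.randrange(p) for _ in range(t - 1)]
    def f(x):
        return sum(c * pow(x, j, p) for j, c in enumerate(coeffs)) % p
    return [(x, f(x)) for x in range(1, n + 1)]      # public x_i = i

def reconstruct(shares, p):
    """Secret cumulation: Lagrange interpolation of f at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % p
                den = den * (xi - xj) % p
        secret = (secret + yi * num * pow(den, -1, p)) % p
    return secret

p = 101                                              # prime p > n
shares = make_shares(42, t=3, n=10, p=p, rng=random.Random(9))
recovered = reconstruct(shares[:3], p)
```

Any three of the ten shares give the same result, while two shares are consistent with every possible secret.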
9.4 Exercises
Exercise 9.1
1. Show that the verification is correct if both Prover and Verifier follow the instructions.
2. What happens if Verifier chooses e = 0? Compute l, r and λ for this case. Does Verifier
learn something about Prover's private key?
3. Show that Verifier, who sends e = 1 as his challenge, can learn (with high probability)
one bit of Prover’s private key after a few runs of the protocol. Compute l, r and λ for
this case.
Solution 9.1.1
Exercise 9.2
Consider Shamir’s (10, 3)-secret sharing scheme over Zp where p is a large prime. There is
one cheating share holder. His goal is to give a bad share in the secret cumulation phase. The
point is that nobody knows which share holder is the cheater.
1. Describe a method to reconstruct the secret given all 10 shares and explain why it
works.
2. Determine the smallest number x of shares that are sufficient to reconstruct s. Explain.
3. Let us take any collection of fewer than x share holders. Can they obtain any informa-
tion about the secret? Explain.
1. We can compute secrets s1 , s2 , s3 for three disjoint sets of shares. We obtain at least two
same secrets si = sj = s, because the cheater’s bad share can be in only one set. If we
have three disjoint sets of three shares and s1 = s2 = s3 , we know that the cheater is
the one, not being in any set.
2. We know that x must be greater than 3; otherwise we cannot reliably reconstruct the secret.
If x = 4, then we obtain up to $\binom{4}{3} = 4$ different secrets. If all four reconstructed secrets
are the same, then there is no cheater in the group and we get the secret s. If the
reconstructed secrets are all different or do not exist, the cheater is in the group and
only one of these secrets is our secret s, and we cannot find out which one it is.
If x = 5, then we obtain $\binom{5}{3} = 10$ possible secrets. If the bad share is among the five, then the
secret s appears $\binom{4}{3} = 4$ times. The other 6 secrets are different or do not exist. Therefore,
the smallest number x of shares sufficient to reconstruct the secret s is 5.
3. If there are fewer than three share holders, they cannot obtain any information about the
secret.
Any three share holders can compute some secret, but they cannot be sure whether
they have recovered the secret s. The probability that they have found it is
$\binom{9}{3} / \binom{10}{3} = \frac{7}{10} = 70\,\%$, which is not bad.
A group of four share holders can compute up to four possible secrets. If they are
all equal, they have reconstructed the secret s. Otherwise, one of the share holders is the
cheater, and they know that only one of the computed secrets is the secret s.
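The counting argument for x = 5 can be checked with a short Python sketch. It is only an illustration: the polynomial, the modulus, the public points 1, ..., 5 and the function names are assumptions made for the example, and the secret is chosen by a simple vote over all 3-element subsets of the shares.

```python
from collections import Counter
from itertools import combinations

P = 2**31 - 1  # a prime modulus (assumption for this sketch)

def interpolate_at_zero(shares, p=P):
    """Lagrange interpolation at x = 0 for a list of (x, y) shares."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % p
                den = (den * (xi - xj)) % p
        total = (total + yi * num * pow(den, -1, p)) % p
    return total

def robust_reconstruct(shares, p=P):
    """Reconstruct from every 3-subset and return the most frequent value and its count."""
    counts = Counter(
        interpolate_at_zero(list(subset), p) for subset in combinations(shares, 3)
    )
    return counts.most_common(1)[0]

# Example polynomial f(x) = 42 + 7x + 3x^2 mod P, so the secret is 42.
def f(x):
    return (42 + 7 * x + 3 * x * x) % P

shares = [(x, f(x)) for x in range(1, 6)]
shares[2] = (3, (shares[2][1] + 1) % P)   # the cheater hands in a wrong share

secret, votes = robust_reconstruct(shares)
```

In this example the four subsets that avoid the bad share agree on the secret, while the six subsets that contain it give six other values.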
Exercise 9.3
1. Suppose all subjects share a secret key k. Sender S adds the MAC to every message he
sends using k and each receiver verifies it. Explain why this scheme is insecure.
2. Suppose sender S has a set A = {k1 , . . . , km } of m secret keys. Each receiver Ri has
some subset Ai ⊆ A of the keys. Before sending a message, S computes MAC ci of the
message for each key ki . Then S sends all MACs c1 , . . . , cm with the message. When
receiver Ri receives a message, he accepts it as authentic if and only if all MACs
corresponding to keys in Ai are valid. Which property should the sets A1, ..., An satisfy to be
resistant to the attack from (1)? Assume that the receivers cannot collude.
3. Suppose that n = 6. Show that it is sufficient for the sender to append 4 MACs to every
message to satisfy the condition derived in (2). Describe sets A1 , . . . , A6 ⊆ {k1 , . . . , k4 }.
Solution 9.3.1
1. When all receivers R1 , . . . , Rn share the same key k for verifying that the received mes-
sage was sent by S, each of them can calculate the MAC using the key k for any mes-
sage m of his choice. When this cheater broadcasts this message m and its MAC calcu-
lated using the key k of S, all receivers verify that the message was sent by S.
3. This scheme is secure because we can make 6 different sets $A_i$ such that $|A_i| = 2$ and
$A_i \subset \{k_1, k_2, k_3, k_4\}$. From the elements $k_1, k_2, k_3$ and $k_4$ we can form six different
pairs because
$$\binom{4}{2} = \frac{4!}{2!\,2!} = \frac{4 \cdot 3}{2} = 6.$$
Now, there is no receiver Ri who can make another receiver think that the sender was
someone else.
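The construction can be checked mechanically. The sketch below assumes that the condition asked for in part 2 is that no key set is contained in another one ($A_i \not\subseteq A_j$ for $i \neq j$), since a receiver can forge only the MACs whose keys he holds. The key names and the helper `can_forge` are illustrative.

```python
from itertools import combinations

keys = ["k1", "k2", "k3", "k4"]

# One 2-element key set per receiver R1, ..., R6
key_sets = [set(pair) for pair in combinations(keys, 2)]

def can_forge(attacker_keys, victim_keys):
    """The attacker fools the victim iff he holds every key the victim checks."""
    return victim_keys <= attacker_keys

# No receiver may be able to fool any other receiver.
safe = all(
    not can_forge(key_sets[j], key_sets[i])
    for i in range(len(key_sets))
    for j in range(len(key_sets))
    if i != j
)
```

With three keys only $\binom{3}{1} = 3$ pairwise incomparable sets exist, which suggests why four MACs are needed for six receivers.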
Exercise 9.4
Alice wishes to prove to Bob that she really does know the private key d corresponding to
her RSA public key (n, e). They decide to use the following protocol:
1. Prove that the protocol is zero knowledge under the assumption that Bob is honest.
2. A dishonest Bob can use this protocol to decrypt messages sent to Alice by someone
else. He only sends the message he wants to decrypt (with some salt, so that Alice does
not recognize that it is a message sent to her) to Alice; Alice decrypts it and sends it back
to Bob.
Bob can also misuse this protocol to obtain signatures. He can send a message m to Alice (not
encrypted with her public key), and Alice sends him back $m^d \bmod n$, which is Alice’s signature
of the message m. Then Bob can send any such message m pretending that the sender is Alice.
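Both misuses can be illustrated with a toy RSA key. The sketch rests on assumptions, because the protocol itself is not reproduced above: Alice is taken to answer any challenge c with $c^d \bmod n$, and the "salt" is realised as a multiplicative blinding factor r. The key (n, e, d) = (3233, 17, 2753) is far too small for real use.

```python
from math import gcd

n, e, d = 3233, 17, 2753   # toy RSA key (p = 61, q = 53); d is known only to Alice

def alice_respond(challenge):
    """Alice's step in the assumed protocol: return challenge^d mod n."""
    return pow(challenge, d, n)

# Misuse 1: Bob decrypts a ciphertext that someone else sent to Alice.
m = 65
c = pow(m, e, n)                      # intercepted ciphertext
r = 7                                 # Bob's blinding factor ("salt")
assert gcd(r, n) == 1
blinded = (c * pow(r, e, n)) % n      # looks unrelated to c
answer = alice_respond(blinded)       # equals m * r mod n
recovered = (answer * pow(r, -1, n)) % n

# Misuse 2: Bob obtains Alice's signature on a message of his choice.
doc = 1234
signature = alice_respond(doc)        # doc^d mod n
valid = pow(signature, e, n) == doc   # verifies under Alice's public key
```

Alice cannot tell a blinded ciphertext from a random challenge, which is why answering arbitrary challenges with the private key is dangerous.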
Chapter 10
A protocol is an algorithm two (or more) parties have to follow to perform a communication.
A cryptographic protocol is a protocol to achieve secure communication during some
goal-oriented cooperation.
In a bit commitment protocol Alice chooses a bit b and gets committed to b, in the sense that
Bob has no way of knowing which commitment Alice has made, and Alice has no way of
changing her commitment once she has made it (after Bob announces his guess as to what
Alice has chosen).
The basis of bit commitment protocols are bit commitment schemes. A bit commitment
scheme is a mapping f : {0, 1} × X → Y , where X and Y are finite sets. A commitment to
bit b ∈ {0, 1} is any value f (b, x) for x ∈ X. Each bit commitment protocol has two phases –
the commitment phase and the opening phase. In the commitment phase, the sender sends
a bit b he wants to commit to (in an encrypted form) to the receiver. In the opening phase,
the sender sends to the receiver information that enables the receiver to get the bit b.
Each bit commitment scheme should have three properties:
Hiding: Bob should not be able to determine the bit b from the commitment B = f (b, x).
Binding: Alice can open her commitment B by revealing x and b such that B = f (b, x), but
she should not be able to open a commitment B with both 0, 1.
Correctness: If both, the sender and the receiver, follow the protocol, then the receiver will
always learn the commitment b.
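A minimal sketch of a commitment scheme f(b, x) in Python, assuming a hash-based construction: SHA-256 stands in for f and a random 32-byte string for x. This particular choice is an illustration and not a scheme taken from the text; its hiding and binding properties rest on the usual assumptions about the hash function.

```python
import hashlib
import secrets

def commit(b):
    """Commitment phase: return B = f(b, x) and the secret opening value x."""
    x = secrets.token_bytes(32)
    B = hashlib.sha256(x + bytes([b])).hexdigest()
    return B, x

def open_commitment(B, b, x):
    """Opening phase: the receiver recomputes f(b, x) and compares it with B."""
    return hashlib.sha256(x + bytes([b])).hexdigest() == B

B, x = commit(1)                       # Alice sends B to Bob and keeps x
accepted = open_commitment(B, 1, x)    # later Alice reveals (b, x) and Bob checks
```

Opening the same B as the other bit would require a second input with the same hash value.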
The oblivious transfer problem: Design a protocol for sending messages from Alice to Bob in
such a way that Bob receives the message with probability $\frac{1}{2}$ and garbage otherwise.
Moreover, Bob knows whether he got the message or garbage, but Alice has no idea which one he
got.
The 1-out-of-2 oblivious transfer problem: Alice sends two messages to Bob in such a
way that Bob can choose which of the messages he receives (but he cannot choose both of
them), but Alice cannot learn Bob’s decision.
10. Bit Commitment Protocols and Zero Knowledge Proofs
One of the most important, and at the same time very counterintuitive, primitives for
cryptographic protocols are the so-called zero knowledge proof protocols. A zero knowledge
proof protocol allows one party, usually called the prover, to convince another party, called
the verifier, that the prover knows some facts without revealing to the verifier any information
about his knowledge. Zero knowledge proof protocols are a special type of so-called interactive
proof systems.
An interactive proof system has the property of being zero knowledge if the verifier, who
interacts with the honest prover, learns nothing from the interaction beyond the validity of
the statement being proved. There are several variants of zero knowledge, which differ in the
way "learning nothing" is specified.
In an interactive proof system, there are two parties: a prover, often called Peggy (a ran-
domized algorithm using a private random number generator), and a verifier, often called
Vic (a polynomial time randomized algorithm using a private random number generator).
Prover knows some secret, or knowledge, or a fact about a specific object, and wishes to
convince the verifier, through a communication with him, that he has this knowledge.
The interactive proof system consists of several rounds. In each round the prover and the verifier
alternately do the following: receive a message from the other party, perform a private
computation and send a message to the other party. The communication usually starts with a
challenge of the verifier and a response of the prover. At the end, the verifier either accepts or
rejects the prover’s attempts to convince him.
A zero knowledge proof of a theorem T is an interactive two party protocol, in which
prover is able to convince verifier who follows the same protocol, by the overwhelming
statistical evidence, that T is true, if T is really true, but no prover is able to convince verifier,
that T is true, if T is not true. In addition, during the interaction, the prover does not reveal
to verifier any other information, except whether T is true or not. Therefore, after verifier
gets convinced, he can only believe that T is true.
With the following protocol Peggy can convince Vic that a particular graph G, known to both
of them, is 3-colorable and that Peggy knows such a coloring, without revealing to Vic any
information about how such a coloring looks.
Peggy colors the graph G = (V, E) with three colors and then she performs with Vic $|E|^2$
times the following interaction, where $v_1, \dots, v_n$ are the vertices of V.
2. Vic chooses an edge and asks Peggy to show him the coloring of the adjacent vertices.
3. Peggy shows Vic the colors and encryption procedures corresponding to the selected
vertices.
4. Vic performs encryption to verify that vertices really have colors as shown.
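One round of such an interaction can be simulated in Python. The sketch makes several assumptions: the first step of the protocol is not reproduced above, so Peggy is assumed to permute the three colors at random and to commit to the color of every vertex; the "encryption procedures" are replaced by hash commitments with random nonces; and all names are illustrative.

```python
import hashlib
import random
import secrets

def commit(color):
    """Commit to a color (0, 1 or 2) with a fresh random nonce."""
    nonce = secrets.token_bytes(16)
    return hashlib.sha256(nonce + bytes([color])).hexdigest(), nonce

def verify(commitment, color, nonce):
    return hashlib.sha256(nonce + bytes([color])).hexdigest() == commitment

def run_round(edges, coloring, rng):
    """Return True if Vic accepts this round."""
    # Peggy: randomly permute the colors and commit to every vertex (assumed step 1).
    perm = [0, 1, 2]
    rng.shuffle(perm)
    permuted = {v: perm[c] for v, c in coloring.items()}
    commitments, nonces = {}, {}
    for v, c in permuted.items():
        commitments[v], nonces[v] = commit(c)
    # Vic: choose an edge.
    u, w = rng.choice(edges)
    # Peggy: open the two commitments; Vic: check them and compare the colors.
    ok_u = verify(commitments[u], permuted[u], nonces[u])
    ok_w = verify(commitments[w], permuted[w], nonces[w])
    return ok_u and ok_w and permuted[u] != permuted[w]

triangle = [(0, 1), (1, 2), (0, 2)]
proper = {0: 0, 1: 1, 2: 2}
rng = random.Random(0)
all_accepted = all(run_round(triangle, proper, rng) for _ in range(20))
```

Because the colors are permuted afresh in every round, the two opened colors show Vic only that they differ.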
10.5 Exercises
Exercise 10.1
There is a cryptographic conference in Monaco. The best student of a cryptographic course
will be allowed to participate. Keiko and Hiroki are students with the maximum number of
points from exercises. Unfortunately, only one of them is allowed to participate so they have
to decide which one. Hiroki is now abroad, therefore Keiko suggests the following protocol
that allows them to remotely flip a coin.
• Keiko chooses either x = "HEAD" or x = "TAIL" and picks a random number k. She
encrypts x with the DES cipher using the key k. She obtains $y = \mathrm{DES}_k(x)$.
• Keiko reveals k.
• Hiroki decrypts y with DES using the key k and obtains the guess of Keiko. If Keiko’s
guess is correct, she travels to Monte Carlo.
Is Keiko able to cheat?
Solution 10.1.1
Keiko would be able to cheat only if she knew keys $k_1, k_2$ such that $\mathrm{DES}_{k_1}(\mathrm{HEAD}) =
y = \mathrm{DES}_{k_2}(\mathrm{TAIL})$. To find such keys, she can build two lists of pairs
$(\mathrm{DES}_{k_1}(\mathrm{HEAD}), k_1)$ and $(\mathrm{DES}_{k_2}(\mathrm{TAIL}), k_2)$. Both lists are sorted according to the
first field of each entry. Keiko then looks for collisions between the two lists and obtains
keys $k_1, k_2$ such that $\mathrm{DES}_{k_1}(\mathrm{HEAD}) = \mathrm{DES}_{k_2}(\mathrm{TAIL})$.
Then, after she sends y to Hiroki, she learns which face of the coin is up. She can then send
back to Hiroki the key $k_i$ for which $\mathrm{DES}_{k_i}(x) = y$, where x is that face.
Because of the computational complexity, it is not easy to find such $k_1$ and $k_2$; by the
birthday paradox, we need to perform about $2^{32}$ DES evaluations to get one collision.
Therefore, we can say that Keiko is not able to cheat.
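The search for colliding keys can be sketched on a toy scale. The function `toy_encrypt` below is not DES: it is a hypothetical 16-bit keyed function built from SHA-256, used only so that the search finishes instantly, and a dictionary lookup replaces the sorted lists described above.

```python
import hashlib

def toy_encrypt(key, message):
    """Stand-in for DES_k(x) with a 16-bit output. NOT DES, only an illustration."""
    data = key.to_bytes(2, "big") + message
    return hashlib.sha256(data).digest()[:2]

def find_colliding_keys(num_keys=4096):
    """Look for k1, k2 with toy_encrypt(k1, HEAD) == toy_encrypt(k2, TAIL)."""
    # First list: ciphertext of HEAD -> key k1
    table = {toy_encrypt(k1, b"HEAD"): k1 for k1 in range(num_keys)}
    # Second list: scan ciphertexts of TAIL and look for a match
    for k2 in range(num_keys):
        y = toy_encrypt(k2, b"TAIL")
        if y in table:
            return table[y], k2, y
    return None

found = find_colliding_keys()
```

For a 16-bit output a few thousand keys per list are enough; for the 64-bit block of DES the same argument leads to lists of about $2^{32}$ entries.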
Exercise 10.2
Let p be a large prime. Let g be a generator of the group $(\mathbb{Z}_p^*, \cdot)$. Discuss the security of the
following commitment scheme.
• To commit to $m \in \{0, 1, \dots, p-1\}$, Alice randomly picks $r \in \{0, 1, \dots, p-1\}$ and sends
$c = g^r m \pmod{p}$ to Bob.
1. The protocol is hiding if m > 0, because when Bob gets c, he knows only that $c = g^k \bmod p$
for some k. But he does not know the two elements $l_1$ and $l_2$ such that $l_1 + l_2 = k$. So he is not
able to learn anything about m.
There is only one exception: if m = 0 then c = 0 irrespective of the r that Alice chooses (g
is a generator of the group $(\mathbb{Z}_p^*, \cdot)$ and there is no s such that $g^s = 0$ because $0 \notin \mathbb{Z}_p^*$). And
because m = 0 is allowed, the protocol is not hiding.
2. The protocol is not binding because Alice can choose two distinct $r_1, r_2 \in \{0, \dots, p-1\}$
and commit to $m = g^{r_2}$. Then she sends to Bob $c = g^{r_1} m = g^{r_1} g^{r_2} = m' g^{r_2}$, where
$m' = g^{r_1}$. That means that Alice can open her commitment with m and $r_1$ or with $m'$ and $r_2$,
and it is up to her which of the two pairs she sends to Bob.
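Both weaknesses can be verified numerically with small parameters. The values p = 23 and g = 5 (a generator of $\mathbb{Z}_{23}^*$) and the exponents are chosen only for this example.

```python
p, g = 23, 5   # small example parameters; 5 generates the multiplicative group mod 23

def commit(m, r):
    """c = g^r * m mod p"""
    return (pow(g, r, p) * m) % p

# Not binding: the same commitment opens as (m, r1) and as (m_alt, r2).
r1, r2 = 3, 7
m = pow(g, r2, p)        # m  = g^r2
m_alt = pow(g, r1, p)    # m' = g^r1
c = commit(m, r1)
same_commitment = c == commit(m_alt, r2)

# Not hiding for m = 0: the commitment is 0 for every r.
zero_leaks = all(commit(0, r) == 0 for r in range(p))
```

Here m = 17 and m' = 10 are different messages with the same commitment.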
Exercise 10.3
Consider the following implementation of 1-out-of-2 oblivious transfer which uses standard
oblivious transfer as the underlying primitive:
• Bob wants to learn Alice’s bit $b_s$. He randomly chooses a subset $I_s \subseteq I$ of size n and
$I_{1-s} \subseteq \{1, \dots, m\} \setminus I$, also of size n. He sends $I_0, I_1$ in this order to Alice.
• Alice checks that $I_0$ and $I_1$ are of the correct form. She computes $c_i = b_i \oplus \bigoplus_{j \in I_i} r_j$,
where $i \in \{0, 1\}$, and sends $c_0, c_1$ (in this order) to Bob.
• Bob computes $b_s = c_s \oplus \bigoplus_{j \in I_s} r_j$.
6. Why does Alice need to check correctness of I0 and I1 in the third step?
Solution 10.3.1
1. The protocol fails if Bob learns fewer than n bits of R, because then he cannot construct
the set $I_s$. If Bob learns more than 2n bits of R, then the protocol also fails because Bob
is not able to construct the correct set $I_{1-s}$.
3. A cheating Bob can learn both values $b_s$ and $b_{1-s}$ only if he knows at least
2n bits of R. When he knows fewer than 2n bits of R, he cannot learn anything about
$b_{1-s}$.
4. She cannot learn anything about s because the only information she gets from Bob is
the two distinct sets $I_0$ and $I_1$. The information about the set I is hidden from her and so
she does not know which set $I_s$ is the subset of I.
5. When Bob knows at least 2n bits of R, he constructs sets $I_0$ and
$I_1$ such that $I_0, I_1 \subseteq I$ and $I_0 \cap I_1 = \emptyset$. Because he knows all the values $r_i$ where $i \in I_0 \cup I_1 \subseteq I$,
he can compute both values $b_0$ and $b_1$.
6. Alice needs to check the correctness of $I_0$ and $I_1$ because if $|I_s| < n$ then the probability
that $I_0 \cup I_1 \subseteq I$ increases, and so does the insecurity. If $I_0 \cap I_1 = A \neq \emptyset$ then the probability that
$I_0 \cup I_1 \subseteq I$ also increases.
7. If m = 2n then the security of the protocol increases but the probability that Bob learns
at least n bits of R decreases. However, if Bob is lucky and learns more than n bits of R,
he cannot follow the protocol correctly, because he is unable to construct correct sets $I_0$
and $I_1$.
If m = 5n then Bob learns approximately $\frac{m}{2} = 2.5n$ bits of R, and so he can learn both
$b_0$ and $b_1$ with higher probability and hence the protocol is less secure.
Exercise 10.4
Consider the zero knowledge proof protocol for 3-colorability of graphs that was described
in the section 10.4.
1. Suppose Peggy does not know a 3-coloring of a 3-colorable graph G = (V, E), where
|V| = n and |E| = m. What is the maximal probability that Peggy makes Vic accept her
proof in a single iteration of the protocol? Explain.
2. Suppose Peggy is honest but her random number generator is faulty. The identity
permutation is chosen with probability $\frac{1}{2}$ and each of the other permutations is chosen
with probability $\frac{1}{10}$. Explain how a cheating Vic can discover a 3-coloring of G with high
probability after sufficiently many iterations of the protocol.
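1. Peggy does not know a 3-coloring of the graph G, but according to the protocol, she must
commit to a permutation of her coloring to Vic. Once committed, Peggy cannot change
the coloring. Vic then chooses a random edge and asks Peggy to reveal the coloring
of its adjacent vertices. Peggy cannot lie and Vic can check whether the vertices have
different colors.
If Peggy, not knowing a coloring, colors the graph randomly, the probability that both
vertices have the same color, for any given pair of vertices, is $\frac{1}{3}$.
Hence, after one iteration of the protocol, the probability of Vic accepting the proof is
$\frac{2}{3}$. After k iterations of the protocol, the probability would be $(\frac{2}{3})^k$.
A random coloring need not be her best strategy, however. Any coloring that is not a proper
3-coloring has at least one edge whose vertices share a color, so Vic rejects with probability
at least $\frac{1}{m}$. Peggy's success probability in a single iteration is therefore at most
$\frac{m-1}{m}$, and this value is reached if she finds a coloring with exactly one such edge.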
2. With the broken generator, Vic will be able to determine the coloring of an arbitrary
pair of adjacent vertices. In every iteration of the protocol, he just has to choose the
same edge (the one connecting the aforementioned vertices) until a sufficient number
of colorings is retrieved. Then, statistically, about half of the colorings will be the same.
This dominant coloring is Peggy’s original coloring. Thus, Vic retrieves the coloring
of the two vertices.
Now, Vic merely has to repeat this procedure for further edges until the coloring of all
vertices is revealed.
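Vic's strategy for a single edge can be simulated as follows. The sketch models only the faulty generator and the repeated query. Colors are encoded as 0, 1, 2, and the function names and the number of rounds are illustrative assumptions.

```python
import random
from collections import Counter
from itertools import permutations

IDENTITY = (0, 1, 2)
OTHERS = [perm for perm in permutations(range(3)) if perm != IDENTITY]

def faulty_permutation(rng):
    """Identity with probability 1/2, each of the 5 other permutations with 1/10."""
    if rng.random() < 0.5:
        return IDENTITY
    return rng.choice(OTHERS)

def recover_edge_colors(true_colors, rounds, rng):
    """Vic asks for the same edge in every round and keeps the most frequent answer."""
    a, b = true_colors
    seen = Counter()
    for _ in range(rounds):
        perm = faulty_permutation(rng)
        seen[(perm[a], perm[b])] += 1
    return seen.most_common(1)[0][0]

rng = random.Random(1)
guess = recover_edge_colors((2, 0), rounds=500, rng=rng)
```

Each of the six permutations maps a pair of distinct colors to a different ordered pair, so the original pair is seen in about half of the rounds and each other pair in about a tenth.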
Bibliography
[1] Baignères, Thomas, et al.: A classical introduction to cryptography exercise book. New
York : Springer, 2006. ISBN 0387279342.
[2] Kahn, David A.: The codebreakers : the comprehensive history of secret communica-
tion from ancient times to the Internet : the story of secret writing. New York : Scribner,
1996. ISBN 0684831309.
[4] Stinson, Douglas R.: Cryptography : theory and practice. Boca Raton : CRC Press, 2002.
ISBN 1584882069.
[5] Introduction to ECC [Online]. Certicom Inc., 2007 [cited 2007 May 12]. Available from
<http://www.certicom.com/index.php?action=ecc,about_ecc>.
[6] QuickMath : Automatic math solutions [Online]. 1999–2007 [cited 2007 May 12]. Avail-
able from <http://www.quickmath.com/>.
[8] Wikipedia contributors: Linear code [Online]. Wikipedia, The Free Encyclopedia; last
revision 19 April 2007 19:39 UTC [cited 2007 May 12]. Available from
<http://en.wikipedia.org/w/index.php?title=Linear_code&oldid=124164759>.
[9] Wikipedia contributors: Hamming code [Online]. Wikipedia, The Free Encyclopedia;
last revision 12 May 2007 16:46 UTC [cited 2007 May 12]. Available from
<http://en.wikipedia.org/w/index.php?title=Hamming_code&oldid=130352788>.
[11] Wikipedia contributors: Digital signature [Online]. Wikipedia, The Free Encyclopedia;
last revision 9 May 2007 01:12 UTC [cited 2007 May 12]. Available from
<http://en.wikipedia.org/w/index.php?title=Digital_signature&oldid=129400670>.
[12] Wikipedia contributors: Secret sharing [Online]. Wikipedia, The Free Encyclopedia;
last revision 3 May 2007 08:20 UTC [cited 2007 May 12]. Available from
<http://en.wikipedia.org/w/index.php?title=Secret_sharing&oldid=127905516>.
[13] Wikipedia contributors: Zero-knowledge proof [Online]. Wikipedia, The Free Encyclopedia;
last revision 27 April 2007 19:25 UTC [cited 2007 May 12]. Available from
<http://en.wikipedia.org/w/index.php?title=Zero-knowledge_proof&oldid=126457594>.