2 Classical Encryption Technique
Terminology:
Plaintext – original message
Ciphertext – coded message
Enciphering, encryption – process of converting plaintext to ciphertext
Deciphering, decryption – restoring the plaintext from the ciphertext
Cryptography – the study of schemes for enciphering
Cryptographic system, cipher – a scheme for enciphering
Cryptanalysis – techniques for deciphering a message without
knowledge of the enciphering details
Cryptology – the combined areas of cryptography and cryptanalysis
OUTLINE
1. SYMMETRIC CIPHER MODEL
2. SUBSTITUTION TECHNIQUES
3. TRANSPOSITION TECHNIQUES
4. ROTOR MACHINES
5. STEGANOGRAPHY
CRYPTOGRAPHY
Cryptographic systems (cryptosystems) are characterized by:
1. The type of operations used for transforming plaintext to ciphertext
(substitution, transposition). The fundamental requirement is that no
information be lost, i.e., that all operations be reversible.
2. The number of keys used (1 key – symmetric, single-key, secret-key;
2 keys – asymmetric, two-key, public-key).
3. The way in which the plaintext is processed (block cipher, stream
cipher). A stream cipher may be viewed as a block cipher with a block size
of 1 element.
possession of a number of ciphertexts together with the plaintext that
produced each ciphertext
2. Sender and receiver must have obtained copies of the secret key in
a secure fashion and must keep the key secure. If someone discovers the
key and knows the algorithm, all communication using this key is readable.
We assume that it is impractical to decrypt a message on the basis of
the ciphertext plus knowledge of the encryption/decryption algorithm, i.e.,
we do not need to keep the algorithm secret; we need to keep only the key
secret.
Let’s consider the essential elements of a symmetric encryption
scheme:
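We can write:
Y = E_K(X)
X = D_K(Y)
where X is the plaintext, Y is the ciphertext, K is the secret key, and E
and D are the encryption and decryption algorithms.
The opponent knows Y, E and D. He may be interested in recovering X, K, or
both. Knowledge of K also allows him to read future messages.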
CRYPTANALYSIS
There are two general approaches to attacking a conventional
encryption scheme:
1. Cryptanalysis: attempts to use characteristics of the plaintext, or
even some known plaintext-ciphertext pairs, to deduce a specific plaintext
or the key being used.
2. Brute-force attack: every possible key is tried until an intelligible
translation into plaintext is obtained. On average, half of all possible keys
must be tried to achieve success.
As an illustration, consider substitution codes that use a 26-character
key, in which all possible permutations of the 26 characters serve as keys.
Assume that it takes 1 μs to perform a single decryption or encryption (or,
for a much faster machine, that 10^6 decryptions are performed per 1 μs).
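The search time implied by these assumptions can be estimated with a few lines of arithmetic. The sketch below is only an illustration: the function name and the 365-day year are choices made here, and it assumes that on average half of the key space is searched.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: a 365-day year

def average_brute_force_years(key_count, decryptions_per_microsecond=1):
    """Average search time in years, assuming half the key space is tried."""
    microseconds = (key_count / 2) / decryptions_per_microsecond
    return microseconds / 1e6 / SECONDS_PER_YEAR

keys = math.factorial(26)                        # all permutations of 26 characters
slow = average_brute_force_years(keys)           # 1 decryption per microsecond
fast = average_brute_force_years(keys, 10**6)    # 10^6 decryptions per microsecond
print(f"{keys:.2e} keys: about {slow:.1e} years, or {fast:.1e} years at the faster rate")
```

Even at the faster rate the average search takes millions of years, which is why a large key space defeats a pure brute-force attack.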
SUBSTITUTION TECHNIQUE
A substitution technique is one in which the letters of plaintext are
replaced by other letters or by numbers. If the plaintext is viewed as a
sequence of bits, then substitution involves replacing plaintext bit patterns
with ciphertext bit patterns
CAESAR CIPHER
It was used by Julius Caesar. The Caesar cipher replaces each letter of
the alphabet with the letter standing three places further down the
alphabet, wrapping around so that the letter following z is a.
For example:
Plain:  meet me after the toga party
Cipher: PHHW PH DIWHU WKH WRJD SDUWB
The transformation uses the following mapping:
Plain:  a b c d e f g h i j k l m n o p q r s t u v w x y z
Cipher: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
Let us assign a numerical equivalent to each letter from 0 to 25. Then
the algorithm may be expressed as follows. For each plaintext letter p,
substitute the ciphertext letter C:
C=E(p)=(p+3) mod 26
A shift may be of any amount, so that the general Caesar algorithm is
C=E(p)=(p+k) mod 26,
where k takes on a value in the range 1 to 25. The decryption algorithm is
simply
p=D(C)=(C-k) mod 26
If it is known that a given ciphertext is a Caesar cipher, then a
brute-force cryptanalysis is easily performed: simply try all 25 possible keys.
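The formulas above translate directly into code. The following is a minimal sketch; the function names are chosen here for illustration, non-letters are passed through unchanged, and ciphertext is written in upper case to follow the convention used in the examples.

```python
import string

ALPHABET = string.ascii_lowercase

def caesar_encrypt(plaintext, k):
    """C = (p + k) mod 26 for every letter; other characters are kept."""
    out = []
    for ch in plaintext.lower():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + k) % 26].upper())
        else:
            out.append(ch)
    return "".join(out)

def caesar_decrypt(ciphertext, k):
    """p = (C - k) mod 26 for every letter; other characters are kept."""
    out = []
    for ch in ciphertext.lower():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) - k) % 26])
        else:
            out.append(ch)
    return "".join(out)

def brute_force(ciphertext):
    """Try all 25 keys and return every candidate plaintext."""
    return {k: caesar_decrypt(ciphertext, k) for k in range(1, 26)}

cipher = caesar_encrypt("meet me after the toga party", 3)
print(cipher)
for k, candidate in brute_force(cipher).items():
    print(k, candidate)
```

Only the candidate for k = 3 reads as English, so a human (or a simple dictionary check) can pick the key out of the 25 lines.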
Three important characteristics of this problem enable us to use brute-
force cryptanalysis:
1. The encryption and decryption algorithms are known
2. There are only 25 keys to try
3. The language of the plaintext is known and easily recognizable
In most networking situations the algorithms are assumed to be known.
Brute-force analysis becomes impractical when the algorithm uses a large
key space. The third characteristic is also significant: if the language of
the plaintext is not known, then the plaintext output may not be recognizable.
Furthermore, if the input has been compressed in some manner (for example,
with ZIP), recognition is again difficult.
MONOALPHABETIC CIPHERS
With only 25 keys, the Caesar cipher is far from secure. A dramatic
increase in the key space can be achieved by allowing an arbitrary
substitution. If, instead of
Plain:  a b c d e f g h i j k l m n o p q r s t u v w x y z
Cipher: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
the cipher line can be any permutation of the 26 alphabetic symbols, then
there are 26!, or more than 4 × 10^26, possible keys. There is, however,
another line of attack. If the cryptanalyst knows the nature of the
plaintext (e.g., non-compressed English text), then the analyst can exploit
the regularities of the language.
Let’s consider an example ciphertext:
UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ
VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX
EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ
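A first step in exploiting the regularities of the language is to count how often each ciphertext letter occurs and compare the result with typical English letter frequencies (e and t being the most common letters). The sketch below does the counting for the ciphertext above; the function name is chosen here for illustration.

```python
from collections import Counter

CIPHERTEXT = (
    "UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ"
    "VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX"
    "EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ"
)

def relative_frequencies(text):
    """Return (letter, percentage) pairs, most frequent letter first."""
    counts = Counter(text)
    total = len(text)
    return [(letter, 100 * n / total) for letter, n in counts.most_common()]

freqs = relative_frequencies(CIPHERTEXT)
for letter, pct in freqs:
    print(f"{letter} {pct:5.2f}%")
```

P and Z come out as the two most frequent ciphertext letters, which makes them natural first guesses for plaintext e and t.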
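With the tentative assignments P → e, Z → t, S → a and W → h, the partial
decryption is as follows (a dot marks a letter not yet identified):
UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ
.t.a........e..e.te..a.that.e.e.a.......a...t
VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX
...e.t...ta.t.ha.e.ee..a.e..th....t..a.
EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ
.e..e.e.tat..e...the..et............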
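Continued analysis of frequencies plus trial and error may lead us to
the solution:
it was disclosed yesterday that several informal but direct contacts have
been made with political representatives of the viet cong in moscow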
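A candidate substitution key can be checked mechanically by applying it to the whole ciphertext. In the sketch below the mapping KEY was worked out by hand for this illustration (it is not given in the original slides), and ciphertext letters without an assignment would be shown as "?".

```python
CIPHERTEXT = (
    "UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ"
    "VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX"
    "EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ"
)

# Hypothetical recovered key: ciphertext letter -> plaintext letter.
KEY = dict(zip("UZQSOVHXMPGEWFDBTAIYJ",
               "itwasdcloeyrhvnfmbupg"))

def substitute(ciphertext, key, unknown="?"):
    """Replace each ciphertext letter using key; mark unmapped letters."""
    return "".join(key.get(c, unknown) for c in ciphertext)

plaintext = substitute(CIPHERTEXT, KEY)
print(plaintext)
```

The same function also reproduces a partial decryption: calling it with only a few assignments, such as `{"P": "e", "Z": "t"}`, and `unknown="."` shows which positions are still open.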
VIGENERE CIPHER
The best known, and one of the simplest, polyalphabetic substitution
ciphers is the Vigenère cipher. The Vigenère cipher is a method of
encryption invented by Giovan Battista Bellaso and described in his 1553
book La cifra del. Sig. Giovan Battista Bellaso. It was misattributed to
Blaise de Vigenère in the 19th century and given his name
( http://en.wikipedia.org/wiki/Vigen%C3%A8re_cipher ).
In this scheme, the set of related monoalphabetic substitution rules
consists of the 26 Caesar ciphers, with shifts from 0 to 25. Each cipher is
denoted by a key letter, which is the ciphertext letter that substitutes for the
plaintext letter a. Thus, a Caesar cipher with a shift of 3 is denoted by the
key value d.
A matrix known as the Vigenère tableau is used:
a b c d e f g h i j k l m n o p q r s t u v w x y z
a A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
b B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
c C D E F G H I J K L M N O P Q R S T U V W X Y Z A B
d D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
e E F G H I J K L M N O P Q R S T U V W X Y Z A B C D
f F G H I J K L M N O P Q R S T U V W X Y Z A B C D E
g G H I J K L M N O P Q R S T U V W X Y Z A B C D E F
h H I J K L M N O P Q R S T U V W X Y Z A B C D E F G
i I J K L M N O P Q R S T U V W X Y Z A B C D E F G H
j J K L M N O P Q R S T U V W X Y Z A B C D E F G H I
k K L M N O P Q R S T U V W X Y Z A B C D E F G H I J
l L M N O P Q R S T U V W X Y Z A B C D E F G H I J K
m M N O P Q R S T U V W X Y Z A B C D E F G H I J K L
n N O P Q R S T U V W X Y Z A B C D E F G H I J K L M
o O P Q R S T U V W X Y Z A B C D E F G H I J K L M N
p P Q R S T U V W X Y Z A B C D E F G H I J K L M N O
q Q R S T U V W X Y Z A B C D E F G H I J K L M N O P
r R S T U V W X Y Z A B C D E F G H I J K L M N O P Q
s S T U V W X Y Z A B C D E F G H I J K L M N O P Q R
t T U V W X Y Z A B C D E F G H I J K L M N O P Q R S
u U V W X Y Z A B C D E F G H I J K L M N O P Q R S T
v V W X Y Z A B C D E F G H I J K L M N O P Q R S T U
w W X Y Z A B C D E F G H I J K L M N O P Q R S T U V
x X Y Z A B C D E F G H I J K L M N O P Q R S T U V W
y Y Z A B C D E F G H I J K L M N O P Q R S T U V W X
z Z A B C D E F G H I J K L M N O P Q R S T U V W X Y
Each of the 26 ciphers is laid out horizontally, with the key letter for
each cipher to the left. The encryption process:
Given a key letter x and a plaintext letter y, the ciphertext letter is at
the intersection of the row labeled x and the column labeled y; in this case
the ciphertext letter is V. To encrypt a message, a key is needed that is as
long as the message. Usually, a key is a repeating keyword. For example, if
the keyword is deceptive, the message “we are discovered save yourself” is
encrypted as follows:
Key:        deceptivedeceptivedeceptive
Plaintext:  wearediscoveredsaveyourself
Ciphertext: ZICVTWQNGRZGVTWAVZHCQYGLMGJ
Decryption is equally simple. The key letter identifies the row. The
position of ciphertext letter in that row determines the column, and the
plaintext is at the top of that column.
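Looking up a row and a column in the tableau is equivalent to adding the key letter to the plaintext letter modulo 26, and decryption subtracts it again. The sketch below uses that arithmetic form; the function names are chosen here for illustration, and it assumes the text contains only the letters a to z plus characters that are simply dropped.

```python
from itertools import cycle

A = ord("a")

def vigenere_encrypt(plaintext, keyword):
    """Add the repeating key letter to each plaintext letter, mod 26."""
    letters = [c for c in plaintext.lower() if c.isalpha()]
    return "".join(
        chr((ord(p) - A + ord(k) - A) % 26 + A).upper()
        for p, k in zip(letters, cycle(keyword.lower()))
    )

def vigenere_decrypt(ciphertext, keyword):
    """Subtract the repeating key letter from each ciphertext letter, mod 26."""
    return "".join(
        chr((ord(c) - ord(k)) % 26 + A)
        for c, k in zip(ciphertext.lower(), cycle(keyword.lower()))
    )

cipher = vigenere_encrypt("we are discovered save yourself", "deceptive")
print(cipher)
print(vigenere_decrypt(cipher, "deceptive"))
```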
In spite of using multiple alphabets, some frequency information is
preserved in Vigenère ciphertext.
Let’s sketch a method of breaking this cipher.
Suppose that the opponent believes that the ciphertext was encrypted
using either monoalphabetic substitution or a Vigenère cipher. A simple test
can distinguish the two. If a monoalphabetic substitution is used, then the
statistical properties of the ciphertext should be the same as those of the
language of the plaintext. If, on the other hand, a Vigenère cipher is
suspected, then progress depends on determining the length of the keyword.
How can the keyword length be determined? If two identical sequences of
plaintext letters occur at a distance that is an integer multiple of the
keyword length, they will generate identical ciphertext sequences. In our
example, two instances of the sequence “red” are separated by 9 character
positions. Consequently, in both cases, r is encrypted using key letter e,
e is encrypted using key letter p, and d is encrypted using key letter t.
Thus, in both cases the ciphertext is VTW. The analyst may assume that the
keyword length is either 3 or 9. Given long enough messages, the
cryptanalyst can determine the keyword length with confidence by finding
the common factor of the displacements of all such repeated sequences.
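The search for repeated ciphertext sequences is easy to automate. The sketch below is one possible way to do it, with function names chosen here for illustration: it records where every three-letter sequence occurs and takes the greatest common divisor of the distances between repeats. It assumes the repeats are caused by the keyword and not by coincidence, which is only reliable for longer messages.

```python
from collections import defaultdict
from functools import reduce
from math import gcd

def repeated_sequences(ciphertext, length=3):
    """Map each sequence that occurs more than once to its start positions."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    return {seq: pos for seq, pos in positions.items() if len(pos) > 1}

def common_factor_of_distances(ciphertext, length=3):
    """GCD of the distances between repeats; the keyword length divides it."""
    distances = [
        later - pos[0]
        for pos in repeated_sequences(ciphertext, length).values()
        for later in pos[1:]
    ]
    return reduce(gcd, distances) if distances else None

cipher = "ZICVTWQNGRZGVTWAVZHCQYGLMGJ"
print(repeated_sequences(cipher))
print(common_factor_of_distances(cipher))
```

For this short ciphertext the only repeat is VTW, at a distance of 9, so the keyword length is a divisor of 9 greater than 1, that is 3 or 9.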
If the keyword length is N, then the cipher consists of N monoalphabetic
substitution ciphers. For example, with the keyword DECEPTIVE, the
letters in positions 1, 10, 19, and so on, are all encrypted with the same
monoalphabetic cipher. Thus, we can use the known frequency
characteristics of the plaintext language to attack each of the monoalphabetic
ciphers separately.
The periodic nature of the keyword can be eliminated by using a
non-repeating keyword that is as long as the message. If that key is itself
natural-language text (for example, a short keyword followed by the
plaintext itself), the scheme is still vulnerable to cryptanalysis: because
the key and the plaintext share the same frequency distribution of letters,
a statistical technique can be applied.
The ultimate defense against such cryptanalysis is to choose a
keyword that is as long as the plaintext and has no statistical
relationship to it. Such a system was introduced by an AT&T engineer,
Gilbert Vernam, in 1918. His system works on binary data rather than letters.
The system can be expressed succinctly as follows:
c_i = p_i ⊕ k_i,
where c_i is the i-th binary digit of the ciphertext, p_i is the i-th binary
digit of the plaintext, k_i is the i-th binary digit of the key, and ⊕ is
the exclusive-or (XOR) operation.
Decryption is performed by
p_i = c_i ⊕ k_i
The keyword here is long but repeating, so the cipher can be broken with
the use of known plaintext sequences.
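Because XOR is its own inverse, one function serves for both encryption and decryption. The sketch below works on whole bytes instead of single bits, which is equivalent; the function name and the short 5-byte key are arbitrary choices made here so that the effect of a repeating key is easy to see.

```python
from itertools import cycle

def vernam(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key, repeating the key as needed."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

plaintext = b"attack at dawn"
key = bytes([0x5A, 0xC3, 0x17, 0x88, 0x2E])   # illustrative repeating key

ciphertext = vernam(plaintext, key)
recovered = vernam(ciphertext, key)

# Known-plaintext weakness: ciphertext XOR plaintext exposes the key stream.
leaked_key = vernam(ciphertext, plaintext)[:len(key)]
print(recovered, leaked_key == key)
```

Once an opponent has recovered one period of a repeating key in this way, every other message encrypted with the same key can be read.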
ONE-TIME PAD
A US Army Signal Corps Captain, Joseph Mauborgne, proposed in 1918
an improvement (http://en.wikipedia.org/wiki/One-time_pad ) to the
Vernam cipher that yields the ultimate in security. He suggested using a
random key that is truly as long as the message, with no repetitions.
Such a scheme, known as a one-time pad, is unbreakable. It produces random
output that bears no statistical relationship to the plaintext. Because the
ciphertext contains no information whatsoever about the plaintext, there is
no way to break the code.
In practice, however, the one-time pad has 2 fundamental difficulties:
- Making large quantities of random keys is a practical problem. Any
heavily used system might require millions of random characters on a
regular basis. Supplying truly random characters in this volume is a
significant task.
- Key distribution and protection. For every message to be sent, a key
of equal length is needed by both sender and receiver. Thus, a key
distribution problem exists for every message.
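The claim that the ciphertext carries no information about the plaintext can be made concrete: for any other message of the same length there is some pad that "decrypts" the ciphertext to it. The sketch below illustrates this under stated assumptions: the helper name and the decoy message are invented here, and Python's `secrets.token_bytes` stands in for a source of truly random key material.

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("pad must be exactly as long as the message")
    return bytes(x ^ y for x, y in zip(a, b))

message = b"meet me after the toga party"
pad = secrets.token_bytes(len(message))      # used once, then discarded
ciphertext = xor_bytes(message, pad)

# A different message of the same length is explained by a different pad.
decoy = b"stay home, the party is off!"
decoy_pad = xor_bytes(ciphertext, decoy)

print(xor_bytes(ciphertext, pad))
print(xor_bytes(ciphertext, decoy_pad))
```

Since both pads are equally likely, an opponent who sees only the ciphertext has no basis for preferring one plaintext over the other.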
TRANSPOSITION TECHNIQUE
Another approach to enciphering is to apply a transposition, or
permutation, to the plaintext letters. The simplest such cipher is the rail
fence technique, in which the plaintext is written down as a sequence of
diagonals and then read off as a sequence of rows. For example, to
encipher the message “meet me after the toga party” with a rail fence of
depth 2, we write
m e m a t r h t g p r y
 e t e f e t e o a a t
The encrypted message is
MEMATRHTGPRYETEFETEOAAT
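For depth 2 the two rows are simply the letters at even and odd positions, so the rail fence can be sketched with slicing. The function names are chosen here for illustration, and spaces are dropped before enciphering, as in the example.

```python
def rail_fence_encrypt(plaintext):
    """Depth-2 rail fence: even-position letters, then odd-position letters."""
    letters = [c for c in plaintext.lower() if c.isalpha()]
    return "".join(letters[0::2] + letters[1::2]).upper()

def rail_fence_decrypt(ciphertext):
    """Interleave the top and bottom rows again."""
    n = len(ciphertext)
    top, bottom = ciphertext[:(n + 1) // 2], ciphertext[(n + 1) // 2:]
    out = []
    for i in range(n):
        out.append(top[i // 2] if i % 2 == 0 else bottom[i // 2])
    return "".join(out).lower()

cipher = rail_fence_encrypt("meet me after the toga party")
print(cipher)
print(rail_fence_decrypt(cipher))
```

The output has 23 letters, one for each letter of the plaintext; a transposition only reorders letters and never adds or removes any.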
A more complex scheme is to write the message in a rectangle, row by
row, and read the message off column by column, but in a permuted order
of columns. The order of the columns then becomes the key to the algorithm:
the column labeled 1 is read first, then the column labeled 2, and so on.
For example,
Key:        4 3 1 2 5 6 7
Plaintext:  a t t a c k p
            o s t p o n e
            d u n t i l t
            w o a m x y z
Ciphertext: TTNAAPTMTSUOAODWCOIXKNLYPETZ
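The column-reading step can be sketched as follows. The function name is chosen here for illustration, and the key is passed as a list of column labels in which the column labeled 1 is read first.

```python
def columnar_encrypt(plaintext, key):
    """Write the letters row by row, then read columns in key order."""
    letters = [c for c in plaintext.lower() if c.isalpha()]
    width = len(key)
    rows = [letters[i:i + width] for i in range(0, len(letters), width)]
    # Column indices sorted by their key label: label 1 first, then 2, ...
    order = sorted(range(width), key=lambda col: key[col])
    return "".join(
        row[col] for col in order for row in rows if col < len(row)
    ).upper()

cipher = columnar_encrypt("attack postponed until two am xyz",
                          [4, 3, 1, 2, 5, 6, 7])
print(cipher)
```

A second stage of transposition can be obtained by feeding the output back into the same function.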
A pure transposition cipher is easily recognized because it has the
same letter frequencies as the original plaintext.
The transposition cipher can be made more secure by performing
more than one stage of transposition.
HILL CIPHER
It was developed by the mathematician Lester Hill in 1929. The
encryption algorithm takes m successive plaintext letters and substitutes for
them m ciphertext letters. The substitution is determined by m linear
equations in which each character is assigned a numerical value:
a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z
0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
In matrix form, C = KP mod 26, where P and C are column vectors of length m
and K is the m × m key matrix. For example, with m = 3 and
    | 17 17  5 |
K = | 21 18 21 |
    |  2  2 19 |
the first 3 letters of the plaintext (p, a, y) are represented by the
vector (15 0 24). Then K(15 0 24) = (375 819 486) mod 26 = (11 13 18) = LNS.
Continuing in this fashion, the ciphertext for the entire plaintext is
LNSHDLEWMTRW.
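The block-by-block computation can be sketched in plain Python. The function names are chosen here for illustration. The full plaintext is not quoted in the text, so the string "paymoremoney" below is an assumption; it is consistent with the first block (15 0 24) and with the 12-letter ciphertext.

```python
K = [[17, 17, 5],
     [21, 18, 21],
     [2, 2, 19]]

def mat_vec_mod(matrix, vector, modulus=26):
    """Multiply a matrix by a column vector, reducing each entry mod 26."""
    return [sum(a * b for a, b in zip(row, vector)) % modulus for row in matrix]

def hill_encrypt(plaintext, key):
    """Encrypt m letters at a time with C = KP mod 26."""
    m = len(key)
    nums = [ord(c) - ord("a") for c in plaintext.lower() if c.isalpha()]
    if len(nums) % m:
        raise ValueError("plaintext length must be a multiple of the block size")
    out = []
    for i in range(0, len(nums), m):
        out.extend(mat_vec_mod(key, nums[i:i + m]))
    return "".join(chr(n + ord("A")) for n in out)

first_block = mat_vec_mod(K, [15, 0, 24])          # the letters p, a, y
cipher = hill_encrypt("paymoremoney", K)           # assumed plaintext
print(first_block, cipher)
```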
Decryption requires the inverse of the matrix K. The inverse K^-1
of a matrix K is defined by K K^-1 = K^-1 K = I, where I is the identity
matrix (ones on the diagonal, zeros elsewhere) and all arithmetic is done
mod 26. The inverse of a matrix does not always exist, but when it does, it
satisfies the preceding equation. In this case, the inverse is
       |  4  9 15 |
K^-1 = | 15 17  6 |
       | 24  0 17 |
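That these two matrices really are inverses of each other mod 26 can be checked by multiplying them out. The sketch below does so and then uses K^-1 to decrypt the first ciphertext block LNS = (11 13 18); the helper name is chosen here for illustration.

```python
K = [[17, 17, 5], [21, 18, 21], [2, 2, 19]]
K_INV = [[4, 9, 15], [15, 17, 6], [24, 0, 17]]
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def mat_mul_mod(a, b, modulus=26):
    """Matrix product with every entry reduced mod 26."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) % modulus
         for j in range(len(b[0]))]
        for i in range(len(a))
    ]

product = mat_mul_mod(K, K_INV)

# Decrypt the first block: P = K^-1 C mod 26, with C as a column vector.
block = mat_mul_mod(K_INV, [[11], [13], [18]])
plain_numbers = [row[0] for row in block]
print(product == IDENTITY, plain_numbers)
```

The recovered numbers (15 0 24) are the letters p, a, y.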
ATTACKING HILL CIPHER
Although the Hill cipher is strong against a ciphertext-only attack
(the opponent has only ciphertext), it is easily broken with a
known-plaintext attack (the opponent has plaintext-ciphertext pairs). For an
m × m Hill cipher, suppose we have m plaintext-ciphertext pairs, each of
length m. We label the pairs P_j = (p_1j, p_2j, …, p_mj) and
C_j = (c_1j, c_2j, …, c_mj) such that C_j = K P_j mod 26 for 1 <= j <= m and
for some unknown key matrix K. Now define two m × m matrices X = (p_ij) and
Y = (c_ij). Then we can form the matrix equation Y = KX. If X has an
inverse, then we can determine K = Y X^-1. If X is not invertible, then
a new version of X can be formed from additional plaintext-ciphertext pairs
until an invertible X is obtained.
Suppose that the plaintext “friday” is encrypted using a 2 × 2 Hill cipher
to yield the ciphertext PQCFKU. Thus, we know that
K(5 17) = (15 16);
K(8 3) = (2 5);
K(0 24) = (10 20).
Using the first 2 plaintext-ciphertext pairs, we have
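| 15 2 |       | 5  8 |
| 16 5 |  = K  | 17 3 |  mod 26

The inverse of X mod 26 is

       | 9  2  |
X^-1 = | 1  15 |

so that

             | 15 2 | | 9  2  |   | 137 60  |          | 7  8 |
K = Y X^-1 = | 16 5 | | 1  15 | = | 149 107 | mod 26 = | 19 3 |

This result can be verified against all three plaintext-ciphertext pairs:

| 7  8 | | 5  |   | 35 + 136 |   | 171 |          | 15 |
| 19 3 | | 17 | = | 95 + 51  | = | 146 | mod 26 = | 16 |

| 7  8 | | 8 |   | 56 + 24 |   | 80  |          | 2 |
| 19 3 | | 3 | = | 152 + 9 | = | 161 | mod 26 = | 5 |

| 7  8 | | 0  |   | 192 |          | 10 |
| 19 3 | | 24 | = | 72  | mod 26 = | 20 |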
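The known-plaintext attack on the 2 × 2 example can be reproduced in a few lines. This is a sketch with helper names chosen here; it computes X^-1 from the adjugate and the modular inverse of the determinant, using Python's three-argument `pow` (Python 3.8 or later), which raises `ValueError` when the determinant has no inverse mod 26.

```python
def inverse_2x2_mod(m, modulus=26):
    """Inverse of a 2x2 matrix mod 26 via adjugate and determinant inverse."""
    (a, b), (c, d) = m
    det = (a * d - b * c) % modulus
    det_inv = pow(det, -1, modulus)      # ValueError if X is not invertible
    return [[d * det_inv % modulus, -b * det_inv % modulus],
            [-c * det_inv % modulus, a * det_inv % modulus]]

def mat_mul_mod(a, b, modulus=26):
    """Matrix product with every entry reduced mod 26."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) % modulus
         for j in range(len(b[0]))]
        for i in range(len(a))
    ]

X = [[5, 8], [17, 3]]      # columns are the plaintext pairs "fr" and "id"
Y = [[15, 2], [16, 5]]     # columns are the ciphertext pairs "PQ" and "CF"

X_INV = inverse_2x2_mod(X)
K_RECOVERED = mat_mul_mod(Y, X_INV)

# Check the recovered key against the third pair: "ay" -> "KU".
third = mat_mul_mod(K_RECOVERED, [[0], [24]])
print(X_INV, K_RECOVERED, [row[0] for row in third])
```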