Lab1 Manual
ENGR-UH 4322
RSA uses clever mathematics to make this possible. Plenty of information on RSA is available
online; see this video for a simplified explanation of the RSA algorithm. The basic operations
involve the numbers N, E, and D. These numbers are not generated randomly, but generating them
is outside the scope of this lab. An example of one set of such numbers:
- N: 14
- E: 5
- D: 11
To encrypt the character ‘B’, for example, it must first be represented by a number. Its ASCII
value could be used, but for simplicity it will be represented by the number 2, since it is the
second letter. To encrypt this character:
ciphertext = 2^E mod N = 2^5 mod 14 = 4
To decrypt:
plaintext = 4^D mod N = 4^11 mod 14 = 2
You can try these calculations yourself with other plaintext characters.
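The example can be checked with a few lines of Python. This is only a sketch for the toy numbers above: the helper names encrypt and decrypt are illustrative and are not taken from the lab code, and letters are numbered A=1, B=2, and so on, as in the text.

```python
# Toy RSA parameters from the example above (far too small for real use).
N, E, D = 14, 5, 11


def encrypt(m: int) -> int:
    """Raise the plaintext number to the power E, modulo N."""
    return pow(m, E, N)


def decrypt(c: int) -> int:
    """Raise the ciphertext number to the power D, modulo N."""
    return pow(c, D, N)


# 'B' is the second letter, so it is represented by 2.
plaintext = ord("B") - ord("A") + 1
ciphertext = encrypt(plaintext)
print(ciphertext, decrypt(ciphertext))  # prints: 4 2
```

Only plaintext numbers smaller than N survive the round trip, so with N = 14 this numbering works for the letters A through M.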
The Side Channel
Exponentiation in modular arithmetic[1] happens all the time in RSA. However, unlike the simple
example above, the numbers are much larger, which makes modular operations such as modular
exponentiation much slower. To improve performance, implementations of RSA rely on
exponentiation by squaring, which requires fewer operations. As part of the algorithm (see
figure 1), there is an operation that occurs only when a bit of the exponent is 1.
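A generic version of exponentiation by squaring, processing the exponent bits right-to-left, might look like the sketch below. It is not the lab's implementation, which may differ in detail; the function name modexp and the multiplication counter are added here purely for illustration.

```python
def modexp(base: int, exponent: int, modulus: int):
    """Right-to-left square-and-multiply.

    Returns (result, multiplies), where multiplies counts how many times the
    conditional multiplication ran, i.e. the number of 1 bits in the exponent.
    """
    result = 1
    base %= modulus
    multiplies = 0
    while exponent > 0:
        if exponent & 1:
            # This step happens only when the current exponent bit is 1.
            result = (result * base) % modulus
            multiplies += 1
        # The squaring happens for every bit, regardless of its value.
        base = (base * base) % modulus
        exponent >>= 1
    return result, multiplies


print(modexp(4, 11, 14))  # 11 is 1011 in binary: prints (2, 3)
```

The conditional multiplication is the operation whose presence or absence depends on the key, which is what makes the running time key dependent.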
RSA can be further optimized by using Montgomery multiplication and the Chinese Remainder
Theorem (CRT)[2]. With CRT, the input y is first reduced modulo p, that is, y mod p is computed.
If y is smaller than p, the reduction has nothing to do; if y is larger than p, then p must be
subtracted from y at least once. This potential extra subtraction introduces some variation in
the computation time.
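The data-dependent extra work can be made visible with a deliberately naive reduction. Real implementations do not reduce by repeated subtraction, so the function below (a hypothetical name, reduce_by_subtraction) only illustrates why the amount of work depends on how y compares with p.

```python
def reduce_by_subtraction(y: int, p: int):
    """Compute y mod p by repeated subtraction.

    Returns (remainder, subtractions). Assumes y >= 0 and p > 0.
    """
    subtractions = 0
    while y >= p:
        y -= p
        subtractions += 1
    return y, subtractions


print(reduce_by_subtraction(5, 7))  # y < p, no subtraction: prints (5, 0)
print(reduce_by_subtraction(9, 7))  # y > p, one subtraction: prints (2, 1)
```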
While these optimizations are good for performance, they do introduce leakage. The extra
computation means that the processor takes more time. If the time needed for a modular
exponentiation operation can be measured, the key bits can be deduced one by one from how long
each operation took.
Attacker Model
For the attack to work, it is assumed that the attacker has access to the input messages, the
output signatures, and the amount of time it took to sign each message. For the sake of
simulation, the code includes a C++ file called data.cpp that generates random messages, signs
them using RSA, and measures the time it took to sign each message. This information is then
written to a file called data.csv.
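To make the attacker's view concrete, the sketch below collects the same three pieces of information for a toy key. It is an assumption-laden illustration, not a port of data.cpp: the column names message, signature and duration_ns, the helper names, and the use of Python's built-in pow are all made up here, and the real layout of data.csv may differ.

```python
import csv
import io
import random
import time


def collect_samples(n: int, d: int, count: int, seed: int = 0):
    """Sign random messages and record (message, signature, nanoseconds)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(count):
        message = rng.randrange(2, n)
        start = time.perf_counter_ns()
        signature = pow(message, d, n)  # stand-in for the lab's signing routine
        elapsed = time.perf_counter_ns() - start
        rows.append((message, signature, elapsed))
    return rows


def to_csv(rows) -> str:
    """Serialise the samples as CSV text (hypothetical column names)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["message", "signature", "duration_ns"])
    writer.writerows(rows)
    return buffer.getvalue()


# Toy key built from small primes; pow(e, -1, m) needs Python 3.8 or newer.
p, q, e = 103, 97, 31
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))
rows = collect_samples(n, d, count=5)
print(to_csv(rows))
```

The timings produced this way reflect Python's own arithmetic rather than the leakage studied in this lab; the point is only the shape of the data the attacker works with.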
[1] Modular arithmetic (otherwise known as clock arithmetic) is when the number system is always
bound between 0 and N-1, which is done by applying mod N to every result.
[2] Section 7 of paper 1.2 of this week’s reading
Iterative Guessing
The attack can be treated as a signal detection problem. The “signal” consists of the timing
variation due to the target exponent bit, and the “noise” results from measurement inaccuracies
and timing variations due to unknown exponent bits. The properties of the signal and noise
determine the number of timing measurements required for the attack. The more measurements
are taken, the more the noise is “smoothed out”. The data set that you have access to is 10,000
entries long.
Since the exponentiation processes the exponent bits right-to-left, the initial key bit is set
to 1 and the attacker then attempts to guess the second key bit. This is done in the following
steps:
1. Sign all of the sample messages with the currently derived key[3]
2. Check whether a subtraction operation took place in the final iteration of the loop that
   performs the modular exponentiation, that is, the iteration for the final bit
   a. If a subtraction occurred, place that message in one bucket/bin
   b. If a subtraction did not occur, place that message in a separate bin
3. From the initial data set, look up all the messages where a subtraction did occur and
   calculate the average time taken to compute their signatures
4. Do the same for the other set, where a subtraction did not occur
5. Calculate the difference between the two averages and check whether it exceeds a certain
   threshold
   a. If it does, the next key bit guess is 1
   b. If not, the next key bit guess is 0
6. Check whether the derived key is the correct one; this can be done by signing a message
   from the data set and seeing whether the signature matches the one recorded in the data set
7. If the key isn’t fully constructed, repeat from step 1 with a slightly larger subkey
It is important to note that the threshold you will be using is system specific: its value
depends on the processor speed and the amount of background noise.
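Steps 2 to 5 can be sketched as a single function. This is a minimal illustration rather than the provided attack script: guess_next_bit and the had_subtraction predicate are hypothetical names, and the predicate stands in for re-running the exponentiation with the current subkey and reporting whether the last iteration needed a subtraction.

```python
from statistics import mean


def guess_next_bit(samples, had_subtraction, threshold):
    """Split samples into two bins and compare their average timings.

    samples:         iterable of (message, duration) pairs from the data set
    had_subtraction: callable(message) -> bool, supplied by the caller
    threshold:       system-specific cut-off for the difference of averages
    Returns (bit_guess, difference).
    """
    with_sub = [t for m, t in samples if had_subtraction(m)]
    without_sub = [t for m, t in samples if not had_subtraction(m)]
    if not with_sub or not without_sub:
        # One bin is empty, so there is no evidence either way.
        return 0, 0.0
    difference = mean(with_sub) - mean(without_sub)
    return (1 if difference > threshold else 0), difference


# Synthetic data: odd messages are made to take 100 ns longer than even ones.
samples = [(m, 1000 + (100 if m % 2 else 0)) for m in range(1, 11)]
print(guess_next_bit(samples, lambda m: m % 2 == 1, threshold=50))
```

With real measurements the two averages are noisy, which is why the difference is compared against a threshold that has to be tuned for each machine.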
Tasks
Task 0
Clone this repository and build the project. This can be done by doing the following:
[3] The first key guess would be 1
Task 1
Generate the data set by building the project and running the csv binary in the . The arguments
passed would determine the possible key values. The keys will be printed out on the console as
well as saved in the data.csv file. The csv binary takes the following arguments:
- p: 103
- q: 97
- e: 31
- number of samples: 10,000
The binary needs a while to run. Once it is done, it will create the data.csv file within the
same directory. The binary will also output the total time taken for all signatures; make a note
of this number. Afterwards, you must move the data.csv file elsewhere in order for the script
to work properly:
- navigate out of the build directory and into the Attack/output/ directory
- create a folder inside.
- move the data.csv file from the build directory to your newly created folder
It is better to SSH into your computer to limit the noise from VNCviewer. It is also better to
close all other applications on the computer while the data is being generated. If the data is
still too noisy (unlikely as that may be), you can use the pregenerated data in the output
directory.
[4] Refer to the instructor if you cannot find the IP
[5] If you are using Windows, you might want to download WSL (Windows Subsystem for Linux) or
PuTTY.
- You will then be logged in as the nyuad user, and your terminal will initially be placed in
your home directory.
Task 2
Run the attack script and find the correct cut-off value. To run the script:
Submit a text file named duration.txt. This file should contain the value of T that yielded the
correct key guess.
Task 3
Repeat task 2, but this time rerun the csv binary using much larger p and q values. You can
decide which two prime numbers to use, but both must be at least 10 digits long[6]. This is
closer to how RSA is typically used, since factoring very large numbers is computationally very
expensive.
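If you need a way to pick suitable primes, a small primality test is enough at this size. The sketch below uses the Miller-Rabin test with the first twelve primes as bases, which is known to be deterministic for numbers far larger than 10 digits; is_prime and next_prime are illustrative helper names, and whether the csv binary validates its inputs is not something this manual states.

```python
def is_prime(n: int) -> bool:
    """Miller-Rabin with fixed bases; deterministic well beyond 10-digit inputs."""
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for s in bases:
        if n % s == 0:
            return n == s
    # Write n - 1 as d * 2^r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def next_prime(start: int) -> int:
    """Return the smallest prime greater than or equal to start."""
    candidate = max(start, 2)
    while not is_prime(candidate):
        candidate += 1
    return candidate


print(next_prime(10**9))  # smallest 10-digit prime: prints 1000000007
```

Whatever primes you choose, e must still share no common factor with (p-1)(q-1); otherwise no matching private exponent exists.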
Report any interesting findings when running the experiments. You can look out for the following
things:
1. Did the data generation take longer? How much longer? (The reported time is measured in
   nanoseconds.)
2. Were you able to find the key using only 10,000 samples? If so:
   a. Was the threshold value found similar to the one in the first experiment?
   b. Did you need to do more iterations before finding the correct threshold value?
3. If not:
   a. Why do you think you weren’t able to?
[6] Typical p and q values are 512-bit integers, so even a 10-digit integer is inadequate for
security purposes.