Physically Unclonable Functions: From Basic Design Principles to Advanced Hardware Security Applications

Basel Halak
Southampton, UK
Computing devices are increasingly forming an integral part of our daily life; this
trend is driven by the proliferation of the Internet of things (IoT) technology. By
2025, it is anticipated that the IoT paradigm will encompass approximately 25
billion connected devices. The interconnection of such systems provides the ability
to collect huge amounts of data which are then processed and analysed for further
useful actions. Applications of this technology include personal health monitoring
devices, smart home appliances, smartphones, environmental monitoring systems
and critical infrastructure (e.g. power grids, transportation systems and water
pipes).
The pervasive nature of this technology means we will soon be finding com-
puting devices everywhere around us, in our factories, homes, cars and even in our
bodies in the form of medical implants. A significant proportion of these devices
will be storing sensitive data. Therefore, it is very important to ensure that such
devices are safe and trustworthy; however, this is not an easy task, and there are
still many challenges ahead.
First, there is a rising risk of hardware Trojan insertion because of the inter-
national and distributed nature of the integrated circuit production business.
Second, security attacks, especially those that require physical access to the device
under attack, are becoming more feasible given the pervasive nature of IoT tech-
nology. Third, most IoT devices are considered to be resource-constrained
systems, which makes it prohibitively expensive to implement classic cryptographic
algorithms; in those cases, a cheaper and more energy-efficient solution is required.
Physically unclonable functions (PUFs) are a class of novel hardware security
primitives that promise a paradigm shift in many security applications; their rela-
tively simple architectures can answer many of the above security challenges. These
functions are constructed to exploit the intrinsic variations in the integrated circuit
fabrication process in order to give each silicon chip a unique identifier, in other
words, a hardware-based fingerprint.
The versatile security applications of the PUF technology mean that an
increasing number of people must understand how it works and how it can be used
in practice. This book addresses this issue by providing a comprehensive
introduction to the design, evaluation metrics and security applications of physi-
cally unclonable functions. It is written to be easily accessible to both students and
engineering practitioners.
Chapter 1 of this book gives a summary of the existing security attacks and
explains the principles of the cryptographic primitives used as the building blocks of
security defence mechanisms; it then introduces the physically unclonable function
(PUF) technology and outlines its applications. Chapter 2 explains the origin of
physical disorder in integrated circuits and explains how this phenomenon can be
used to construct different architectures of silicon-based PUFs; it also outlines the
metrics used to evaluate a PUF design and gives the reader an insight into the design
and implementation of PUF on configurable hardware platforms. Chapter 3 explains
the physical origins of the major reliability issues affecting CMOS technology and
discusses how these issues can affect the usability of PUF technology; it also pre-
sents a case study on the evaluation of the impact of CMOS ageing on the quality
metrics of PUF designs. Chapter 4 provides a comprehensive tutorial on the design
and implementation principles of error correction schemes typically used for reli-
able PUF designs; it also explains in detail the state-of-the-art reliability
enhancement approaches applied at the chip post-fabrication stage.
Chapter 5 investigates the security of PUF as cryptographic primitives; it discusses
the existing attacks on PUFs and possible countermeasures and, in addition, intro-
duces a number of quality metrics to evaluate the security of PUF designs. Chapter 6
focuses primarily on how to use PUF technology in practice; it explains in detail how
PUF technology can be used to securely generate and store cryptographic keys,
construct hardware-assisted security protocols, design low-cost secure sensors,
develop anti-counterfeit solutions and implement anti-tamper integrated circuits.
This book has many features that make it a unique source for students, engineers
and educators. The topics are introduced in an accessible manner and supported
with many examples. The mathematics is explained in detail to make it easy to
understand. Detailed circuit diagrams are provided for many of the PUF architec-
tures to allow reproducibility of the materials. Each chapter includes a list of
worked exercises, which can be an excellent resource for classroom teaching. In
addition, a list of problems is provided at the end of each part; in total, this book
contains more than 80 problems and worked examples. The appendices include an
exemplar digital PUF design written in SystemVerilog and the MATLAB
scripts used in this book to characterise PUF quality metrics. The detailed examples
of PUF applications in Chap. 6 can be an excellent source for course projects. Each
chapter ends with a conclusion, which summarises the important lessons and out-
lines the outstanding research problems in the related areas. The book has a large
number of references which give plenty of materials for further reading.
How to Use this Book
The material of this book has evolved over many years of working on this topic in
research and teaching. From our experience, one can teach most of the book's
contents in a one-semester course comprising 36 one-hour sessions. Some
of the book's chapters can also be taught separately as short courses or as part of
other modules. Here are a couple of examples.
I hope you enjoy reading this book as much as I enjoyed writing it.
1.1 Introduction
It was 2 a.m. when Michael and Jane heard a man's voice, 'Wake up
baby, wake up…'. The couple ran to their 2-year-old daughter's room. The baby was
fast asleep, but the web-enabled baby monitor was moving; this was a device Michael
and Jane had bought a year earlier to make it easier for them to check up on their
daughter using an app on their smartphones. The couple then discovered that a hacker
had managed to gain control of the device and was using it to watch and harass their
child. One year later, a mother from New York City was horrified when she dis-
covered by accident that a hacker was spying on her 3-year-old child using a similar
device. One can only imagine the parents’ shock when faced with such an unfortunate
incident. These were not rare events; in fact, in the United States alone there were
more than seven similar stories reported in the media between 2013 and 2015.
A more serious security breach took place in 2015, when a pair of hackers
demonstrated how they could remotely hack into the 2014 Jeep Cherokee; what is
more, they showed how they could take control of the vehicle and completely
paralyse it whilst it was driven on the motorway. Later on, during the Black Hat
conference, the two researchers (Charlie Miller and Chris Valasek) explained in detail
how they had achieved this. They started by hacking into the multimedia system of
the Jeep through the Wi-Fi connection; it emerged that it is not very difficult to do
this because the Wi-Fi password is generated based on the time when the vehicle
and its multimedia system are turned on for the first time. Although such an approach
of password generation is deemed to be secure, a hacker who manages to learn the
year the car was manufactured and guess the month it was first used can reduce the
possible Wi-Fi passwords to around 15 million combinations, a small number from
a hacker's perspective, as it allows him to find the right password in a few hours
using a brute-force search algorithm.
The above stories provide some useful insights into the current security chal-
lenges facing electronics systems’ designers.
First of all, there is an increasing number of computing devices forming an integral
part of our daily lives; this trend is driven by the proliferation of the Internet of things
(IoT) technology, which is a network of physical objects connected to the Internet
infrastructure to perform tasks without human interaction. These objects are typically
embedded with electronics, software and sensors; this enables them to collect and
exchange data. The network connectivity allows these devices to be operated and
controlled remotely. The IoT technology is expected to be used in a wide variety of
applications, such as personal health monitoring devices, smart home appliances,
smart cars, environmental monitoring systems and critical infrastructure (e.g. power
grids, transportation systems and water pipes) [1]. By 2020, it is anticipated that the
IoT paradigm will include approximately 20 billion connected devices.
Second, there is an increasing reliance on portable hardware devices to carry out
more security-sensitive tasks. The most prominent examples are the smartphones,
enhanced with multitudes of sophisticated applications; these devices form an
integral part of modern life. They are currently being used for socialising with
friends (e.g. Facebook), finding a partner (e.g. dating apps), shopping, streaming
movies, gambling, carrying out bank transactions and doing business. In fact, we
can safely assume that a mobile device contains more information about its owner
than their most intimate partner will ever know.
Furthermore, there is an increasing number of emerging applications which
have stringent hardware security requirements, for example, mobile payment,
biometric access control mechanisms and subscription TV channels.
Designing secure systems, however, is not an easy task, and the complexity of
computing devices is increasing rapidly; in fact, current electronics systems can have
more than one billion transistors. In order to ensure the security of such systems,
one needs to verify that the system is doing exactly what it is supposed to do, nothing
more and nothing less; however, given the complex nature of modern designs and the
large number of possible interactions between different components on the same
chip (processors, memories, buses, etc.), validating the security of a device becomes
an intractable problem.
On the other hand, there are many parties who stand to gain from weakly secured
systems. For example, an adversary who manages to break the security of a smart
card can potentially steal a large amount of money, and a customer who can gain
unauthorised access to a set-top box to watch paid TV channels can save subscription
fees. What is more, adversaries may have more sinister goals than accumulating
wealth; for example, there is nothing preventing a resourceful adversary from
crashing a weakly secured smart car and killing its passengers; he may not even be
charged with the crime.
It is hoped that this chapter will help the reader to develop a good understanding
of the motivation for secure hardware design and how physically unclonable
functions fit into this context.
In order to understand the scope and the nature of security attacks on electronics
systems, we will consider a simple scenario that is applicable to a vast majority of
applications. Envisage two communicating parties, Bob and Alice as shown in
Fig. 1.1. Each of them uses an electronic device to send/receive messages over a
communication link.
Envisage an adversary Eve who would like to spy on Bob and Alice; in prin-
ciple, Eve can try to compromise one of the three components of such a system,
namely, the communication link, the software running on Bob’s or Alice’s elec-
tronics devices or their respective hardware circuitry.
One can argue that Bob and Alice are also part of this system so they can also be
a target for Eve’s attack. Although this is true, this discussion is limited to the threat
against electronics systems and does not include social engineering attacks.
The remainder of this section gives examples of the known security attacks on
each of the above-stated components.
These attacks aim to maliciously modify or tamper with the physical implemen-
tation of a computing device. They generally require direct access to the hardware
or the design files, but can sometimes be carried out remotely, as was the case with the
Stuxnet attack described above, wherein a manipulation of the software controlling
the centrifuges led to physically damaging them; these are referred to as cyber-physical
attacks.
In principle, hardware attacks can take place at any time during the life cycle of
the computing device; based on this, they can be classified into two categories:
(a) Attacks during design and fabrication: An example of this type is Trojan
insertion, which consists of adding extra circuitry that has malicious func-
tionality into a design (e.g. a kill switch, a time bomb, etc.) [4]. Overpro-
duction of integrated circuits is another example, wherein a malicious
fabrication facility produces more chips than required, these are subsequently
sold in the black market; this is referred to as IC counterfeiting [5].
(b) Post-fabrication attacks (also referred to as physical attacks) take place after
the device is put in operation. This type can be further classified into three
categories:
• Invasive attacks that require access to the internal structure of the inte-
grated circuits: an example of this type is reverse engineering attacks which
aim to steal the intellectual property of a design.
• Non-invasive physical attacks, wherein an adversary interacts with the
hardware externally: one example of this type is side-channel analysis,
wherein an attacker analyses the power consumption or the electromag-
netic emissions of a device in order to deduce sensitive information [6–8];
another example is data remanence, wherein one can retrieve information
stored in a device's memory even after it has been deleted [9].
• Semi-invasive attacks which require access to the surface of a device but
not the internal logic: a prime example of this type is optical fault injection
[10], wherein illuminating a device can cause some of its transistors to
conduct current, which may trigger an error.
Asymmetric ciphers are encryption algorithms that, unlike symmetric ciphers, do not
use the same key for the encryption and decryption processes; instead, each user has a
public key that can be made widely available and a private key known only to them.
If a sender ‘Bob’ wants to transmit an encrypted message to a receiver ‘Alice’, he
encrypts his messages using Alice’s public key, normally obtained from a trusted
third party. Only Alice would be able to decipher Bob’s messages using her private
key.
The security of asymmetric ciphers is based on the difficulty of solving a math-
ematical problem such as factorising a large integer or computing discrete
logarithms. Such problems are prohibitively complex to solve unless one has access
to additional information, which is kept secret and used for decryption by authentic
receivers (i.e. the private key).
One advantage of asymmetric ciphers compared to their symmetric counterparts
is that they do not need a prior exchange of keys. On the other hand, these
algorithms typically require more computational resources.
In practice, asymmetric ciphers are used to construct key exchange protocols to
help communicating parties to agree on an encryption key; subsequently, a sym-
metric cipher is used for data encryption.
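To make the public/private key idea concrete, here is a toy textbook-RSA example in Python (not from this book; the tiny primes are purely illustrative and far too small for real use). Its security rests on the difficulty of factorising n:

```python
# Toy RSA with textbook-sized primes; real keys use primes of hundreds
# of digits, padding schemes and a vetted cryptographic library.
p, q = 61, 53
n = p * q                    # public modulus (3233)
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent (2753); Python 3.8+

message = 65
cipher = pow(message, e, n)  # anyone can encrypt with the public key (e, n)
plain = pow(cipher, d, n)    # only the holder of d can decrypt
assert plain == message
```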
A hash function maps arbitrary-length inputs to a short fixed-length output (digest).
A hash function should satisfy a number of requirements to be suitable for security
applications. First of all, it should be deterministic (the same digest is obtained if
the input is the same). Second, it should be extremely hard to regenerate a message
from its hash value (i.e. digest). Third, it should be infeasible to find two messages
which have the same hash value. Fourth, it should have a large avalanche effect,
which means a small change in its input leads to a significant change in the output;
this makes it harder for an adversary to build a correlation between messages
and their digests.
A classic application of this primitive is the secure storage of password files:
instead of storing the users' passwords in clear text (i.e. unencrypted),
which is a significant vulnerability, one can store the digest of each password; this
makes it harder for an adversary to obtain the stored passwords if the server is
compromised. In this case, to authenticate a user, a server recreates a hash value of
the password presented at the time of authentication and then compares it with a
previously stored hash. Keyed hash functions can be built using symmetric ciphers.
One-way hash functions are also used to construct message authentication
codes such as HMAC (hash-based message authentication code) [13].
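A minimal Python sketch of both ideas follows (the password, salt size and iteration count are illustrative assumptions): a salted, slow digest is stored instead of the password, and a keyed hash (HMAC) authenticates a message.

```python
import hashlib
import hmac
import os

password = b"correct horse battery staple"

# Store only (salt, digest); PBKDF2 applies the hash many times to
# slow down brute-force guessing if the server is compromised.
salt = os.urandom(16)
digest = hashlib.pbkdf2_hmac("sha256", password, salt, 100_000)

# Authentication: recompute the digest from the presented password
# and compare in constant time.
attempt = hashlib.pbkdf2_hmac("sha256", password, salt, 100_000)
assert hmac.compare_digest(digest, attempt)

# A keyed hash (HMAC) binds a digest to a shared secret key.
key = os.urandom(32)
tag = hmac.new(key, b"transfer 100 GBP", hashlib.sha256).hexdigest()
print(tag)
```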
Random number generators are mainly used to generate nonces; a nonce is an
arbitrary number used only once. Nonces are employed as initialisation vectors for
encryption algorithms and in authentication protocols to prevent replay attacks.
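In Python, for example, a cryptographically strong nonce can be drawn from the operating system's random source (a small illustrative sketch):

```python
import secrets

# 16 fresh random bytes; suitable as an initialisation vector or as a
# one-time value in a challenge-response authentication protocol.
nonce = secrets.token_bytes(16)
print(nonce.hex())
```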
An oblivious transfer (OT) protocol in its simplest form enables a sender to transfer
one or multiple data items to a receiver while remaining oblivious to what pieces of
information have been sent (if any). One form of this scheme is called 1-of-2
oblivious transfer [14]; it allows one party (Bob) to retrieve one of two possible pieces
of information from another party (Alice), such that Bob gains no knowledge
of the piece of data he has not retrieved, nor does Alice establish which of
the two data items she holds has been transferred to Bob. The 1-of-2 oblivious
transfer has been later generalised to k-of-n OT [15], which can be used to construct
secure multiparty computation schemes.
The security of the physical layer (i.e. the hardware) of electronics systems has been
gaining increasing attention due to a number of issues summarised below:
This is due to the rise of the Internet of things technology, which consists of con-
necting billions of computing devices to the Internet; such devices can include
vehicles, home appliances, industrial sensors, body implants, baby monitors and
literally any physical object which has sufficient capability to communicate over the
Internet [20]. The pervasive nature of this technology makes it easier for an ad-
versary to get hold of a device and carry out well-established physical attacks to
extract sensitive data, inject a fault or reverse engineer its design.
Conventional examples of secure hardware tokens include smart cards and staff
cards. In recent years, these tokens have been increasingly relied upon for cryptographic
operations in a range of security-sensitive applications. For example, numerous
European countries have implemented electronic identity systems used for services
such as tax payment and retirement fund management. Examples of such devices
are ID-porten in Norway and Telia ID in Sweden.
Another use of secure hardware tokens is in secure financial transactions; these
days, the majority of financial institutions offer their clients the ability to access their
accounts online and carry out various tasks such as balance checking, money transfers
and setting up direct debits. To enhance the security of remote account management,
extra measures are needed which go beyond the typical login/password approach.
To meet such requirements, many banks have started giving their customers
hardware devices, which are used to confirm their identities and sometimes to
generate a digital signature to confirm the details of a transaction they are trying to
make (amount, beneficiary name, etc.).
A third application area for secure hardware tokens is secure multiparty com-
putation, wherein a number of parties need to carry out a joint computation based
on their private individual inputs; this type of computation is used in private data
mining, electronic voting and anonymous transactions. Protocols that meet the
security requirements of such applications do not typically have efficient imple-
mentations in practice [21]. This has given rise to hardware-assisted cryptographic
protocols. The latter rely on the use of tamper-proof hardware tokens to help
achieve strong security guarantees, as set out in Canetti's universal composability
(UC) framework [22]. In this type of protocol, the trust between communicating
parties is established through the exchange of trusted hardware tokens.
The fabrication processes of physical objects can sometimes have limitations
which make it difficult to exert exact control over the devices being manufactured;
this leads to slight variations in the dimensions of the resulting products. This limitation
is especially pronounced in the fabrication of semiconductor devices in advanced
technology nodes (below 90 nm) [23–25]. These intrinsic process variations lead to
fluctuations in transistor length, width, oxide thickness and doping
levels, which makes it impossible to create two identical devices. This
means the same transistor fabricated on different devices may have slightly different
electrical characteristics.
A physically unclonable function (PUF) exploits these inherent process varia-
tions to generate a unique identifier for each hardware device.
A PUF can be defined as a physical entity whose behaviour is a function of its
structure and the intrinsic variation of its manufacturing process. This means two
PUF devices will have two distinct input/output behaviours even if they have
identical structures because of process variations [26]. PUFs can be realised using
integrated circuits, in which case, they are referred to as silicon-based PUFs. A ring
oscillator is the simplest example of a PUF as it generates a distinct frequency for
each chip it is implemented on.
PUFs are considered to be cryptographic primitives and can be used as a basic
building block to construct security protocols and design secure systems. This
primitive differs from those described in Sect. 1.4 in one crucial aspect, that is, the
security of PUFs is based on the difficulty of replicating their exact physical
structure rather than on the difficulty of solving a mathematical problem (e.g.
factorising a large integer, as in the case of the RSA asymmetric cipher).
PUFs have a number of security-related applications, some of which are already
being integrated into commercial products and the rest are still under development.
We will briefly summarise below their current applications:
The relatively simple structure of PUF designs makes them an attractive choice for
low-cost authentication schemes, wherein the unique challenge/response behaviour
of a PUF is used as an identity for physical objects. This is especially useful for
resource-constrained systems such as Internet of things devices, which cannot afford
classic security solutions [20].
conditions. To do this, the correlation between the PUF responses and the physical
quantities being measured should be characterised before the sensor is deployed.
This approach makes it feasible to construct low-cost secure remote sensing
schemes, as it removes the need to implement a separate encryption block on the
device, because the readings generated by the PUF cannot be understood by an ad-
versary who has not characterised the behaviour of the PUF circuits; there are several
examples of PUF-based secure sensing schemes proposed in the literature [29–31].
Currently, a number of companies are driving the development of
PUF-based security solutions in the above application areas, and they are also
exploring other usages [32–34].
1.7 Conclusions
1.8 Problems
1. What are the main security attacks on electronics systems?
2. What are the main factors a designer needs to consider when developing
security defence mechanisms?
3. Why is it challenging to validate the security of an electronic system such as
the one found in a smart mobile phone?
4. What is the main requirement a hash function needs to satisfy in order to be
suitable for security-sensitive applications?
5. What is a hardware Trojan?
6. Name three malicious functionalities a hardware Trojan may perform.
7. What is a physically unclonable function?
8. Explain the difference between physically unclonable functions and other
cryptographic primitives such as symmetric ciphers.
References
1. A. Zanella, N. Bui, A. Castellani, L. Vangelista, M. Zorzi, Internet of things for smart cities.
IEEE Internet Things J. 1, 22–32 (2014)
2. K. Nohl, J. Lell, BadUSB: on accessories that turn evil, Security Research Labs. Black
Hat USA Presentation (2014)
3. R. Poroshyn, Stuxnet: The True Story of Hunt and Evolution (Createspace Independent Pub,
2014)
4. M. Tehranipoor, F. Koushanfar, A survey of hardware Trojan taxonomy and detection. IEEE
Des. Test Comput. 27, 10–25 (2010)
5. M. Rostami, F. Koushanfar, R. Karri, A primer on hardware security: models, methods, and
metrics. Proc. IEEE 102, 1283–1295 (2014)
6. B. Halak, J. Murphy, A. Yakovlev, Power balanced circuits for leakage-power-attacks
resilient design. Sci. Inf. Conf. (SAI) 2015, 1178–1183 (2015)
7. C. Clavier, J.S. Coron, N. Dabbous, Differential power analysis in the presence of hardware
countermeasures, in Proceedings of the Second International Workshop on Cryptographic
Hardware and Embedded Systems, vol. 1965 LNCS (2000), pp. 252–263
8. M.L. Akkar, Power analysis, what is now possible, in ASIACRYPT (2000)
9. S. Skorobogatov, Data remanence in flash memory devices, in Presented at the Proceedings
of the 7th International Conference on Cryptographic Hardware and Embedded Systems
(Edinburgh, UK, 2005)
10. S.P. Skorobogatov, R.J. Anderson, Optical fault induction attacks, in Cryptographic
Hardware and Embedded Systems—CHES 2002: 4th International Workshop Redwood
Shores, CA, USA, August 13–15, 2002 Revised Papers, ed. by B.S. Kaliski, Ç.K. Koç, C. Paar
(Springer Berlin Heidelberg, Berlin, Heidelberg, 2003), pp. 2–12
11. Electronic Frontier Foundation, Cracking DES: Secrets of Encryption Research, Wiretap
Politics & Chip Design (Electronic Frontier Foundation, 1998)
12. J. Daemen, V. Rijmen, The Design of Rijndael: AES—The Advanced Encryption Standard
(Springer Berlin Heidelberg, 2013)
13. D.R. Stinson, Universal hashing and authentication codes, in Advances in Cryptology—
CRYPTO ’91: Proceedings, ed. by J. Feigenbaum (Springer Berlin Heidelberg, Berlin,
Heidelberg, 1992), pp. 74–85
14. S. Even, O. Goldreich, A. Lempel, A randomized protocol for signing contracts. Commun.
ACM 28, 637–647 (1985)
15. C.-K. Chu, W.-G. Tzeng, Efficient k-Out-of-n oblivious transfer schemes with adaptive and
non-adaptive queries, in Public Key Cryptography—PKC 2005: 8th International Workshop
on Theory and Practice in Public Key Cryptography, Les Diablerets, Switzerland, January
23–26, 2005. Proceedings, ed. by S. Vaudenay (Springer Berlin Heidelberg, Berlin,
Heidelberg, 2005), pp. 172–183
16. M. Backes, A. Kate, A. Patra, Computational verifiable secret sharing revisited, in Advances
in Cryptology—ASIACRYPT 2011: 17th International Conference on the Theory and
Application of Cryptology and Information Security, Seoul, South Korea, December 4–8,
2011. Proceedings, ed. by D.H. Lee, X. Wang (Springer Berlin Heidelberg, Berlin,
Heidelberg, 2011), pp. 590–609
17. T. Eccles, B. Halak, A secure and private billing protocol for smart metering, in IACR
Cryptology ePrint Archive, vol. 2017 (2017), p. 654
18. S. Adee, The hunt for the kill switch. IEEE Spectr. 45, 34–39 (2008)
19. S. Mitra. (2015, January 2) Stopping hardware Trojans in their tracks. IEEE Spectr.
20. W. Trappe, R. Howard, R.S. Moore, Low-energy security: limits and opportunities in the
internet of things. IEEE Secur. Priv. 13, 14–21 (2015)
21. C. Hazay, Y. Lindell, Constructions of truly practical secure protocols using standard
smartcards, in Presented at the Proceedings of the 15th ACM Conference on Computer and
Communications Security (Alexandria, Virginia, USA, 2008)
22. R. Canetti, Universally composable security: a new paradigm for cryptographic protocols, in
Presented at the Proceedings of the 42nd IEEE Symposium on Foundations of Computer
Science (2001)
23. B. Halak, S. Shedabale, H. Ramakrishnan, A. Yakovlev, G. Russell, The impact of variability
on the reliability of long on-chip interconnect in the presence of crosstalk, in International
Workshop on System-Level Interconnect Prediction (2008), pp. 65–72
24. D.J. Frank, R. Puri, D. Toma, Design and CAD challenges in 45 nm CMOS and beyond, in
IEEE/ACM International Conference on Computer-Aided Design (2006), pp. 329–333
25. C. Alexander, G. Roy, A. Asenov, Random-dopant-induced drain current variation in
nano-MOSFETs: a three-dimensional self-consistent Monte Carlo simulation study using
(Ab initio) ionized impurity scattering. Electron Devices, IEEE Trans. 55, 3251–3258 (2008)
26. D. Lim, J.W. Lee, B. Gassend, G.E. Suh, M. van Dijk, S. Devadas, Extracting secret keys
from integrated circuits. IEEE Trans. Very Large Scale Integr. VLSI Syst. 13, 1200–1205
(2005)
27. Y. Alkabani, F. Koushanfar, M. Potkonjak, Remote activation of ICs for piracy prevention and digital
right management. IEEE/ACM Int. Conf. Comput.-Aided Design 2007, 674–677 (2007)
28. U. Rührmair, Oblivious transfer based on physical unclonable functions, in Trust and
Trustworthy Computing: Third International Conference, TRUST 2010, Berlin, Germany,
June 21–23, 2010. Proceedings, ed. by A. Acquisti, S.W. Smith, A.-R. Sadeghi (Springer
Berlin Heidelberg, Berlin, Heidelberg, 2010), pp. 430–440
29. Y.G.H. Ma, O. Kavehei, D.C. Ranasinghe, A PUF sensor: securing physical measurements, in
IEEE International Conference on Pervasive Computing and Communications Workshops
(PerCom Workshops) (Kona, HI, 2017), pp. 648–653
30. K. Rosenfeld, E. Gavas, R. Karri, Sensor physical unclonable functions, in IEEE
International Symposium on Hardware-Oriented Security and Trust (HOST) (Anaheim,
CA, 2010), pp. 112–117
31. H.M.Y. Gao, D. Abbott, S.F. Al-Sarawi, PUF sensor: exploiting PUF unreliability for secure
wireless sensing. IEEE Trans. Circuits Syst. I Regul. Pap. 64, 2532–2543 (2017)
32. Intrinsic-Id. (2017). Available: http://www.intrinsicid.com/products/
33. Verayo. (2017). Available: http://verayo.com/tech.php
34. Coherentlogix. (2017). Available: https://www.coherentlogix.com/products/hyperx-processors/
security/
2 Physically Unclonable Functions: Design Principles and Evaluation Metrics
2.1 Introduction
3. Introduce different architectures of silicon based PUF and show how these can
be constructed.
4. Outline the metrics used to evaluate PUF designs.
5. Give the reader an insight into the design and implementation of PUF on
configurable hardware platforms.
It is hoped that this chapter will give the reader the necessary skills and
background to be able to construct their own PUF devices and evaluate their
metrics.
The organisation of this chapter is as follows. Section 2.3 outlines the concept of
physical disorder; in Sect. 2.4, we look more closely into the conditions under
which integrated circuits can exhibit forms of physical disorder. Section 2.5
presents a generic framework for designing a PUF device using integrated circuit
design techniques. Examples of existing designs are discussed in great depth in
Sects. 2.6, 2.7 and 2.8, respectively. Section 2.9 summarises the important
metrics employed to assess the quality and usability of PUF circuit architectures.
Section 2.10 discusses in detail the design and implementation of a
configurable PUF architecture using field programmable gate arrays (FPGAs).
A comprehensive comparison of the characteristics of publicly available
ASIC implementations of PUFs is presented in Sect. 2.11. Learned lessons are
summarised in Sect. 2.12. Finally, problems and exercises are included in
Sect. 2.13.
Fig. 2.1 Examples of physical disorder: a a rose petal, b paper, c a coffee bean, d a tooth
we would find such disorder in abundance. Figure 2.1d shows a microscopic image of
a tooth; it may not be pretty, but it is certainly irregular.
More importantly, physical disorder can be found in abundance in modern
integrated circuits; this is because the continuous scaling of semiconductor
technologies has made it extremely difficult to fabricate precisely sized devices. For
example, Fig. 2.2 shows the irregular structure of the metal conductors in a
semiconductor chip fabricated in 90 nm technology.
This physical disorder is unique to each device and hard to replicate;
therefore, it can be used to give each physical object an identity.
this chapter. Physical sources of variability are mainly due to the fact that
achieving parameter precision becomes exponentially more difficult as
technology scales down, owing to the limitations imposed by quantum mechanics
[8, 9]; these variations can be attributed to different sources, namely [8, 10–14]:
The first set of process variations relate to the physical geometric structure of
MOSFET and other devices (resistors, capacitors) in the circuit. These typically
include:
(a) Film thickness variations: The gate oxide thickness (Tox) is a critical but
usually relatively well-controlled parameter. Variation tends to occur primarily
from one wafer to another, with good across-wafer and across-die control.
(b) Lateral dimension variations: Variations in lateral dimensions (channel length,
channel width) typically arise from photolithography proximity effects or plasma etch
dependencies. MOSFETs are well known to be particularly sensitive to effec-
tive channel length (Leff), as well as gate oxide thickness and, to some degree,
channel width. Of these, channel length variation is often singled out for
particular attention, due to the direct impact such variation can have on device
output current characteristics [11].
(a) Doping variations are due to dose, energy, angle or other ion implant
dependencies. Depending on the gate technology used, these deviations can
lead to some loss in the matching of NMOS versus PMOS devices, even where
within-wafer and within-die variations are very small.
(b) Deposition and anneal: Additional material parameter deviations are
observed in silicide formation and in the grain structure of poly or metal lines.
These variations may depend on the deposition and anneal processes. Such
material parameter deviations can contribute to appreciable contact and line
resistance variation.
(a) Metal resistivity (ρ): While metal resistivity variation can occur (and includes a
small random element), resistivity usually varies appreciably on a wafer-to-
wafer basis and is usually well controlled.
(b) Dielectric constant (ε) may vary depending on the deposition process, but is
usually well controlled.
(c) Contact and via resistance: Contact and via resistance can be sensitive
to etch and clean processes, with substantial wafer-to-wafer and random
components.
Fig. 2.3 The impact of variability on the electrical parameters of VLSI circuits
Given the above two requirements, one can develop a generic architecture for
silicon PUF circuits as shown in Fig. 2.5.
By comparing Figs. 2.4 and 2.5, we can deduce that the ring oscillator is a
transformation block and the frequency-measuring block is a conversion block. The
former turns the challenge (the enable signal) and the process variation of the
implementation technology (var) into a measurable quantity (i.e. the oscillation
frequency that is the inverse of delay), and the latter turns the measured frequency
into a binary value (i.e. the response).
Fig. 2.5 A generic architecture for silicon PUFs: a transformation block turns the challenge (c) and process variations into a measurable quantity (voltage, delay or current), which a conversion block then turns into the response (r)
This generic representation of the PUF architecture makes it easier to explore the
design space and re-use existing circuitry; this is because the design problem of the
PUF can now be thought of as designing two separate modules: a transformation
block and a conversion circuit, examples of both of which are widely available in
the literature. Examples of transformation blocks include digital to analogue con-
verters, ring oscillators, and current sources. Conversion circuits may include
analogue to digital converters, time to digital converters, phase decoding circuitry
[17] and comparators [18, 19].
The generic architecture presented in Fig. 2.5 gives us an intuitive way to
classify existing implementations of silicon-based PUF devices based on the mea-
surable quantity, as follows: voltage-based, current-based and delay-based PUFs.
It should be noted that there exist PUF constructions which do not fall
entirely under any of these three categories, such as those which exploit variations
in both driving currents and threshold voltages. Nevertheless, this categorisation
provides an intuitive method to approach the design problem of PUFs.
In the next section, we will review existing PUF circuits in detail and discuss
their different implementations.
These structures transform process variations into a measurable delay figure, which
is then converted into a binary response. Initial designs of PUF circuits have all
been based on this structure; there are many examples of such constructions in the
literature, including arbiter-based circuits [1, 20–22], ring oscillator designs [23–31]
and those based on asynchronous structures [31]. We will discuss examples of these
constructions in the following subsections.
Let us consider the arbiter-based structure shown in Fig. 2.6; it consists of two
digital paths with identical nominal delays and an arbiter. When an input signal is
applied, it propagates through these two paths and arrives at the inputs of the
arbiter at slightly different moments because of intra-die variability; the arbiter
outputs a logic "1" or a logic "0" depending on which path wins the race.
When such a structure is implemented on different chips, the response will vary
because of inter-die variations; in other words, it is unique to each implementation
and can therefore be used as a hardware signature.
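This behaviour is often captured with an additive delay model. The following Python sketch is a minimal illustration (the stage count, nominal delay and mismatch spread are assumed values, not taken from this book): each chip gets its own random stage delays, and the response bit records which path wins the race.

```python
import numpy as np

rng = np.random.default_rng(1)
K = 64  # number of switching stages (illustrative)

def make_chip():
    # Four delay segments per stage (straight/crossed, top/bottom),
    # each with Gaussian process-variation mismatch around a nominal
    # delay of 1.0 (arbitrary units).
    return rng.normal(loc=1.0, scale=0.05, size=(K, 4))

def arbiter_response(chip, challenge):
    """Return 1 if the top path beats the bottom path to the arbiter."""
    top = bottom = 0.0
    for stage, bit in zip(chip, challenge):
        if bit == 0:   # straight connection
            top, bottom = top + stage[0], bottom + stage[1]
        else:          # crossed connection: the two paths swap over
            top, bottom = bottom + stage[2], top + stage[3]
    return int(top < bottom)

chip_a, chip_b = make_chip(), make_chip()
challenge = rng.integers(0, 2, size=K)
# The same challenge usually yields different bits on different chips.
print(arbiter_response(chip_a, challenge), arbiter_response(chip_b, challenge))
```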
The arbiter circuit is normally designed using a set-reset latch, wherein the
latch is constructed using two cross-coupled gates as shown in Fig. 2.7. The
operation principles of this circuit are as follows: when both inputs are low, the
output will also be low; if only one input goes high, the output will change
accordingly and become locked, such that subsequent changes on the other input do not
affect the status of the output. In this example, if the signal on input (In 1) goes
high, then the output will go high and remain high; if the signal on input (In 2) goes
high, then the output will remain low. If the signals on both inputs go high within a
short period of each other, the output will assume a value based on the
signal that arrives first (i.e. "1" if (In 1) is first to arrive, "0" if (In 2) is first to
arrive). However, if both signals arrive within a very short period, the output may
enter a metastable state for an indefinite amount of time, more precisely, until the
output of the top gate (G1) crosses the threshold voltage of the bottom gate (G2), in
which case the arbiter output becomes high; it is also possible that the output of the
bottom gate (G2) crosses the threshold voltage of the top gate first, in which case
the output of the arbiter remains low indefinitely.
Assuming metastability happens at (t = 0), the output voltage of (G1) at time
(t) will be given as [32]:

$$V(t) = V_0 \, e^{t/\tau} \qquad (2.1)$$
where τ is a technology parameter that depends only on the circuit's characteristics.
Metastability ends approximately when the output voltage of one of the two gates (G1
or G2) reaches the threshold voltage of the NAND gate; by substituting V(t) with Vth
in the previous equation, we get:
$$t = \tau \ln\left(\frac{V_{th}}{V_0}\right) \qquad (2.2)$$
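As a worked illustration of Eq. (2.2), with assumed values τ = 50 ps, V0 = 1 mV and Vth = 0.4 V (hypothetical figures, chosen only to show the order of magnitude):

```python
import math

tau, v0, vth = 50e-12, 1e-3, 0.4   # assumed values, not from the text
t = tau * math.log(vth / v0)       # Eq. (2.2)
print(t)                           # ~3.0e-10 s, i.e. roughly 300 ps
```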
$$CRP = 2^{k} \qquad (2.4)$$

where (k) is the number of challenge bits (i.e. the number of switching stages).
The arbiter structure proposed in the above design was based on a
transparent latch; however, as reported by the authors, the asymmetric nature of
such a latch can greatly affect the predictability of the response; more specifically,
the arbiter tends always to favour one path over the other, which meant that 90% of
the responses were "0" [1]. Techniques proposed to tackle this problem by system-
atically changing the delay of logic paths may reduce its effects,
but they make the behaviour of the arbiter PUF inherently more predictable, because
they are based not on physical disorder but on designed variations.

Fig. 2.8 Structure of a two-bit challenge arbiter PUF: a the challenge is "01", b the challenge is "00"
The difficulty in achieving a perfectly symmetric design is greatly increased by
the fact that layouts of the vast majority of modern chips are produced using
automated place and route software tools (e.g., designs based on standard cell logic
libraries or those placed on configurable hardware platforms). In these cases, the
degree of freedom a designer has to control the layout is reduced.
Therefore, manual layout tools need to be used for designing arbiter PUFs to
ensure the symmetry of their delay paths.
The original ring oscillator based PUF is made of two multiplexers, two counters,
one comparator and K ring oscillators [33]. Each ring oscillates at a unique fre-
quency depending on the characteristics of each of its inverters; the two multi-
plexers select two ROs to compare. The two counter blocks count the number of
oscillations of each of the two ROs in a fixed time interval. At the end of the
interval, the outputs of the two counters are compared, and depending on which of
the two counters has the higher value, the output of the PUF is set to 0 or 1.
A block diagram of its structure is shown in Fig. 2.9 [33].
Similarly to the arbiter PUF, this design still requires the ring oscillators to have
identical nominal delays; however, it removes the need for an arbitration circuit,
thereby overcoming the problem of metastability, which allows for higher
reliability.
The maximum number of challenge/response pairs (CRP) obtained from an
RO PUF with (k) ring oscillators is given below:

$$CRP = \frac{k(k-1)}{2} \qquad (2.5)$$
In comparison with arbiter-based designs, RO PUFs are larger and consume more
power, but they are less susceptible to metastability-induced errors. Both of these
designs are vulnerable to machine learning attacks, as will be seen later in Chap. 5.
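The following Python sketch models an RO PUF under assumptions similar to the arbiter example (the nominal frequency and mismatch spread are illustrative); it also confirms the CRP count of Eq. (2.5):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
k = 8  # number of ring oscillators (illustrative)

# Each RO gets a chip-specific frequency: a nominal value plus a
# random process-variation offset.
freqs = 100e6 + rng.normal(0.0, 1e6, size=k)

def ro_response(freqs, i, j):
    # The counter attached to the faster ring accumulates more
    # oscillations in the fixed interval, setting the response bit.
    return int(freqs[i] > freqs[j])

pairs = list(combinations(range(k), 2))
print(len(pairs))                          # k*(k-1)/2 = 28, as in Eq. (2.5)
print([ro_response(freqs, i, j) for i, j in pairs[:8]])
```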
This is another variation of delay-based PUFs, proposed recently by
Murphy et al. in [34]; it has the same structure as the generic ring oscillator PUF shown
in Fig. 2.9, but it uses self-timed rings instead of inverter chains. The
design is based on the use of the Muller C-element, a fundamental building block
of asynchronous circuits. Figure 2.10 shows a CMOS implementation of the
C-element based on the use of a weakly inverted output. Its operation principles are
simple: the output is set to logic "1" or "0" when both inputs
assume the logic "1" or logic "0" values, respectively; otherwise, the output retains
its previous value (i.e. remains unchanged).
An example of a three-stage self-timed ring (STR) is shown in Fig. 2.11. Each
stage consists of a C-element and an inverter connected to one input of the
C-element, which is marked R (Reverse); the other input is connected to the pre-
vious stage and is marked F (Forward).
Fig. 2.10 The muller’s C-element: a graphic representation b CMOS implementation c truth
table
To aid the understanding of the behaviour of this structure, we will introduce the
concepts of bubbles and tokens. A stage in the STR is said to contain a "bubble" if
its output is equal to the output of the previous stage, and a "token" if its output is
different from the output of the previous stage.
Unlike inverter chains, self-timed rings do not oscillate unless the following
conditions are satisfied: first, the number of stages (N_STR) should be at least three,
and second, it should be equal to the sum of the bubbles and tokens contained in all the
stages. These conditions are summarised in the equations below.
$$N_{STR} \ge 3 \qquad (2.6)$$

$$N_{STR} = N_B + N_T \qquad (2.7)$$
where N_B ≥ 1 is the number of bubbles and N_T is a positive even number of tokens.
In order to meet these conditions, the inputs of the self-timed ring should be
initialised to the correct values; otherwise, it will not oscillate (i.e. it remains in a
deadlock state).
The main advantage of self-timed rings is that they increase the robustness of the
PUF's responses against environmental variations; this comes at the cost of an
increase in silicon area; moreover, these self-timed structures are prone
to entering deadlock states.
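A minimal Python sketch of these rules follows (the stage outputs are illustrative); it models the C-element's hold behaviour and checks the oscillation conditions of Eqs. (2.6) and (2.7):

```python
def c_element(a, b, prev):
    """Muller C-element: the output follows the inputs when they agree,
    otherwise it holds its previous value."""
    return a if a == b else prev

def str_oscillates(outputs):
    """Check the oscillation conditions of a self-timed ring.

    A stage holds a bubble if its output equals that of the previous
    stage, and a token otherwise; the ring needs at least three stages,
    at least one bubble and a positive even number of tokens.
    """
    n = len(outputs)
    tokens = sum(outputs[i] != outputs[i - 1] for i in range(n))
    bubbles = n - tokens
    return n >= 3 and bubbles >= 1 and tokens >= 2 and tokens % 2 == 0

print(str_oscillates([0, 1, 0]))  # True: two tokens, one bubble
print(str_oscillates([0, 0, 0]))  # False: no tokens, i.e. deadlock
```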
One of the earliest examples of this type of PUF can be found in [35], where the
authors presented a MOSFET-based architecture. The main motivation behind this
design is to harness the exponential dependency of the current in the sub-threshold
region on the threshold voltage (Vth) and gate-to-source voltage (VGS), in order to
increase the unpredictability of the PUF's behaviour. A conceptual architecture of
this design is shown in Fig. 2.12.
The operation principles of this design are as follows: the challenge bits are
applied simultaneously to the inputs of two identically sized arrays of transistors;
each challenge selects a number of transistors to be connected to the output of each
array. The two outputs are then compared to generate a binary response. The
number of challenge/response pairs for this design is given below:

$$CRP = 2^{k \cdot n} \qquad (2.8)$$

where k and n are the number of columns and rows in the transistor arrays, respectively.
There are a number of ways the transistor arrays can be designed; the architecture
proposed in [35] suffers from a number of shortcomings, including the low
output voltage level of the array, which makes it harder to design a suitable comparator.
An improved version of the array design was proposed in [36]; a simplified block
diagram of it is shown in Fig. 2.13.
The array in Fig. 2.13 consists of k columns and n rows of a unit cell (as
highlighted in red in Fig. 2.13). Each unit consists of two parts, and each part
consists of two transistors connected in parallel. One of them has a minimally sized
length in order to maximise the variability in its threshold voltage; we call it the
"stochastic" transistor (e.g. N11x). The second transistor acts as a switch, either to
remove the impact of the stochastic transistor when it is ON, or to include it when it is
OFF (e.g. N11); we call it the "switch" transistor.
Each bit of the challenge is applied to an NMOS unit cell and its symmetric
PMOS cell; take, for example, the challenge bit (c11), which is applied to the two
units highlighted in red and green in Fig. 2.13.
If the challenge bit is logic "1", the "switch" transistors in the NMOS cell will be
ON and those in the PMOS cell will be OFF. On the other hand, there is always a
contribution from a stochastic transistor regardless of whether the challenge bit is '0' or
'1', because the non-inverted and inverted versions of each input bit are each connected
to a stochastic transistor; therefore, regardless of the challenge bit, one of the two
stochastic transistors will be selected and become part of the network. This architecture
is therefore known as "Two Chooses One" or "TCO".
The output of each array is dependent on the accumulated current flowing
through the network of transistors.
The inherent intra-die variations ensure that the voltage output of one of the arrays is
slightly higher than the other's; a conventional dynamic comparator (e.g. an
Op-Amp-based comparator) can capture such a difference.
There are a number of considerations that need to be taken into account when
building a current-based PUF such as the one shown above. First, proper biasing needs
to be applied to Vgn and Vgp to ensure the stochastic transistors always operate in
the sub-threshold region. Second, the sub-threshold current of the "switch" transistors
should be negligible, and they should provide a small ON-state resistance. As
a rule of thumb, the dimensions (width and length) of the switch transistors should
be 10 times those of their corresponding stochastic transistors. Finally, the dif-
ference between the output voltages of the two arrays should be large enough for
the comparator to detect; otherwise, the behaviour of the device will be unreliable.
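The comparison of accumulated sub-threshold currents can be sketched in Python as follows (the array sizes, bias point and threshold-voltage spread are all assumed figures, and the model is deliberately simplified):

```python
import numpy as np

rng = np.random.default_rng(2)
k, n = 8, 4                      # columns and rows per array (assumed)
VT, m, I0 = 0.026, 1.3, 1e-7     # thermal voltage, slope factor, scale

# Each unit cell holds two "stochastic" transistors; the challenge bit
# picks which of the pair conducts ("two chooses one").  Mismatch is
# modelled as Gaussian threshold-voltage variation.
vth = 0.45 + rng.normal(0.0, 0.03, size=(2, k * n, 2))

def tco_response(vth, challenge, vgs=0.30):
    cells = np.arange(k * n)
    # Sub-threshold current grows exponentially as Vth falls below VGS.
    i_a = I0 * np.exp((vgs - vth[0, cells, challenge]) / (m * VT))
    i_b = I0 * np.exp((vgs - vth[1, cells, challenge]) / (m * VT))
    # The comparator resolves which array draws more total current.
    return int(i_a.sum() > i_b.sum())

challenge = rng.integers(0, 2, size=k * n)
print(tco_response(vth, challenge))
```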
(1) A region of the memory is chosen to produce the PUF response; it is defined
by its starting address and size
(2) The refresh functionality is disabled for this reserved region
(3) An initial value is written into this region
(4) Access is disabled to all cells in the reserved region for a chosen period of time
(t), during which the charge in each cell decays at a unique rate proportional to
its leakage current
(5) Once the decay time has expired, the contents of the reserved region are read as
normal to produce the response of the PUF
(6) Normal operation is then resumed and the reserved region is made available
again to the operating system.
$$CRP = R \cdot N \qquad (2.9)$$
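The decay mechanism behind this procedure can be sketched in a few lines of Python (the region size, leakage distribution and sense threshold are assumptions made for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n_cells = 256  # size of the reserved region (assumed)

# Each cell leaks at a device-specific rate, fixed at fabrication.
leak = rng.lognormal(mean=0.0, sigma=0.5, size=n_cells)

def dram_puf_response(leak, t, v_init=1.0, v_ref=0.5):
    """Read the reserved region after refresh has been off for time t."""
    v = v_init * np.exp(-leak * t)      # remaining charge per cell
    return (v > v_ref).astype(int)      # sense-amplifier decision

print(dram_puf_response(leak, t=0.5)[:16])
```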
This type of design transforms process variations into a measurable voltage
figure; we are going to study two examples: the first is based on the use of static
random access memories (SRAM) and the second on the use of SR latches.
PUFs based on static random access memories (SRAM) are among the earliest
designs to appear in this category [38]; they were initially proposed to
secure FPGA designs by generating a device-specific encryption key to scramble
the bit stream before storing it in an external memory. Such an approach protects
against a threat model wherein an adversary is able to extract the bit stream by
analysing the signal on the cable used to load the design; the use of an SRAM PUF
prevents such an adversary from re-using the same bit stream to program other
FPGAs.
We are now going to discuss in more detail the principles of an SRAM PUF.
Figure 2.15 shows a typical SRAM cell; the latter is composed of two
cross-coupled inverters (P1, N1, P2, N2) and two N-type access transistors (N3,
N4). The two inverters have two stable states, logic '1' and logic '0', which can be
accessed through N3 and N4 via two bit lines, namely 'bit' and 'bit_bar' (the
complement of 'bit'). Each inverter drives one of the two state nodes, Q or Q′. The
access operation is controlled by a single word line, labelled WL.
When this cell is powered up, the two cross-coupled inverters enter a "power
struggle"; the winner will ultimately be decided by the difference in the driving
strengths of the MOSFETs in the cross-coupled inverters. Essentially, the SRAM
cell has three possible states: two of these are "stable" and the third is "metastable",
as shown in Fig. 2.16. If the transistors in the cross-coupled inverter circuits were
perfectly matched, then the SRAM might remain in a metastable state "forever" when
first powered up. In reality, although those transistors are designed to have identical
nominal sizes, random variations in the silicon manufacturing process ensure that
one inverter has a stronger driving current than the other; this helps
define the initial start-up value of the cell.
The majority of SRAM cells have a preferred cell-specific initial state, which they
consistently assume when powered up; this characteristic of SRAM memories
allows them to be used for PUF constructions. The number of challenge/response
pairs obtained from an SRAM-based PUF is proportional to the size of the memory;
in this case, the challenge is the read address of the memory and the
response is the start-up values of the addressed cells. For example, a 64 megabit
byte-addressable memory yields 8 mega challenge/response pairs (one per byte address).
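A common way to model the power-up behaviour is a fixed per-cell mismatch plus a small amount of power-up noise; the sign of their sum decides which inverter wins. The following Python sketch uses assumed noise figures purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
n_cells = 1024

# Static mismatch is fixed at fabrication and dominates; thermal noise
# at power-up only flips the weakest (best-matched) cells.
mismatch = rng.normal(0.0, 1.0, size=n_cells)

def power_up(mismatch, noise_sigma=0.1):
    noise = rng.normal(0.0, noise_sigma, size=mismatch.size)
    return (mismatch + noise > 0).astype(int)

r1, r2 = power_up(mismatch), power_up(mismatch)
print(np.mean(r1 != r2))  # fraction of unstable cells between power-ups
```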
Definition 2.1

(a) Hamming Distance: The Hamming distance d(a, b) between two words
a = (a_i) and b = (b_i) of length n is defined to be the number of positions
where they differ, that is, the number of indices i such that a_i ≠ b_i.
(b) Hamming Weight: Let 0 denote the zero word 00…0. The Hamming
weight HW(a) of a word a = (a_i) is defined to be d(a, 0), the number of
symbols a_i ≠ 0 in a.
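Definition 2.1 translates directly into code; a minimal Python version (mirroring, not reproducing, the MATLAB scripts mentioned in the preface) is:

```python
def hamming_distance(a, b):
    """Number of positions where two equal-length words differ."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))

def hamming_weight(a):
    """Number of non-zero symbols, i.e. the distance d(a, 0)."""
    return sum(x != 0 for x in a)

print(hamming_distance([0, 1, 1, 1, 0, 1], [0, 1, 0, 1, 0, 0]))  # 2
print(hamming_weight([0, 1, 1, 1, 0, 1]))                        # 4
```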
2.9.1 Uniqueness
Take for example the two instances of a PUF depicted in Fig. 2.18, which are
assumed to be implemented on two different chips. When a challenge (011101) is
applied to both circuits, each PUF outputs a distinct response. The Hamming dis-
tance between these is 2, meaning 25% of the total response bits differ.
Ideally, uniqueness should be close to 50%.
2.9.2 Reliability
From the intra-chip HD value, the reliability of a PUF can be defined as:
Take for example the PUF circuit depicted in Fig. 2.19. When a challenge
(011101) is applied to this circuit at two different ambient temperatures, we would
ideally expect the Hamming distance between the responses to be 0; however, as can
be seen from Fig. 2.19, the Hamming distance in this case is 1 (i.e. a 10% difference
from the original response), which indicates that this is not a very reliable PUF
design.
2.9.3 Uniformity
$$\text{Uniformity} = \frac{1}{k}\sum_{i=1}^{k} r_i \times 100\% \qquad (2.13)$$

where k is the total number of responses and r_i is the Hamming weight of the i-th
response.
where:
CRP is the total number of challenge/response pairs.
Ri(l) and Rj(l) are the responses of the authentic and the tampered-with chip,
respectively, for a specific challenge (l).
If the PUF is tamper-resistant against a specific physical modification, then the
metric above should be 50%, i.e. the modified PUF has a distinctly different
behaviour from that of the original PUF.
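The three metrics discussed in this section can be computed from response matrices, as in the Python sketch below (a simplified illustration; the input layouts are assumptions, not the book's MATLAB interface):

```python
import numpy as np
from itertools import combinations

def uniqueness(responses):
    """responses: (chips, bits) array of responses to one challenge.
    Returns the average pairwise inter-chip HD as a percentage."""
    hd = [np.mean(responses[i] != responses[j])
          for i, j in combinations(range(len(responses)), 2)]
    return 100.0 * np.mean(hd)

def reliability(reference, samples):
    """reference: (bits,); samples: (trials, bits) from the same chip
    under varying conditions.  100% means perfectly stable responses."""
    hd = np.mean(samples != reference, axis=1)  # intra-chip HD per trial
    return 100.0 * (1.0 - np.mean(hd))

def uniformity(responses):
    """Average normalised Hamming weight over k responses, in percent
    (Eq. 2.13 with r_i taken as a fraction of the response length)."""
    return 100.0 * np.mean(responses)
```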
This section discusses the design process of a delay-based PUF, from the design
rationale, through its detailed structure and silicon implementation, to its hardware
testing and evaluation.
The top-level architecture of the proposed design is the same as that shown in Fig. 2.9
above; however, its distinctive feature is the reconfigurable nature of the ring
oscillator chains, which allows a significant increase in the number of
challenge/response pairs. The proposed PUF is composed of K ring oscillators,
N multiplexers and a delay module. An implementation example is shown in
Fig. 2.20, wherein (K = 4, N = 4). The number of inverters in each ring oscillator
(i.e. the number of columns) is equal to N + 1 (taking into consideration the NAND gate).
The operation principles of the above design are as follows: the challenge is
applied at the SEL inputs; the multiplexers then select a specific physical path for
each ring oscillator, which generates two unique configurations of the ring oscil-
lators, whose outputs are compared using the counters and comparator logic;
a unique response is then generated. The delays of the physical paths of the
two selected ring oscillators must be perfectly balanced; otherwise, the PUF
response will be determined not by the process-variation-induced differences in the
delay, but rather by routing biases. This issue can be resolved by using hard
macros and custom layout techniques as shown in [24, 45]; however, these methods
may be difficult to replicate and are only applicable to specific types of FPGA
architectures.
In this design, minimal routing constraints are imposed, because the
symmetric routing requirements of the inverter chains are satisfied using configurable
delay elements, as shown in Fig. 2.20.
The use of such a configurable architecture allows a significant increase in the
number of CRPs: at every multiplexer stage, two inverters are selected out of
(K·(K−1)/2) possible choices. For an N-stage design, there will therefore be
(K·(K−1)/2)^N possible physical configurations of the two ring oscillators, whose
outputs are to be compared. Each of these generates a unique response. Therefore,
the number of challenge/response pairs that can be achieved is given in Eq. (2.15):
N
K:ðK 1Þ
CRPs ¼ ð2:15Þ
2
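As a quick numerical check of Eq. (2.15), the snippet below counts the CRPs; K = 4 matches the running example, while the value of K behind Fig. 2.22 is an assumption made here for illustration.

def crp_count(K, N):
    # number of challenge/response pairs of the configurable RO PUF, Eq. (2.15)
    return (K * (K - 1) // 2) ** N

print(crp_count(4, 4))    # K = 4, N = 4: 6**4 = 1296
print(crp_count(4, 14))   # 14 multiplexing stages: about 7.8e10, i.e. over a billion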
This design offers a significantly larger number of CRPs compared to the traditional design, as shown in Fig. 2.21. The comparison was performed for a PUF with N = 4 stages and different numbers of ring oscillators (i.e. K).
Figure 2.22 depicts the number of CRPs which can be obtained as a function of the number of multiplexing stages (N) for the configurable RO PUF; it can be seen that for a 14-stage design, one can obtain more than one billion CRPs.
Fig. 2.22 The number of CRPs versus the number of multiplexing stages (N) for the configurable
RO PUF
Fig. 2.24 The evaluation of the PUF uniqueness using normalized inter-chips hamming distance
Fig. 2.25 The evaluation of the PUF reliability using normalized intra-chip hamming distance
this metric; standard error correction algorithms can easily improve this figure. The use of such algorithms to enhance the reliability of PUF designs will be discussed in greater depth in Chap. 4.
To evaluate the uniformity metric, the Hamming weight of responses from the same device to different challenges has been estimated. The results, depicted in Fig. 2.26, indicate that the design under consideration has a uniformity of around 53%, which is very close to the ideal value; this indicates that the design is highly unpredictable, hence more resilient to machine learning modelling attacks.
Fig. 2.26 The evaluation of the PUF uniformity using normalized hamming weight
ensure that the PUF is able to consistently generate the same response for a chosen challenge; in this case, the osc collapse design [46] in Table 2.1 is the best choice, as it has the most stable response.
On the other hand, if the PUF is to be used for the authentication of energy-constrained devices, then uniqueness, reliability and energy efficiency need to be considered; in this case, INV_PUF [47] can provide an acceptable trade-off, as it has the lowest energy consumption and an acceptable reliability figure. More comparison data on existing implementations of PUF architectures can be found in [48].
2.12 Conclusions
• A PUF circuit typically consists of two stages: the first stage converts inherent process variation into a measurable quantity (e.g. voltage, current or delay), which is then converted into a binary response in the second stage.
• The technology with which the PUF circuit is to be fabricated needs to have sufficient process variations; otherwise, the difference between silicon chips is going to be too small to generate unique identifiers.
• The PUF circuit should generate consistent responses under different environmental conditions; this is measured using the reliability metric, which is calculated using the average Hamming distance between responses from the same chip to a chosen challenge under various environmental conditions. The ideal value for the reliability metric is “0”.
• The PUF circuit should generate a unique response for each chip for the same challenge; this is measured using the uniqueness metric, which is calculated using the average Hamming distance between responses from different chips to the same challenge. The ideal value for the uniqueness metric is “0.5”.
• The PUF circuit should produce unpredictable responses; one of the indicators
of unpredictability is the average Hamming weight of its responses that should
ideally be 0.5.
2.13 Problems
2:1. Figure 2.27 shows a synchronous Arbiter PUF circuit, wherein the arbiter block is realised using the structure shown in Fig. 2.7. This PUF design is implemented in 130 nm technology, where τ = 66 ps. Due to the very small difference in the delay of the two inverter chains, a metastability event occurs every time a challenge is applied to the input of this circuit. You are given the following information: the clock frequency is 100 MHz, the voltage at the output of the gate G1 (in Fig. 2.7) in the arbiter block is V0 = 0.2 V, and its threshold voltage is Vth = 0.6 V.
2:2. How many stages of an arbiter PUF, based on the structure shown in Fig. 2.8, do you need to obtain a number of challenge/response pairs equivalent to that of a generic RO PUF (see Fig. 2.9) which has 10 ring oscillators?
2:3. Self-timed rings may not oscillate unless their initial state is set up correctly. One way to avoid a deadlock state in the STR PUF design is to store valid initial states as bit patterns in an on-chip non-volatile memory; these can be used to initiate the oscillation. What are the bit patterns a designer would need to store to ensure the oscillation of a four-stage self-timed ring that has the same structure as the one described in Fig. 2.11?
2:4. A two-choose-one (TCO) PUF is designed to have 512 challenge/response
pairs, assuming the size of its comparator circuit is equivalent to 6 transistors.
2:5. A DRAM memory, which has 64 cells arranged in 16 rows and 4 columns, is
used to design a physically unclonable function as described in Sect. 2.7,
where the size of the reserved region is 12 cells. Assuming reserved regions
can only be defined using three adjacent rows:
2:6. A PUF circuit, which has a 3-bit challenge and 5-bit responses, is implemented on four chips. The challenge/response behaviour of this design is summarised in Table 2.2.
2:7. A PUF circuit, which has a 3-bit challenge and 8-bit responses, was implemented and given to Alice to use as an access authentication device. Steve, a known adversary of Alice, managed to gain access and maliciously changed the PUF circuit by inserting a small passive component, which works as an antenna; this has led to some changes in the behaviour of the original PUF.
Table 2.3 Challenge/response behaviour of the authentic versus the “tampered with” PUF design

Challenge   Original chip   “Tampered with” chip
000         11010000        11010001
001         11000010        11000000
010         10011000        10011001
011         11000000        11101001
100         10000111        11001111
101         11111110        11011000
110         10000111        01000101
111         11101001        11101001
The changes are summarised in Table 2.3. Analyse the data provided and explain whether or not Alice’s PUF can be considered resistant to Steve’s tampering attack.
References
1. L. Daihyun, J.W. Lee, B. Gassend, G.E. Suh, M.V. Dijk, S. Devadas, Extracting secret keys
from integrated circuits. IEEE Trans. Very Large Scale Integr. VLSI Syst. 13, 1200–1205
(2005)
2. T. Fournel, M. Hébert, Towards weak optical PUFs by random spectral mixing, in 2016 15th
Workshop on Information Optics (WIO) (2016), pp. 1–3
3. S. Dolev, L. Krzywiecki, N. Panwar, M. Segal, Optical PUF for non forwardable vehicle
authentication, in 2015 IEEE 14th International Symposium on Network Computing and
Applications (2015), pp. 204–207
4. D.R. Reising, M.A. Temple, J.A. Jackson, Authorized and Rogue device discrimination using
dimensionally reduced RF-DNA fingerprints. IEEE Trans. Inf. Forensics Secur. 10, 1180–
1192 (2015)
5. M.W. Lukacs, A.J. Zeqolari, P.J. Collins, M.A. Temple, RF-DNA fingerprinting for antenna
classification. IEEE Antennas Wirel. Propag. Lett. 14, 1455–1458 (2015)
6. W.E. Cobb, E.D. Laspe, R.O. Baldwin, M.A. Temple, Y.C. Kim, Intrinsic physical-layer
authentication of integrated circuits. IEEE Trans. Inf. Forensics Secur. 7, 14–24 (2012)
7. J.D.R. Buchanan, R.P. Cowburn, A.-V. Jausovec, D. Petit, P. Seem, G. Xiong et al., Forgery: ‘Fingerprinting’ documents and packaging. Nature 436, 475 (2005)
8. S. Nassif, K. Bernstein, D.J. Frank, A. Gattiker, W. Haensch, B.L. Ji et al., High performance
CMOS variability in the 65 nm regime and beyond, in IEEE International Electron Devices
Meeting (2007), pp. 569–571
9. A. Narasimhan, R. Sridhar, Impact of variability on clock skew in H-tree clock networks, in
International Symposium on Quality Electronic Design (2007), pp. 458–466
10. S.R. Nassif, Modeling and analysis of manufacturing variations, in IEEE Conference on
Custom Integrated Circuits (2001), pp. 223–228
11. A. Chandrakasan, W.J. Bowhill, Design of High-Performance Microprocessor Circuits
(Wiley-IEEE Press, 2000)
12. International Technology Roadmap for Semiconductors (www.itrs.net).
13. B.P. Wong, A. Mittal, Z. Gau, G. Starr, Nano-CMOS Circuits and Physical Design (Wiley,
Hoboken, New Jersey, 2005)
14. D. Marculescu, S. Nassif, Design variability: challenges and solutions at
microarchitecture-architecture level, in Design, Automation and Test Conference in Europe
(2008)
15. Wikipedia. Available: https://en.wikipedia.org
16. S. Graybeal, P. McFate, Getting out of the STARTing block. Sci. Am. 261 (1989)
17. B. Halak, A. Yakovlev, Fault-tolerant techniques to minimize the impact of crosstalk on phase
encoded communication channels. Comput. IEEE Trans. 57, 505–519 (2008)
18. I.M. Nawi, B. Halak, M. Zwolinski, The influence of hysteresis voltage on single event
transients in a 65 nm CMOS high speed comparator, in 2016 21th IEEE European Test
Symposium (ETS) (2016), pp. 1–2
19. I.M. Nawi, B. Halak, M. Zwolinski, Reliability analysis of comparators, in 2015 11th
Conference on Ph.D. Research in Microelectronics and Electronics (PRIME) (2015),
pp. 9–12
20. Y. Zhang, P. Wang, G. Li, H. Qian, X. Zheng, Design of power-up and arbiter hybrid physical
unclonable functions in 65 nm CMOS, in 2015 IEEE 11th International Conference on ASIC
(ASICON) (2015), pp. 1–4
21. L. Lin, S. Srivathsa, D.K. Krishnappa, P. Shabadi, W. Burleson, Design and validation of
arbiter-based PUFs for sub-45-nm low-power security applications. IEEE Trans. Inf.
Forensics Secur. 7, 1394–1403 (2012)
22. Y. Hori, T. Yoshida, T. Katashita, A. Satoh, Quantitative and statistical performance
evaluation of arbiter physical unclonable functions on FPGAs, in 2010 International
Conference on Reconfigurable Computing and FPGAs (2010), pp. 298–303
23. C.E.D. Yin, G. Qu, LISA: maximizing RO PUF’s secret extraction, in 2010 IEEE
International Symposium on Hardware-Oriented Security and Trust (HOST) (2010), pp. 100–
105
24. X. Xin, J.P. Kaps, K. Gaj, A configurable ring-oscillator-based PUF for Xilinx FPGAs, in
2011 14th Euromicro Conference on Digital System Design (2011), pp. 651–657
25. S. Eiroa, I. Baturone, An analysis of ring oscillator PUF behavior on FPGAs, in 2011
International Conference on Field-Programmable Technology (2011), pp. 1–4
26. S. Eiroa, I. Baturone, Circuit authentication based on Ring-Oscillator PUFs, in 2011 18th
IEEE International Conference on Electronics, Circuits, and Systems (2011), pp. 691–694
27. H. Yu, P.H.W. Leong, Q. Xu, An FPGA chip identification generator using configurable ring
oscillators. IEEE Trans. Very Large Scale Integr. VLSI Syst. 20, 2198–2207 (2012)
28. M. Delavar, S. Mirzakuchaki, J. Mohajeri, A Ring Oscillator-based PUF with enhanced
challenge-response pairs. Can. J. Electr. Comput. Eng. 39, 174–180 (2016)
29. A. Cherkaoui, L. Bossuet, C. Marchand, Design, evaluation, and optimization of physical
unclonable functions based on transient effect ring oscillators. IEEE Trans. Inf. Forensics
Secur. 11, 1291–1305 (2016)
30. L. Bossuet, X.T. Ngo, Z. Cherif, V. Fischer, A PUF based on a transient effect ring oscillator
and insensitive to locking phenomenon. IEEE Trans. Emerg. Top. Comput. 2, 30–36 (2014)
31. B. Halak, Y. Hu, M.S. Mispan, Area efficient configurable physical unclonable functions for
FPGAs identification, in 2015 IEEE International Symposium on Circuits and Systems
(ISCAS) (2015), pp. 946–949
32. D.J. Kinniment, Synchronization and Arbitration in Digital Systems (2007)
33. G.E. Suh, S. Devadas, Physical unclonable functions for device authentication and secret key
generation, in 2007 44th ACM/IEEE Design Automation Conference (2007), pp. 9–14
34. J. Murphy, M.O. Neill, F. Burns, A. Bystrov, A. Yakovlev, B. Halak, Self-timed physically
unclonable functions, in 2012 5th International Conference on New Technologies, Mobility
and Security (NTMS) (2012), pp. 1–5
3.1 Introduction
The aims of this chapter are as follows:
1. Explain the physical origins of the major reliability issues affecting CMOS
technology.
2. Discuss how these issues can affect the usability of PUF technology.
3. Present a case study on the evaluation of the impact of CMOS aging on the
quality metrics of PUF designs.
It is hoped that this chapter will give the reader an in-depth understanding of the
reliability challenges in nano-scale CMOS technologies and their effects on the
usability of silicon-based PUF designs.
The organisation of this chapter is as follows: Sect. 3.3 explains in depth the physical mechanisms of CMOS aging and discusses the susceptibility of PUF designs to the different forms of aging-related circuit degradation. Section 3.4 explains the causes of typical temporal failure mechanisms, including radiation hits, electromagnetic interference, thermal noise and ground bounce; this is followed by a discussion on how these phenomena affect the different types of PUF architectures. Section 3.5 presents a case study to illustrate how the quality metrics of a PUF design can be assessed given a specific reliability challenge; the study focuses on the impact of BTI aging on three different PUF architectures, namely: SRAM, TCO and Arbiter. Conclusions and lessons learned are presented in Sect. 3.6. Finally, a set of problems and exercises is included in Sect. 3.7.
Fig. 3.1 NBTI Aging based on R–D mechanism: a fresh PMOS, b stress phase, c recovery phase
where
Cox is the oxide capacitance;
t is the operating time;
α is the fraction of the operating time during which a MOS transistor is under a stress condition; it has a value between 0 and 1: α = 0 if the MOS transistor is always OFF (recovery phase), while α = 1 if it is always ON (stress phase);
Ea is the activation energy (Ea ≅ 0.1 eV);
kB is the Boltzmann constant;
TA is the aging temperature;
χ is a coefficient that distinguishes between PBTI and NBTI: χ equals 0.5 for PBTI and 1 for NBTI;
K lumps together technology-specific and environmental parameters; K has been estimated to be 2.7 V^(1/2) F^(1/2) s^(1/6) by fitting the model to the experimental results reported in [7] for a 32 nm high-k CMOS technology.
Aging induced by the HCI mechanism is caused by the generation of traps in the gate oxide. These are created when channel current carriers are accelerated to a sufficient energy to cross into the oxide; thus, it mostly happens when the transistor is pinched off [11]. Additionally, HCI mainly occurs in NMOS transistors as opposed to PMOS ones, since electrons have a greater mobility than holes and can therefore be accelerated to higher energies. Unlike BTI, the traps produced by HCI are permanent; therefore, the recovery phase is negligible. Figure 3.2 illustrates the HCI mechanism in an NMOS transistor.
where
c is a process-dependent constant;
Ids is the drain-to-source current;
W is the transistor width;
φ_it,e is the critical energy needed to overcome the oxide interface barrier;
E is the electric field at the drain;
λe is the hot-electron mean free path;
t is the fraction of the operating time during which a MOS transistor is under a stress condition.
At the initial stage of the interface-trap generation process, the rate is reaction-limited; therefore, N_it(t) ∝ t and n = 1. At a later stage, the generation is diffusion-limited (n = 0.5).
Voltage drops across the gate oxide of CMOS transistors lead to the creation of traps within the dielectrics; such defects may eventually join together to form a conductive path through the stack, which causes an oxide breakdown and hence a device failure [12]. This mechanism is a particular concern in nanometre-range technologies, because of the reduction in the critical density of traps needed to form a conducting path through these thin layers, in combination with the stronger electric field across the dielectrics.
3.3.4 Electromigration
This is one of the major reliability issues that especially affect the integrity of on-chip communication schemes.

3.3.5 Crosstalk
Crosstalk is defined as an undesirable electromagnetic coupling between switching and non-switching signal lines [24]. Failure mechanisms caused by crosstalk in on-chip communication links can be classified into two types, as follows [25]:
(a) Functional failures are caused by static noise, defined as the noise pulse induced on a quiet victim net due to the switching of neighbouring aggressors. If this pulse propagates through logic gates and reaches storage elements, it may lead to sampling the wrong data.
(b) Timing failures occur when the delay of a communication bus increases, which leads to a timing violation and thus an error. The increase in delay is caused by a change in the coupling capacitance between adjacent wires due to the switching activity of neighbouring nets (i.e. the Miller effect).
v̄² = 4·kB·T·R   (3.3)

(the mean-square thermal noise voltage per unit bandwidth of a resistor R at absolute temperature T)
The advances in the fabrication process of CMOS technologies, which allow the construction of densely integrated circuits, have aggravated some of their inherent reliability challenges. This may lead to degradation in the quality metrics of silicon-based unclonable functions. The magnitude of such degradation, and how detrimental it is going to be to the usability of the PUF, is very much dependent on the nature of the systems incorporating the PUF circuit. For example, in high-speed 3D-IC chips, electromagnetic interference and thermal noise are likely to be the largest sources of soft errors. On the other hand, in low power
embedded systems, electromagnetic noise may be less of an issue compared to
ground bounce induced by switching activities of power gating structures. The type
of PUF design also greatly affects its susceptibility to certain physical mechanisms of soft errors. For example, an Arbiter PUF is generally more vulnerable to crosstalk-induced errors than memory-based designs (e.g. the SRAM PUF). This is because temporal electromagnetic coupling can increase the delay of one of the logic paths of an arbiter PUF, which may cause the latter to produce a different response from what it would have generated without crosstalk noise.
In the case of SRAM PUFs, which are based on the use of the start-up values of the cells when the memory is powered up, electromagnetic interference is less likely to be a major cause of soft errors.
This section presents a case study on the evaluation of aging impact on the quality
metrics of PUF designs.
Modelling of aging effects is divided into two stages: the first is creating the aging-induced degradation model; the second is integrating such a model into simulation tools. The R–D (reaction–diffusion) and the “lucky electron” models are used to model BTI and HCI, respectively, as explained in the sections above. The use of these models requires the extraction (i.e., through a curve-fitting process) of some technology-dependent parameters of their analytical models from burn-in experiments. The electrical parameters of the degraded devices are updated and integrated in their simulation models.
Automated reliability tools like Cadence RelXpert [41] and Synopsys HSPICE MOSRA [42] apply similar processes with two stages of simulation. The first stage is called pre-stress, where the tool computes the degraded electrical parameters of aged MOS transistors in the circuit based on the circuit behaviour and on the built-in stress model, including BTI and HCI effects. The second stage is called post-stress, where the tool evaluates the impact of the degradation on the properties of the circuit (e.g. performance, leakage power, etc.).
Simulation Tip: The same process as above can be used to evaluate the aging impact on PUF designs. Access to the above aging tools is not necessarily required to carry out a first-order analysis; for example, the reader can compute the drift in the threshold voltages due to BTI, as given in Eq. (3.1), for a 32 nm technology. The newly computed threshold voltage should then be used to update the SPICE models in the 32 nm technology library files. The aged models can then be used to build an aged PUF design for evaluation. A conceptual diagram of the aging estimation methodology is shown in Fig. 3.5.
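Because Eq. (3.1) itself is not reproduced in this extract, the sketch below uses an assumed long-term BTI power-law drift of the form ΔVth = χ·K·exp(−Ea/(kB·TA))·(α·t)^(1/6), built only from the parameters listed for Eq. (3.1); the value of K used here is invented, and the constant and exponent must be taken from the actual fitted model before any real use.

import math

K_B = 8.617e-5  # Boltzmann constant in eV/K

def bti_vth_drift(chi, K, alpha, t_sec, Ea=0.1, T_aging=300.0):
    # Illustrative stand-in for Eq. (3.1), NOT the book's exact model:
    #   dVth = chi * K * exp(-Ea/(kB*T)) * (alpha*t)**(1/6)
    # chi = 1 for NBTI, 0.5 for PBTI; alpha = stress fraction of the lifetime.
    # Any oxide-capacitance/voltage dependence of the real model is folded into K.
    return chi * K * math.exp(-Ea / (K_B * T_aging)) * (alpha * t_sec) ** (1 / 6)

ten_years = 10 * 365 * 24 * 3600
print(bti_vth_drift(chi=1.0, K=1e-3, alpha=0.2, t_sec=ten_years))

The resulting shift can then be written back into the library's threshold parameter (for instance through HSPICE's DELVTO instance parameter) to obtain the “aged” models the tip refers to.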
To illustrate how the above method can be applied, we consider three different PUF designs, namely the SRAM, TCO and Arbiter PUFs. These designs are implemented with a 65-nm technology node and the BSIM4 (V4.5) transistor model. The HSPICE MOSRA tools are then used to estimate the degradation in threshold voltage induced by 10 years of BTI aging. The activity factor (α) is assumed to be 20% (see Eq. 3.1); the actual value of this factor is application dependent.
To model the impact of process variation on the physical parameters of CMOS transistors (e.g. effective length, effective width, oxide thickness and threshold voltage), we use Monte Carlo simulations with the built-in standard statistical fabrication variations (3σ variations) of the TSMC 65 nm technology design kit. The threshold voltage variations are assumed to be caused by random dopant fluctuations (RDFs) [43]. For each of the above designs, 100 PUF instances are modelled using Monte Carlo simulation. Finally, the analysis has been done at nominal temperature and operating voltage (25 °C and 1.2 V).
To facilitate the analysis of the PUF metrics for all the above designs, 32-bit responses are considered. To estimate the impact of BTI aging on the reliability of the PUF circuits, identical challenges are applied to both fresh and aged designs; their respective responses are then compared, and in each case the Hamming distance (HD) between the fresh and aged responses is computed. The same experiment is then repeated for the 100 instances of each PUF, and the bit error rate is computed as the average of all computed Hamming distances. The results are depicted in Fig. 3.6.
The TCO PUF seems to be resilient to the BTI aging mechanism; this is mainly due to its differential and symmetric architecture. In other words, the BTI stress patterns (in the form of challenge bits) applied to the stochastic transistors in both arrays are identical; therefore, in principle, the degradation in their electrical parameters should be identical. This effectively means that comparing the current outputs of the two arrays should yield the same result before and after aging. However, there is a second-order effect that may cause errors; in fact, BTI stress may affect the symmetry of the comparator, which makes it favour one input over the other. More information on this can be found in [44].
Fig. 3.6 The impact of BTI aging on the reliability of PUF designs
The Arbiter PUF also has a symmetric structure; this means its devices are likely
to experience symmetrical stress patterns, assuming random challenges are applied
throughout the lifetime.
In the case of SRAM PUF, the impact of aging is more pronounced.
Let us discuss this more closely. In the technology under consideration, the impact of PBTI is insignificant, so NBTI is the dominant aging mechanism. For an SRAM cell, as seen in Fig. 3.7, only the pull-up transistors P1 and P2 suffer from NBTI. The pull-down transistors (N1 and N2) are only affected by PBTI, which causes negligible degradation in their electrical parameters. The P1 and P2 transistors are part of a cross-coupled inverter pair; this means only one of them is under NBTI stress at any time. Such asymmetric stress conditions result in unbalanced threshold voltage degradation of these two transistors. For example, if P1 is placed under greater stress, its threshold voltage degradation will be bigger than that of P2, which will reduce its driving capability; consequently, node Q is less likely to be pulled up to 1 once the cell is powered up.
If the initial start-up value of the above SRAM cell was 0 due to inherent process variation, such asymmetric aging can change this value to “1”. This bit-flipping effect reduces the capability of the SRAM PUF to generate consistent responses based on its power-up values, which explains the results obtained in Fig. 3.6. Similar results are obtained in [45], where the authors employed the static noise margin to quantify the impact of BTI on SRAM.
To estimate the impact of BTI aging on the uniqueness and uniformity metrics, the fresh and aged designs are evaluated separately. A summary of the results is shown in Table 3.1.
The impact of the BTI aging mechanism on the uniqueness metric of the PUF designs under consideration is negligible. Although errors induced by the aging mechanism affect the response of the PUF design, such random flipping of bits is most likely to cancel out when computing the Hamming distance between these responses; therefore, the uniqueness metric remains almost the same. The same rationale can be applied to explain the insignificant degradation of the uniformity metric in the case studied here.
3.7 Problems
3:2. A resistor has a value of R = 5000 Ω and a temperature T = 340 K. What is the magnitude of its thermal noise? Is this likely to cause an error in an SRAM PUF fabricated using a 32 nm technology node? It is assumed that the threshold voltage of the MOS transistors in this technology is 0.53 V.
3:3. A PUF circuit, which has a 3-bit challenge and 5-bit responses, is implemented on four chips. The challenge/response behaviour of this design is summarised in Table 3.2a below. The four chips are then subjected to an accelerated aging process. The challenge/response behaviour of the aged chips is summarised in Table 3.2b.
1. Compute the average bit error rate caused by the aging process.
2. Table 3.3 shows the challenge/response behaviour of the fresh and aged PUFs under different supply voltages. Analyse the given data and explain whether the aging process affects the robustness of the PUF design against fluctuations in the supply voltage.
3:4. Which of the following PUFs are most likely to be affected by crosstalk noise?
3:5. Which of the depicted PUFs in Fig. 3.8 would you recommend as aging
resilient design? Explain your answer.
Fig. 3.8 The PUF structures referred to in Problem 3:5 (ring-oscillator and counter-based designs producing response y)
References
1. V. Ferlet-Cavrois, L.W. Massengill, P. Gouker, Single event transients in digital CMOS—a
review. IEEE Trans. Nucl. Sci. 60, 1767–1790 (2013)
2. M. Agarwal, B.C. Paul, M. Zhang, S. Mitra, Circuit failure prediction and its application to
transistor aging, in VLSI Test Symposium, 2007. 25th IEEE (2007), pp. 277–286
3. S. Bhardwaj, W. Wang, R. Vattikonda, Y. Cao, S. Vrudhula, Predictive modeling of the NBTI
effect for reliable design, in Custom Integrated Circuits Conference, 2006. CICC’06. IEEE
(2006), pp. 189–192
4. R. Vattikonda, W. Wenping, C. Yu, Modeling and minimization of PMOS NBTI effect for
robust nanometer design, in Design Automation Conference, 2006 43rd ACM/IEEE (2006),
pp. 1047–1052
5. D. Rossi, M. Omaña, C. Metra, A. Paccagnella, Impact of bias temperature instability on soft
error susceptibility, in IEEE Transaction on Very Large Scale Integration (VLSI) Systems, vol
23 (2015), pp. 743–751
6. B.C. Paul, K. Kunhyuk, H. Kufluoglu, M.A. Alam, K. Roy, Impact of NBTI on the temporal
performance degradation of digital circuits. IEEE Electron Devices Lett. 26, 560–562 (2005)
7. H.I. Yang, C.T. Chuang, W. Hwang, Impacts of contact resistance and NBTI/PBTI on SRAM with high-κ metal-gate devices, in IEEE International Workshop on Memory Technology, Design, and Testing, 2009. MTDT’09 (2009), pp. 27–30
8. H.K.M.A. Alam, D. Varghese, S. Mahapatra, A comprehensive model for pmos nbti
degradation: recent progress. Microelectron. Reliab. 47, 853–862 (2007)
9. M. Fukui, S. Nakai, H. Miki, S. Tsukiyama, A dependable power grid optimization algorithm
considering NBTI timing degradation, in IEEE 9th International New Circuits and Systems
Conference (NEWCAS) (2011), pp. 370–373
10. K. Joshi, S. Mukhopadhyay, N. Goel, S. Mahapatra, A consistent physical framework for N
and P BTI in HKMG MOSFETs, in Reliability Physics Symposium (IRPS), 2012 IEEE
International (2012), pp. 5A.3.1–5A.3.10
11. X. Li, J. Qin, J.B. Bernstein, Compact modeling of MOSFET wearout mechanisms for
circuit-reliability simulation. IEEE Trans. Devices Mater. Reliab. 8, 98–121 (2008)
12. F. Jianxin, S.S. Sapatnekar, Scalable methods for the analysis and optimization of gate oxide
breakdown, in 2010 11th International Symposium on Quality Electronic Design (ISQED)
(2010), pp. 638–645
13. K. Weide-Zaage, Kludt, M. Ackermann, V. Hein, M. Erstling, Life time characterization for a
highly robust metallization, in 2015 16th International Conference on Thermal, Mechanical
and Multi-Physics Simulation and Experiments in Microelectronics and Microsystems (2015),
pp. 1–6
14. S. Moreau, D. Bouchu, Reliability of dual damascene TSV for high density integration: the
electromigration issue, in 2013 IEEE International Reliability Physics Symposium (IRPS)
(2013), pp. CP.1.1–CP.1.5
15. S. Moreau, Y. Beilliard, P. Coudrain, D. Bouchu, R. Taïbi, L. D. Cioccio, Mass
transport-induced failure in direct copper (Cu) bonding interconnects for 3-D integration, in
2014 IEEE International Reliability Physics Symposium (2014), pp. 3E.2.1–3E.2.6
16. J. Ziegler, W. Lanford, The effect of sea level cosmic rays on electronic devices, in 1980 IEEE
International Solid-State Circuits Conference. Digest of Technical Papers (1980), pp. 70–71
17. T.C. May, M.H. Woods, Alpha-particle-induced soft errors in dynamic memories. IEEE
Trans. Electron Devices 26, 2–9 (1979)
18. D.C. Matthews, M.J. Dion, NSEU impact on commercial avionics, in 2009 IEEE
International Reliability Physics Symposium (2009), pp. 181–193
19. S. Uznanski, R.G. Alia, E. Blackmore, M. Brugger, R. Gaillard, J. Mekki et al., The effect of
proton energy on SEU cross section of a 16 Mbit TFT PMOS SRAM with DRAM capacitors.
IEEE Trans. Nucl. Sci. 61, 3074–3079 (2014)
20. J.G. Rollins, W.A. Kolasinski, D.C. Marvin, R. Koga, Numerical simulation of SEU induced
latch-up. IEEE Trans. Nucl. Sci. 33, 1565–1570 (1986)
21. Y. Lin, M. Zwolinski, B. Halak, A low-cost radiation hardened flip-flop, in 2014 Design,
Automation & Test in Europe Conference & Exhibition (DATE) (2014), pp. 1–6
22. C. Slayman, Soft error trends and mitigation techniques in memory devices, in 2011
Proceedings—Annual Reliability and Maintainability Symposium (2011), pp. 1–5
23. Y. Lin, M. Zwolinski, B. Halak, A low-cost, radiation-hardened method for pipeline
protection in microprocessors. IEEE Trans. Very Large Scale Integr. VLSI Syst. 24, 1688–
1701 (2016)
24. M.A. Elgamel, K.S. Tharmalingam, M.A. Bayoumi, Crosstalk noise analysis in ultra deep
submicrometer technologies, in IEEE Computer Society Annual Symposium on VLSI (2003),
pp. 189–192
25. B.P. Wong, A. Mittal, Z. Gau, G. Starr, Nano-CMOS circuits and physical design (Wiley,
Hoboken, New Jersey, 2005)
26. B. Halak, A. Yakovlev, Fault-tolerant techniques to minimize the impact of crosstalk on phase
encoded communication channels. Comput. IEEE Trans. 57, 505–519 (2008)
27. C. Duan, A. Tirumala, S.P. Khatri, Analysis and avoidance of cross-talk in on-chip buses, in
IEEE Conference on Hot Interconnects, (2001), pp. 133–138
28. M.A. Elgamel, K.S. Tharmalingam, M.A. Bayoumi, Noise-constrained interconnect opti-
mization for nanometer technologies. Int. Symp. Circ. Syst. 5, 481–484 (2003)
29. M. Lampropoulos, B.M. Al-Hashimi, P. Rosinger, Minimization of crosstalk noise, delay
and power using a modified bus invert technique. Des. Autom. Test Eur. Conf. Exhib. 2,
1372–1373 (2004)
30. J. Nurmi, H. Tenhunen, J. Isoaho, A. Jantsch, Interconnect Centric Design for Advanced SoC
and NoC (Kluwer Academic Publisher, Boston, 2004)
31. K.N. Patel, I.L. Markov, Error-correction and crosstalk avoidance in DSM busses, in IEEE
Transactions on Very Large Scale Integration (VLSI) Systems, vol 12 (2004), pp. 1076–1080
32. D. Rossi, C. Metra, A.K. Nieuwland, A. Katoch, New ECC for crosstalk impact minimization.
Des. Test Comput. IEEE 22, 340–348 (2005)
33. D. Rossi, C. Metra, A.K. Nieuwland, A. Katoch, Exploiting ECC redundancy to minimize
crosstalk impact. Des. Test Comput. IEEE 22, 59–70 (2005)
34. A. Balasubramanian, A.L. Sternberg, B.L. Bhuva, L.W. Massengill, Crosstalk effects caused
by single event hits in deep sub-micron CMOS technologies. Nucl. Sci. IEEE Trans. 53,
3306–3311 (2006)
35. A.K. Palit, K.K. Duganapalli, W. Anheier, Crosstalk fault modeling in defective pair of
interconnects. Integr. VLSI J. 41, 27–37 (2008)
36. A. Kabbani, A.J. Al-Khalili, Estimation of ground bounce effects on CMOS circuits. IEEE
Trans. Compon. Packag. Technol. 22, 316–325 (1999)
37. S. Kim, C.J. Choi, D.K. Jeong, S.V. Kosonocky, S.B. Park, Reducing ground-bounce noise
and stabilizing the data-retention voltage of power-gating structures. IEEE Trans. Electron
Devices 55, 197–205 (2008)
38. A. Antonopoulos, M. Bucher, K. Papathanasiou, N. Mavredakis, N. Makris, R.K. Sharma
et al., CMOS small-signal and thermal noise modeling at high frequencies. IEEE Trans.
Electron Devices 60, 3726–3733 (2013)
39. Y.J. Lee, S.K. Lim, Co-optimization and analysis of signal, power, and thermal interconnects
in 3-D ICs. IEEE Trans. Comput. Aided Des. Integr. Circ. Syst. 30, 1635–1648 (2011)
40. B. Halak, Partial coding algorithm for area and energy efficient crosstalk avoidance codes
implementation. IET Comput. Digit. Tech. 8, 97–107 (2014)
41. Cadence Design Systems Inc., Virtuoso RelXpert Reliability Simulator User Guide. Technical Report (2014)
42. Synopsys Inc., HSPICE User Guide: Basic Simulation and Analysis. Technical Report (2013)
43. Y. Ye, F. Liu, M. Chen, S. Nassif, Y. Cao, Statistical modeling and simulation of threshold
variation under random dopant fluctuations and line-edge roughness. IEEE Trans. Very Large
Scale Integr. VLSI Syst. 19, 987–996 (2011)
44. M.S. Mispan, B. Halak, M. Zwolinski, NBTI aging evaluation of PUF-based differential
architectures, in 2016 IEEE 22nd International Symposium on On-Line Testing and Robust
System Design (IOLTS) (2016), pp. 103–108
45. M. Cortez, A. Dargar, S. Hamdioui, G.J. Schrijen, Modeling SRAM start-up behavior for
Physical unclonable functions, in 2012 IEEE International Symposium on Defect and Fault
Tolerance in VLSI and Nanotechnology Systems (DFT) (2012), pp. 1–6
4 Reliability Enhancement Techniques for Physically Unclonable Functions
4.1 Introduction
The reliability of a physically unclonable device refers to its ability to generate the
same response repeatedly given a specific challenge. This metric is an important
factor that a designer must take into consideration when deciding on the suitability
of a particular PUF architecture for a certain application. For example, cryptographic key generation schemes require the use of PUF devices which have 100% reliability (i.e. a zero bit error rate); otherwise, encryption/decryption will not be possible. On the other hand, some classes of authentication applications can tolerate
a certain level of bit error (i.e. the silicon device can be authenticated if its PUF
circuit generates a response whose Hamming distance from the expected response
is smaller than a predefined upper bound).
There are a number of mechanisms that cause PUF responses to deviate from their expected values; these include variability in the environmental conditions (i.e. temperature and power supply), electromagnetic interference (i.e. ground bounce and crosstalk), radiation hits and aging of the underlying CMOS devices.
In order for a designer to ensure that the reliability of PUF responses adheres to
the application requirements, they need first to evaluate the expected bit error rate at
the output of the PUF and then apply the appropriate measures accordingly.
Reliability enhancement techniques for PUF designs can generally be classified into two categories: the first is based on the use of error correction codes, and the second is based on pre-processing methods. The latter are applied at the post-fabrication stage, before the chips are deployed to the field; the aim of such approaches is to reduce the likelihood of bit flipping, hence reducing or completely removing the need for expensive error correction schemes.
It is hoped that this chapter will give the reader the necessary theoretical background and skills to be able to implement appropriate reliability enhancement methods which meet the specifications of their applications and the constraints of their design budgets.
The organisation of this chapter is as follows. Section 4.3 explains how to compute
the expected bit error rate of a PUF, given a number of temporal noise sources.
Section 4.4 presents a comprehensive tutorial on the design principles of error
correction codes supported by examples and worked solutions. Section 4.5 outlines
the design flow of reliable PUFs, including the details of each phase. Section 4.6
explains how the reliability of silicon-based PUFs can be enhanced using aging
acceleration techniques. Section 4.7 describes the principles of the stable bit
selection approach and gives two examples of how this technique can be imple-
mented in practice to enhance the robustness of PUF designs. Section 4.8 explains
the methods used for a reliable on-chip response construction, in particular, it
discusses the use of secure sketch schemes, temporal majority voting and hardware
redundancy. Section 4.9 analyses the costs associated with the above reliability enhancement approaches and discusses the implications of such costs for the use of PUF technology in resource-constrained systems. Conclusions and lessons learned are presented in Sect. 4.10. Finally, a list of problems and exercises is provided in Sect. 4.11.
There are a number of physical sources of errors (e.g. aging, radiation, etc.). If these contributions are assumed to be independent and additive, the overall error rate for a response bit (r_i) can be given as follows:
P_e(r_i) = Σ_{j=1}^{F} P_ej(r_i)   (4.1)

where
F is the total number of failure mechanisms;
P_ej is the error rate caused by failure mechanism j.

To compute the average error rate of all response bits caused by a specific source of errors, we can use the formula given below:

P_ej = (1/k) Σ_{i=1}^{k} [HD(R_i(n), R′_i(n))/n] × 100%   (4.2)

where
n is the length of the PUF response (bits);
HD(R_i(n), R′_i(n)) is the Hamming distance between the response R_i(n) from chip i at noise-free operating conditions and the same n-bit response R′_i(n) obtained at noisy conditions, for the challenge C;
k is the number of sample chips.

We can also compute the average overall error rate caused by all sources of noise as follows:

P_e = (1/F) Σ_{j=1}^{F} P_ej   (4.3)
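A direct transcription of Eqs. (4.1)–(4.3) is shown below; the three failure mechanisms and their per-bit rates are invented purely to exercise the formulae.

# invented per-bit error rates (%) of F = 3 failure mechanisms for bits r1..r3
per_mechanism = {
    "temperature": [0.5, 1.0, 0.2],
    "supply_noise": [0.1, 0.3, 0.1],
    "aging": [0.4, 0.2, 0.6],
}

# Eq. (4.1): overall rate of each response bit = sum over the mechanisms
pe_bit = [sum(rates[i] for rates in per_mechanism.values()) for i in range(3)]

# Eq. (4.2) reduces here to averaging a mechanism's rates over the bits,
# and Eq. (4.3) then averages those per-mechanism figures over all mechanisms
pe_j = {m: sum(r) / len(r) for m, r in per_mechanism.items()}
pe_avg = sum(pe_j.values()) / len(pe_j)

print(pe_bit, pe_avg)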
It is possible in some cases to pre-compute the error rate using simulation models or test chips. Figure 4.1 shows the bit error rate caused by temperature variations in a number of different PUF architectures. The designs are implemented using a 65-nm technology node and the BSIM4 (V4.5) transistor model. For each design, 100 PUF instances are modelled using Monte Carlo simulations. To facilitate the analysis of the PUF metrics for all of the designs, 32-bit responses are considered. The responses in each case are recorded at three different temperatures (−40, 0 and 85 °C).
Fig. 4.1 The impact of temperature variations on the bit error rate of PUF architectures
4.4.1 Preliminaries
Definition 4.1
(a) Hamming Distance: The Hamming distance d(a, b) between two words a = (a_i) and b = (b_i) of length n is defined to be the number of positions where they differ, that is, the number of indices i such that a_i ≠ b_i.
(b) Hamming Weight: Let 0 denote the zero vector 00…0. The Hamming weight HW(a) of a word a = a_1 … a_n ∈ V is defined to be d(a, 0), the number of symbols a_i ≠ 0 in a.
Example 4.3 What is the minimum distance of the code C = {001, 011, 111}? And what is the Hamming weight of (011)?
The minimum distance can be computed by calculating the Hamming distance between each pair of codewords and then finding the minimum; in this case, the answer is one.
HW(011) = 2.
Example 4.4 Let C be the repetition binary code C = {000, 111}; the minimum distance of this code is 3, which means it should be able to correct one error (t = 1). This can be clearly justified from observing the fact that a single error will transform the codeword (000) into another word, e.g. (001); using the nearest-neighbour principle, one can deduce that (001) is closer to (000) than to (111), so the receiver can correct this error and retrieve the originally submitted codeword.
The example above shows that ultimately a price has to be paid to correct errors: the sender will need to transmit more bits than otherwise required; those extra bits will be used for error correction.
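Both examples can be checked mechanically; the following is a minimal sketch (Python is used here purely for illustration):

from itertools import combinations

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def min_distance(code):
    # minimum pairwise Hamming distance of a block code
    return min(hamming(a, b) for a, b in combinations(code, 2))

def nearest(word, code):
    # maximum-likelihood (nearest-neighbour) decoding
    return min(code, key=lambda c: hamming(word, c))

print(min_distance(["001", "011", "111"]))  # Example 4.3: d = 1
print(nearest("001", ["000", "111"]))       # Example 4.4: corrects to 000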
We are now going to introduce a number of mathematical techniques useful for
constructing error correction codes.
4.4.2 Groups
Example 4.5 The number systems Z, Q, R, C and Z_n are groups under addition, with * = +, e = 0 and the inverse of a given by a^(−1) = −a.
A group G is said to be commutative (or abelian) if (a * b) = (b * a) for all a,
b 2 G. (commutativity).
For example, the sets of non-zero elements in Q, R and C are all commutative
groups under multiplication.
A group G is said to be cyclic if it has a generator element g 2 G such that every
element a 2 G has the form a = gi (or ig in additive notation) for some integer i.
For example, Z_n is cyclic, since every element has the form 1 + 1 + ⋯ + 1. Z is also cyclic, since every element has the form ±(1 + 1 + ⋯ + 1). However, Q, R and C are not cyclic.
For example, Q, R and C are all fields, but Z is not, because the nonzero integers ±2, ±3, etc. do not have multiplicative inverses in Z (though they do in Q).
For example, Z is a ring. Another ring is Z_n, the set {0, 1, …, n − 1}, with addition and multiplication defined as follows:
1. a + b in Z_n = (a + b) mod n;
2. a · b in Z_n = (a·b) mod n.
It is useful at this stage to introduce the theorem below, which helps find fields: the ring Z_n is a field if and only if n is a prime number. For example, in Z_6 we have 2 · 3 = 6 mod 6 = 0, so the nonzero elements are not closed under multiplication and cannot form a group. Thus, Z_6 is not a field.
A field containing only finitely many elements is called a finite field. One of the most widely used fields in computing applications is the binary field Z_2 = {0, 1}; Table 4.1 shows the addition and multiplication tables for this field.
We are now going to use the mathematical construction above in order to learn
how to design linear codes, the most widely used form of error correction methods.
Consider the binary field Zn = {0,1}, consider a set of words of length n (a = {a1,
a2 … an}, b, c …) constructed from elements from Z2, for example, a = {1101}.
These words can be considered as vectors.
where
(+, ·) are addition and multiplication modulo n over Z_n;
⊕ is vector addition;
* is multiplication by a scalar (i.e. multiplication of a vector v ∈ V by an element a ∈ Z_n).
For example, consider the vector spaces defined over the binary field Z2 = {0, 1}:
V2 ¼ f0; 1g
V4 ¼ f00; 01; 10; 11g
V8 ¼ f000; 001; 010; 011; 100; 101; 110; 111g
For example, S1 = {100, 010, 001} is linearly independent because the equation
a1 (100) + a2 (010) + a3 (001) = 0 can only be satisfied if a1 = a2 = a3 = 0.
On the other hand, S2 = {100, 010, 110} is not linearly independent because the
equation a1 (100) + a2 (010) + a3 (110) = 0 can be satisfied when a1 = a2 =
a3 = 0 or when a1 = a2 = a3 = 1.
We are going to discuss two methods to construct linear codes; the first approach
uses a set of linearly independent vectors as a basis for the code. The second
technique uses a set of linear equations characterising the codewords.
c_i = a_i1 c_1 + a_i2 c_2 + ⋯ + a_ik c_k, where a_i1, a_i2, …, a_ik ∈ Z_2
For example, assume C = {c_1, c_2, c_3, c_4} has the basis L = {c_1, c_2}; this means that:

c_1 = a_11 c_1 + a_12 c_2
c_2 = a_21 c_1 + a_22 c_2
c_3 = a_31 c_1 + a_32 c_2
c_4 = a_41 c_1 + a_42 c_2, where {a_11, a_12, …, a_42} ∈ Z_2
It is worth noting here that a binary code with a basis of k codewords will have 2^k codewords (i.e. it can encode k information bits).
For example, the code C3 = {0000, 0001, 1000, 1001} has the basis:
L = {0001, 1000}.
The dimension of a code C with length n is defined as the number of basis vectors of this code; for example, k(C_3) = 2. Therefore, n − k is the number of extra bits needed for error correction or detection (i.e. parity bits). The rate of a linear code C with length n and a basis of k vectors is defined as (k/n).
It is worth noting here that the codewords of a code C with a generator matrix G (k × n) can be calculated by multiplying all binary data words of size k by the generator matrix.
Example 4.6 Calculate the codewords of code C_3 with the generator matrix:

G = [ 0 0 0 1
      1 0 0 0 ]

Solution:
In this case, n = 4 and k = 2. Therefore, there are 2^k = 4 codewords:

c_1 = [0 0] · G = [0 0 0 0]
c_2 = [0 1] · G = [1 0 0 0]
c_3 = [1 0] · G = [0 0 0 1]
c_4 = [1 1] · G = [1 0 0 1]
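The multiplication d·G over GF(2) in Example 4.6 can be reproduced with a few lines; this is a plain sketch, not tied to any particular library:

from itertools import product

def encode(data, G):
    # multiply a data word by the generator matrix over GF(2)
    return [sum(d * g for d, g in zip(data, col)) % 2 for col in zip(*G)]

G = [[0, 0, 0, 1],
     [1, 0, 0, 0]]

for d in product([0, 1], repeat=2):
    print(d, encode(d, G))   # reproduces c1..c4 above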
We should note here that the generator matrix of a code C can be transformed into an equivalent systematic form using elementary row operations (which do not change the set of codewords) and column interchanges.
For example, the generator matrix G below can be transformed into its systematic form (G_C) by interchanging the first and the seventh columns:

G = [ 0 0 0 0 1 1 1
      1 1 0 0 1 0 0
      1 0 1 0 0 1 0
      1 0 0 1 1 1 0 ]

G_C = [ 1 0 0 0 1 1 0
        0 1 0 0 1 0 1
        0 0 1 0 0 1 1
        0 0 0 1 1 1 1 ]
The main advantage of systematic codes is their ability to separate data and parity bits in the codewords. Take, for example, the generator matrix G_C in the above example. Let c_i = [c_1, c_2, c_3, c_4, c_5, c_6, c_7] be the codeword generated from the data word d_i = [d_1, d_2, d_3, d_4]; we can write:

c_i = d_i · G_C

c_1 = d_1
c_2 = d_2
c_3 = d_3
c_4 = d_4
c_5 = c_1 + c_2 + c_4
c_6 = c_1 + c_3 + c_4
c_7 = c_2 + c_3 + c_4
The separation of data and parity bits can greatly reduce the complexity of
hardware implementation of error correction codes.
a·H^T = 0,

where

H = [ h_11 … h_1n
      …        …
      h_m1 … h_mn ]   is called the parity-check matrix, of size (n − k, n);

H^T is the transpose of H;
0 denotes the (k, n − k) zero matrix;
a = {a_1, a_2, …, a_n} represents a codeword from C.

If there are m independent equations, then C has dimension k = n − m.
Let us now define the syndrome s of a codeword (a) from a code C, with a parity-check matrix H, as:
s = a·H^T

It should be noted here that for all valid codewords s = 0 by definition; this property can be used to detect the presence of errors. For example, if an erroneous codeword (a′) is received, then:

s′ = a′·H^T ≠ 0
Hamming codes are one of the most widely known examples of linear codes due to their relatively simple construction. For each integer m ≥ 2, there is a Hamming code C defined over the binary field Z_2 with the following parameters:
• Code length: n = 2^m − 1;
• Dimension: k = 2^m − m − 1;
• Minimum distance: d = 3.
The easiest way to describe a Hamming code is through its parity-check matrix. First, we construct an (m × n) matrix whose columns are all the non-zero binary m-tuples. For example, for a [7, 4] binary Hamming code, we take m = 3, so n = 7 and k = 4, and we can obtain the parity-check matrix:

H = [ 1 0 1 0 1 0 1
      0 1 1 0 0 1 1
      0 0 0 1 1 1 1 ]
Second, in order to obtain a parity-check matrix for the code in its systematic form, we move the appropriate columns to the end so that the matrix ends with the (m × m) identity matrix. The order of the other columns is irrelevant. The result is the parity-check matrix H for a Hamming [n, k] code. In our example, we obtain:

H_c = [ 1 1 0 1 1 0 0
        1 0 1 1 0 1 0
        0 1 1 1 0 0 1 ]
Third, based on H we can easily calculate the generator matrix (G) as follows:

G = [ I_k  P ], where H = [ P^T  I_(n−k) ]

H[7, 4] = [ 1 1 0 1 1 0 0
            1 0 1 1 0 1 0
            0 1 1 1 0 0 1 ]   this means   P^T = [ 1 1 0 1
                                                   1 0 1 1
                                                   0 1 1 1 ]

So G = [ 1 0 0 0 1 1 0
         0 1 0 0 1 0 1
         0 0 1 0 0 1 1
         0 0 0 1 1 1 1 ]
Now that we have both the H and G matrices, we can proceed with the encoding and decoding processes. The codeword (c) of each data word (d) can be obtained by multiplication with the generator matrix; for example, for d = [1 0 0 1]:

c = d · G = [1 0 0 1 0 0 1]

We notice that the first (k = 4) bits of the codeword are the same as the data word; this is because we have used a systematic form of the G matrix, which makes decoding easier.
The decoding process has two stages: the first checks the validity of received codewords and makes corrections; the second removes the redundant bits (i.e. parity bits). This process assumes that the binary channel is symmetric, which means the probability of a bit “0” flipping is equal to the probability of a bit “1” flipping. The maximum likelihood decoding strategy is adopted, which means an erroneous codeword should be restored to the valid codeword that is closest to it in terms of Hamming distance. In the case of a one-error-correcting Hamming code, only one error is assumed per codeword. The decoding process can be summarised in the following four steps:
As long as there is at most a one-bit error in the received vector, it will be possible to recover the original codeword that was sent.
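A sketch of the decoding just described for the [7, 4] code of this section follows, using the systematic H_c built above; the syndrome of the received word is matched against the columns of H_c to locate a single-bit error.

H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]

def syndrome(r):
    # s = r * H^T over GF(2)
    return [sum(ri * hi for ri, hi in zip(r, row)) % 2 for row in H]

def decode(r):
    s = syndrome(r)
    if any(s):
        # for a single error, the syndrome equals the column of H at that position
        err = next(col for col in range(7)
                   if [H[i][col] for i in range(3)] == s)
        r = r[:]
        r[err] ^= 1          # correct the flipped bit
    return r[:4]             # systematic code: the data are the first k bits

c = [1, 0, 0, 1, 0, 0, 1]    # the codeword obtained above for d = [1 0 0 1]
r = c[:]; r[2] ^= 1          # inject one error
print(decode(r))             # -> [1, 0, 0, 1]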
Example 4.7 The [15, 11] binary systematic Hamming code has the parity-check matrix:

H = [ 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0
      1 1 1 0 0 0 0 1 1 1 1 0 1 0 0
      0 1 1 1 0 1 1 0 0 1 1 0 0 1 0
      1 0 1 1 1 0 1 0 1 0 1 0 0 0 1 ]

c = (0 0 0 0 1 0 0 0 0 0 1 1 0 0 1).
Cyclic codes are a special type of code which allows for efficient hardware implementation; therefore, they can be particularly useful for resource-constrained systems such as IoT devices. In this section, we will learn their design principles; we will also explain how they can be realised in hardware.
4.4.7.1 Fundamentals
For example, the code C1 = {000, 101, 011, 110} is cyclic, whereas the code
C2 = {0000, 1001, 0110, 1111} is not cyclic.
Let us take another example; consider the code C_3 that has the following generator matrix:

G = [ 1 0 1 1 1 0 0
      0 1 0 1 1 1 0
      0 0 1 0 1 1 1 ]

A closer inspection of the codewords reveals that this is a cyclic code, because a right cyclic shift of each codeword produces another valid codeword; for example, a single right shift of c_1 produces c_2, and so on.
We are now going to introduce the mathematical tools needed to construct cyclic
codes.
Theorem 4.3 The set of polynomials over a field F of degree less than n forms a ring with respect to addition and multiplication modulo (x^n + 1). This ring will be denoted F_q[x]/(x^n + 1).
  +   | 0     1     x     1+x
  0   | 0     1     x     1+x
  1   | 1     0     1+x   x
  x   | x     1+x   0     1
  1+x | 1+x   x     1     0
Proof
x·a(x) = x·(a_0 + a_1 x + ⋯ + a_(n−1) x^(n−1))
       = a_0 x + a_1 x^2 + ⋯ + a_(n−2) x^(n−1) + a_(n−1) x^n
       = a_(n−1) + a_0 x + a_1 x^2 + ⋯ + a_(n−2) x^(n−1)   (since x^n ≡ 1 modulo x^n + 1)

That is, multiplying a(x) by x performs a single right cyclic shift of the word (a_0, a_1, …, a_(n−1)).
To construct a cyclic code, we will need to define a condition that can only be satisfied by valid codewords. To do this, we can use the concept of the generator polynomial g(x), as defined formally below.
Theorem 4.5 Let C be an (n, k) linear cyclic code over the ring R_n = Z_2[x]/(x^n + 1).
The fourth claim of the previous theorem gives a recipe for obtaining all cyclic codes of a given length n: all we need to do is find all factors of x^n + 1, which can consequently be used as generator polynomials.
Example 4.8 Find all binary cyclic codes of length 3.
Solution:
According to Theorem 4.5 (claim 4), the generator polynomial should be a divisor of x^3 + 1. First, we need to find all divisors of x^3 + 1:

x^3 + 1 = (x^3 + 1) · 1
x^3 + 1 = (x + 1)(x^2 + x + 1)

This shows that we have four divisors: 1, (x + 1), (x^2 + x + 1) and (x^3 + 1); each of these can be the generator of a cyclic code. Therefore, we have in total four cyclic codes.
Second, in order to obtain the codewords for each of these codes, we multiply the corresponding generator polynomial by the elements of R_3 = {0, 1, x, 1 + x, x^2, 1 + x^2, x + x^2, 1 + x + x^2}.
For g(x) = x + 1, this results in a code with n = 3 and k = 2, called a one-parity cyclic code, capable of detecting one error.
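The second step of Example 4.8 can be automated; the sketch below multiplies g(x) = x + 1 by every element of R_3 modulo (x^3 + 1), with polynomials stored as coefficient lists (lowest degree first):

def polymul_mod(a, b, n):
    # multiply two GF(2) polynomials and reduce modulo x^n + 1
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if ai and bj:
                out[(i + j) % n] ^= 1   # x^n is congruent to 1
    return out

g = [1, 1, 0]   # g(x) = 1 + x
R3 = [[a0, a1, a2] for a0 in (0, 1) for a1 in (0, 1) for a2 in (0, 1)]
codewords = {tuple(polymul_mod(g, m, 3)) for m in R3}
print(sorted(codewords))    # -> the one-parity code {000, 011, 101, 110}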
g(x) = g_0 + g_1 x + ⋯ + g_r x^r, where r = n − k.

Then dim(C) = n − r, and a generator matrix G for C is:

G = [ g_0  g_1  g_2  …  g_r  0    0    0   …  0
      0    g_0  g_1  g_2  …  g_r  0    0   …  0
      0    0    g_0  g_1  g_2  …  g_r  0   …  0
      …
      0    0    …    0    0   …   0    g_0 …  g_r ]
Proof First, we can easily notice that all rows of G are linearly independent. Second, the n − r rows of G represent the codewords g(x), xg(x), x^2 g(x), …, x^(n−r−1) g(x). Finally, it remains to show that every codeword in C can be expressed as a linear combination of the vectors g(x), xg(x), x^2 g(x), …, x^(n−r−1) g(x). Indeed, if a(x) ∈ C, then:

a(x) = q(x)g(x).
For example, consider the code C = {000, 111} with g(x) = x^2 + x + 1; in this case, n = 3 and k = 1, so the generator matrix of this [3, 1] code can be constructed as follows:

G = [1 1 1].

Another example is the code C = {000, 110, 011, 101} with g(x) = x + 1; in this case, n = 3 and k = 2, so the generator matrix of this [3, 2] code can be constructed as follows:

G = [ 1 1 0
      0 1 1 ]
h(x) = h_0 + h_1 x + ⋯ + h_k x^k

H = [ h_k  h_(k−1)  h_(k−2)  …  h_0  0    0    0   …  0
      0    h_k      h_(k−1)  h_(k−2)  …  h_0  0    0   …  0
      0    0        h_k      h_(k−1)  h_(k−2)  …  h_0  0  …  0
      …
      0    0    …   0    0   …   0    h_k  …  h_0 ]
Example 4.9 Find the parity-check polynomial and its corresponding matrix for the code C = {000, 110, 011, 101} with g(x) = x + 1, defined over R_3 = Z_2[x]/(x^3 + 1).
From Definition 4.10, the parity-check polynomial is given as follows:

x^3 + 1 = g(x)h(x)

This means:

h(x) = (x^3 + 1)/(x + 1) = x^2 + x + 1

Based on this, the parity-check matrix, with dimensions H(n − k, n) = H(1, 3), can be constructed as follows:

H = [1 1 1]
The parity-check polynomial can be used to check the validity of received codewords using Theorem 4.7, given below:
Example 4.10 Consider the code C = {000, 110, 011, 101} with h(x) = x^2 + x + 1, defined over R_3 = Z_2[x]/(x^3 + 1); check that all codewords satisfy Theorem 4.7.
Solution:
The codewords in C can be written in their polynomial forms as follows:

C = {0, 1 + x, x + x^2, 1 + x^2}.

Multiplying each codeword by h(x) gives the following (remember that the multiplication is done modulo (x^3 + 1)):

0 · h(x) = 0
(1 + x) · h(x) = 1 + x^3 ≡ 0
(x + x^2) · h(x) = x + x^4 ≡ x + x = 0
(1 + x^2) · h(x) = 1 + x + x^3 + x^4 ≡ 1 + x + 1 + x = 0

so every codeword satisfies c(x)h(x) ≡ 0 (mod x^3 + 1).
This code will also have a parity-check polynomial h(x) such that:

g(x)h(x) = 1 + x^n ≡ 0 (mod 1 + x^n)
c(x)h(x) ≡ 0 (mod 1 + x^n)
In this case, the coefficients of the generator polynomial are {g_0, g_1, g_2, g_3} = {1, 1, 0, 1}; substituting these into the structure presented in Fig. 4.2 produces the encoder below.
To encode the message m = {1101}, the registers are first reset; the message is then input serially, followed by n − k zeros (in this case, 3 zeros). The codeword is sampled at the output such that its most significant bit comes first. Let us study this closely; the applied data bit is shown in red.
Stage 1: The first bit of the data word (highlighted in red) is applied, which generates the coefficient of the largest term in the codeword. Codeword: c(x) = x^6
Stage 2: Codeword: c(x) = 0·x^5 + x^6
Stages 3–6 proceed in the same manner, one shift per clock cycle; the corresponding register snapshots are shown in the accompanying figures.
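For readers without access to the figures, the register behaviour can be reproduced in software by the equivalent polynomial long division; the following minimal sketch encodes m = 1101 with g(x) = 1 + x + x^3 (coefficients {1, 1, 0, 1} as above), assuming the [7, 4] parameters implied by the three appended zeros:

def cyclic_encode(msg, gen, n):
    # systematic cyclic encoding: parity = remainder of x^(n-k)*m(x) / g(x);
    # polynomials are coefficient lists, lowest degree first, arithmetic in GF(2)
    r = n - len(msg)
    rem = [0] * r + list(msg)            # x^(n-k) * m(x)
    for i in range(n - 1, r - 1, -1):    # long division by g(x)
        if rem[i]:
            for j, gj in enumerate(gen):
                rem[i - r + j] ^= gj
    return rem[:r] + list(msg)           # parity bits followed by the data bits

g = [1, 1, 0, 1]               # g(x) = 1 + x + x^3
m = [1, 1, 0, 1]               # the message 1101 of this example
print(cyclic_encode(m, g, 7))  # -> [0, 0, 0, 1, 1, 0, 1], i.e. c(x) = x^3 + x^4 + x^6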
Fig. 4.3 Generic hardware implementation of a syndrome computing circuit for cyclic codes
s(x) = r(x)h(x) mod (1 + x^n)
This is an important example of a cyclic code, the binary Golay code, which can correct up to three errors. NASA used it in space missions in the early 1980s, and it is also used in the American government standards for automatic link establishment in high-frequency radio systems. From the PUF application viewpoint, this code offers the possibility of constructing a multi-bit error-correcting scheme at relatively low cost using the shift register-based implementations explained in Sect. 4.4.7. Either of the following two polynomials can generate this code:
$$g_1(x) = 1 + x^2 + x^4 + x^5 + x^6 + x^{10} + x^{11}$$
$$g_2(x) = 1 + x + x^5 + x^6 + x^7 + x^9 + x^{11}$$
These are a special class of cyclic codes with the ability to correct more than one error. They also have efficient decoding procedures. They are named after Bose, Ray-Chaudhuri and Hocquenghem.
Hardware implementations of the encoding and syndrome computing blocks of BCH codes are the same as those of the cyclic codes seen previously. To understand the mathematical construction of these codes, we will need to introduce the principles of Galois fields and primitive polynomials.
It is worth noting here that an irreducible polynomial of degree (n > 1) over a field
F does not have any roots in F.
Example 4.12 Which of the following polynomials is irreducible in Z₂[x]?

f₁(x) = x³ + x
f₂(x) = x³ + 1
f₃(x) = x³ + x + 1

Solution:
f₁(x) is reducible, as it has the root 0 in the binary field Z₂.
f₂(x) is reducible, as it has the root 1 in the binary field Z₂.
f₃(x) is irreducible, as it has no roots in the binary field Z₂ and cannot be written as a product of two polynomials in Z₂[x] of lesser degrees.
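For low degrees, irreducibility over Z₂ can be checked by brute-force trial division. The following sketch is our own (the integer encoding, where bit i holds the coefficient of x^i, and the helper names are assumptions):

```python
def gf2_mod(a, b):
    """Remainder of GF(2) polynomial division a(x) mod b(x)."""
    db = b.bit_length() - 1
    while a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)
    return a

def is_irreducible(p):
    """True if p(x) has no divisor of degree 1 .. deg(p)//2 over Z2."""
    deg = p.bit_length() - 1
    return all(gf2_mod(p, d) != 0 for d in range(2, 1 << (deg // 2 + 1)))

for name, poly in [("x^3 + x", 0b1010), ("x^3 + 1", 0b1001),
                   ("x^3 + x + 1", 0b1011)]:
    print(name, "irreducible" if is_irreducible(poly) else "reducible")
```

As expected, the script reports the first two polynomials as reducible and x³ + x + 1 as irreducible.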
Example 4.13 Construct a Galois field GF(2²) and find its elements.
Solution:
p = 2, n = 2
Example 4.14 Consider the field GF(2²) = Z₂[x]/(1 + x + x²) = {0, 1, x, x + 1}. Given that r(x) = 1 + x + x² is an irreducible polynomial of degree n = 2 over Z₂, verify whether or not it is primitive.
Solution:
Let us assume that α is a root of r(x). This means

$$\alpha^2 + \alpha + 1 = 0$$
$$\alpha^2 = \alpha + 1$$

We can use the above equation to produce all the elements of GF(4) = {0, 1, x, x + 1} by computing the powers of α as follows:

α⁰ = 1
α¹ = α
α² = α + 1
α³ = α·α² = α(1 + α) = α² + α = 1 + α + α = 1

This shows that every nonzero element of GF(4) is a power of α; therefore r(x) is a primitive polynomial according to Definition 4.13.
From Example 4.14, we can see that α is a primitive element in GF(4).
BCH codes can be constructed using a generator polynomial as we will see in this
section, but first, we need to introduce the definition of minimal polynomials as
follows:
Example 4.15 Find the minimal polynomials of the elements of GF(8), which is constructed using the primitive polynomial x³ + x + 1:

$$GF(8) = \{0,\ 1,\ x,\ x + 1,\ x^2,\ x^2 + 1,\ x^2 + x,\ x^2 + x + 1\}$$
Solution:
First, we find the elements of this field. GF(8) is constructed using the primitive polynomial x³ + x + 1, which has a root α; this means

$$\alpha^3 = \alpha + 1$$

The nonzero elements of GF(8) are the powers of α (see Definition 4.16):

α⁰ = 1
α¹ = α
α² = α²
α³ = α + 1
α⁴ = α² + α
α⁵ = α·α⁴ = α(α² + α) = α³ + α² = 1 + α + α²
α⁶ = α·α⁵ = α + α² + α³ = 1 + α + α + α² = 1 + α²
α⁷ = α·α⁶ = α + α³ = 1 + α + α = 1
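This table of powers can be regenerated mechanically. The sketch below is our own illustration, assuming the usual bit-vector encoding of field elements (bit i is the coefficient of αⁱ) and the reduction rule α³ = α + 1 from the primitive polynomial:

```python
PRIMITIVE = 0b1011        # x^3 + x + 1
WIDTH = 3                 # m = 3, so GF(8) has 2^3 - 1 nonzero elements

def times_alpha(e):
    """Multiply a GF(8) element by alpha, reducing with alpha^3 = alpha + 1."""
    e <<= 1
    if e & (1 << WIDTH):              # a degree-3 term appeared: reduce it
        e ^= PRIMITIVE
    return e

e = 1                                 # alpha^0
for k in range(8):
    bits = [(e >> i) & 1 for i in range(WIDTH)]
    poly = " + ".join(t for b, t in zip(bits, ["1", "a", "a^2"]) if b) or "0"
    print(f"a^{k} = {poly}")          # reproduces the table above, a^7 = 1
    e = times_alpha(e)
```

The printed sequence cycles back to 1 at α⁷, confirming that every nonzero element of GF(8) is a power of α.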
Now we are ready to define the generator polynomial of a binary BCH code as
follows:
Since the degree of each minimal polynomial is m or less, the degree of g(x) is at most mt; this means the number of parity check digits, n − k, of the code is at most equal to mt.
Example 4.16 Given the table below of the elements of GF(2⁴) (generated by the primitive polynomial p(x) = x⁴ + x + 1) with their corresponding minimal polynomials, find the generator polynomial of the binary BCH code capable of correcting two errors.

Elements               Minimal polynomial
0                      x
1                      x + 1
α, α², α⁴, α⁸          x⁴ + x + 1
α³, α⁶, α⁹, α¹²        x⁴ + x³ + x² + x + 1
α⁵, α¹⁰                x² + x + 1
α⁷, α¹¹, α¹³, α¹⁴      x⁴ + x³ + 1
Solution:
In this case, m = 4 and t = 2; therefore, n = 2⁴ − 1 = 15 according to Definition 4.15.
Following Definition 4.17, this code is generated by

$$g(x) = (x^4 + x + 1)(x^4 + x^3 + x^2 + x + 1) = x^8 + x^7 + x^6 + x^4 + 1$$

and its parity check polynomial is

$$h(x) = x^7 + x^6 + x^4 + 1$$

The number of data bits k is equal to the degree of the parity check polynomial; therefore k = 7.
This means that this is a cyclic code (n, k, t) = (15, 7, 2).
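Both claims can be checked with carry-less multiplication over GF(2). This is a small sketch of our own (the integer encoding, where bit i is the coefficient of x^i, and the helper name gf2_mul are assumptions):

```python
def gf2_mul(a, b):
    """Carry-less (GF(2)) polynomial multiplication."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

m1 = 0b10011              # x^4 + x + 1 (minimal polynomial of alpha)
m3 = 0b11111              # x^4 + x^3 + x^2 + x + 1 (minimal polynomial of alpha^3)
g = gf2_mul(m1, m3)
print(bin(g))             # 0b111010001 -> g(x) = x^8 + x^7 + x^6 + x^4 + 1

h = 0b11010001            # x^7 + x^6 + x^4 + 1 (parity check polynomial)
print(bin(gf2_mul(g, h))) # 0b1000000000000001 -> x^15 + 1, as Definition 4.10 requires
```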
Solution:
The solution to Example 4.13 expresses the powers of α as {1, 1 + x, x}; we use the coefficients of these polynomials as the column vectors of a parity-check matrix, which results in

$$H = \begin{pmatrix}1 & 1 & 0\\ 0 & 1 & 1\end{pmatrix}$$
We can use Definition 4.10 to deduce the parity check polynomial of this code as follows:

$$h(x) = 1 + x$$

Definition 4.10 also states that h(x)·g(x) = 1 + xⁿ; we can use this to check our solution. In this case, h(x)·g(x) = (1 + x)(1 + x + x²) = 1 + x + x² + x + x² + x³. As the operation is done over the Galois field GF(2ᵐ), which is an extension of the binary field Z₂, we have x + x = 0 and x² + x² = 0 (binary addition). This means the above term can be rewritten as:

$$h(x)g(x) = 1 + x^3$$
used to construct Hamming codes by adding extra rows. However, adding extra rows does not necessarily add more restrictions. Assume, for example, we add a second row whose elements are the squares of the corresponding elements of the first row, as shown below:

$$H=\begin{pmatrix}\alpha^0 & \alpha^1 & \cdots & \alpha^i & \cdots & \alpha^{n-1}\\ \alpha^0 & \alpha^2 & \cdots & \alpha^{2i} & \cdots & \alpha^{2(n-1)}\end{pmatrix}$$
Since, on squaring c(α), the cross terms have coefficient 2 = 0 and therefore vanish, while the coefficients aᵢ ∈ {0, 1} of c(x) satisfy aᵢ² = aᵢ, α² is a root of c(x) if and only if α is a root. This means adding this extra requirement does not impose any new restrictions on the codewords; hence it is redundant.
Now, if we add a second row whose elements are the cubes of the corresponding elements of the first row, as shown below:

$$H=\begin{pmatrix}\alpha^0 & \alpha^1 & \cdots & \alpha^i & \cdots & \alpha^{n-1}\\ \alpha^0 & \alpha^3 & \cdots & \alpha^{3i} & \cdots & \alpha^{3(n-1)}\end{pmatrix}$$
$$S = bH^T = cH^T + e_kH^T + e_lH^T = e_kH^T + e_lH^T$$
This gives two equations with two unknowns (k, l); the receiver will be able to locate and correct the two errors by solving them.
We are now going to generalise this construction of BCH codes as follows:
Example 4.18 The binary Hamming codes are BCH codes with b = 1 and designed distance 2; in fact, they have minimum distance d = 3.
Example 4.19
Solution:
In this example, the degree of the primitive polynomial is 4; therefore m = 4 and n = 2⁴ − 1 = 15.
The parity-check matrix for the corresponding Hamming code will have the following form:

$$H = \begin{pmatrix}\alpha^0 & \alpha^1 & \cdots & \alpha^{14}\end{pmatrix}$$
$$1 + \alpha + \alpha^4 = 0$$
$$\alpha^4 = \alpha + 1$$

α⁰ = 1
α¹ = α
α² = α²
α³ = α³
α⁴ = α + 1
α⁵ = α·α⁴ = α(α + 1) = α² + α
Similarly, we find:

α⁶ = α³ + α²
α⁷ = α³ + α + 1
α⁸ = α² + 1
α⁹ = α³ + α
α¹⁰ = α² + α + 1
α¹¹ = α³ + α² + α
α¹² = α³ + α² + α + 1
α¹³ = α³ + α² + 1
α¹⁴ = α³ + 1
α¹⁵ = α·α¹⁴ = α⁴ + α = 1 + α + α = 1
We can now use the coefficients of the above polynomials corresponding to the powers α⁰, α¹, α², …, α¹⁴ to construct the following parity-check matrix. The resulting matrix is given below.

$$H=\begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & 1\\
0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & 1 & 1 & 0 & 0\\
0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & 1 & 1 & 0\\
0 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & 1 & 1
\end{pmatrix}$$
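The same matrix can be generated programmatically: column j is simply the coefficient vector of αʲ. The sketch below is our own, assuming the bit-vector encoding used in the earlier examples (bit i is the coefficient of αⁱ):

```python
PRIMITIVE, M = 0b10011, 4     # x^4 + x + 1, so alpha^4 = alpha + 1

def powers_of_alpha(n):
    """Return alpha^0 .. alpha^(n-1) of GF(2^M) as integers."""
    e, out = 1, []
    for _ in range(n):
        out.append(e)
        e <<= 1
        if e & (1 << M):      # reduce the degree-M term
            e ^= PRIMITIVE
    return out

cols = powers_of_alpha(15)
# row 'r' of H holds the coefficient of alpha^r in every column
H = [[(c >> row) & 1 for c in cols] for row in range(M)]
for row in H:
    print(row)                # reproduces the 4 x 15 matrix above
```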
In order to extend this matrix to build a binary BCH code with a designed distance of 4, we will need to construct the following matrix according to Definition 4.19:

$$H=\begin{pmatrix}
\alpha^0 & \alpha^1 & \cdots & \alpha^{14}\\
\alpha^0 & \alpha^2 & \cdots & \alpha^{28}\\
\alpha^0 & \alpha^3 & \cdots & \alpha^{42}
\end{pmatrix}$$
From the above calculation of the powers of α, we note that the powers of α repeat with a period of 15; for example, α¹⁶ = α¹⁵·α¹ = α¹. We can use this to find all powers up to α⁴² = α¹⁵·α¹⁵·α¹² = α¹² = α³ + α² + α + 1.
From the previous discussion, we noted that the second row of the matrix above (i.e. the squares) is redundant, so it does not need to be included in the matrix.
This allows us to construct the required parity-check matrix by replacing the powers of α with the coefficients of their corresponding polynomials, as follows:
$$H=\begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & 1\\
0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & 1 & 1 & 0 & 0\\
0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & 1 & 1 & 0\\
0 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & 1 & 1\\
1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1\\
0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 1\\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 & 1\\
0 & 1 & 1 & 1 & 1 & 0 & 1 & 1 & 1 & 1 & 0 & 1 & 1 & 1 & 1
\end{pmatrix}$$
Assume a codeword y(x) is received which has two errors; that is, it has a Hamming distance of 2 from the closest valid codeword. Assume the positions of these two errors are i₁ and i₂.
Now let

$$y(x) = \sum_i y_i x^i$$
$$e(x) = x^{i_1} + x^{i_2}$$
By definition:

$$S_j = y(\alpha^j)$$

Therefore:

$$S_j = e(\alpha^j) = \alpha^{j i_1} + \alpha^{j i_2}$$

Remember: g(x) is a divisor of all valid codewords c(x), and the roots of g(x) are also roots of c(x), so c(αʲ) = 0.
Now let

$$X_i = \alpha^i,\quad X_{i_1} = \alpha^{i_1},\quad X_{i_2} = \alpha^{i_2}$$
The equations are said to be power sum symmetric functions. Our task is to find X_{i₁} and X_{i₂}, two nonzero and distinct elements in GF(16) satisfying the above equations.
One approach is to find the product X_{i₁}X_{i₂} from these equations. In that case, we will have knowledge of X_{i₁} + X_{i₂} (which is S₁) and X_{i₁}X_{i₂}, so that we can construct the polynomial

$$(X_{i_1} - z)(X_{i_2} - z) = z^2 - (X_{i_1} + X_{i_2})z + X_{i_1}X_{i_2}$$
$$y(x) = x^4 + x^6 + x^7 + x^8 + x^{13}$$
$$y = (0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0)$$
Solution:
Let us first recall the powers of α:

α⁰ = 1
α¹ = α
α² = α²
α³ = α³
α⁴ = α + 1
α⁵ = α² + α
α⁶ = α³ + α²
α⁷ = α³ + α + 1
α⁸ = α² + 1
α⁹ = α³ + α
α¹⁰ = α² + α + 1
α¹¹ = α³ + α² + α
α¹² = α³ + α² + α + 1
α¹³ = α³ + α² + 1
α¹⁴ = α³ + 1
We already know that α, α², α⁴, α⁸, α³, α⁶, α⁹ and α¹² are roots of g(x). Let us first write the syndrome equations for the first four powers of α by substituting them into y(x) (remember α⁴ = α + 1); we can compute the following syndromes:

$$S_1 = y(\alpha) = \alpha^3 + \alpha^2 = \alpha^6$$
$$S_2 = y(\alpha^2) = \alpha^3 + \alpha^2 + \alpha + 1 = \alpha^{12}$$
$$S_3 = y(\alpha^3) = \alpha^3 + \alpha + 1 = \alpha^7$$
$$S_4 = y(\alpha^4) = \alpha^3 + \alpha = \alpha^9$$
Solving the resulting equations locates the two errors at positions 0 and 13; correcting them gives the valid codeword:

$$Y = (1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0)$$
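The four syndrome values can be double-checked numerically. This is a hedged sketch of our own (the field helpers gf_mul and gf_pow are assumptions, not the book's code):

```python
PRIMITIVE, M = 0b10011, 4          # GF(16) generated by x^4 + x + 1

def gf_mul(a, b):
    """Multiply two GF(16) elements (bit i = coefficient of alpha^i)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & (1 << M):
            a ^= PRIMITIVE
        b >>= 1
    return r

def gf_pow(a, k):
    r = 1
    for _ in range(k):
        r = gf_mul(r, a)
    return r

alpha = 0b0010
y_exponents = [4, 6, 7, 8, 13]               # nonzero terms of y(x)
log = {gf_pow(alpha, k): k for k in range(15)}   # discrete-log table

for j in (1, 2, 3, 4):
    s = 0
    for i in y_exponents:
        s ^= gf_pow(alpha, i * j)            # evaluate y(alpha^j) term by term
    print(f"S{j} = alpha^{log[s]}")          # alpha^6, alpha^12, alpha^7, alpha^9
```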
The block error probability refers to the probability of a decoding error (i.e. the
decoder picks the wrong codeword when applying the Hamming distance rule). For
a t-error correcting code of length n, this probability is given as follows:
$$P_{ECC} = 1 - \sum_{i=0}^{t}\binom{n}{i} P_e^{\,i}\,(1 - P_e)^{n-i} \qquad (4.4)$$
where Pe is the expected bit error rate (as explained in Sect. 4.3).
This formula can be used to determine the length of the error correction code (n) to be used, given the bit error rate of the PUF response and the desirable decoding error (P_ECC); the latter is typically in the order of 10⁻⁶ (see Chap. 6 for a detailed example on how to use the above formula to select an appropriate error correction code given a specific bit error rate).
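In practice, Eq. (4.4) is swept over t until the target decoding error is reached. A hedged Python sketch follows; the code length n = 255 and the bit error rate Pe = 0.05 are illustrative assumptions, while the 10⁻⁶ target comes from the text above:

```python
from math import comb

def block_error_probability(n, t, pe):
    """Eq. (4.4): probability that more than t of the n bits are in error."""
    return 1.0 - sum(comb(n, i) * pe**i * (1 - pe)**(n - i)
                     for i in range(t + 1))

n, pe, target = 255, 0.05, 1e-6        # assumed code length and PUF error rate
t = 0
while block_error_probability(n, t, pe) > target:
    t += 1                             # smallest t meeting the target
print(f"n = {n}, Pe = {pe}: need t = {t} "
      f"(P_ECC = {block_error_probability(n, t, pe):.2e})")
```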
The aim of this flow is to reduce the bit error rate of the PUF responses below a
specific level set by the intended application. For example, cryptographic key
generation schemes require precisely reproducible responses, whereas authentica-
tion protocols may be more tolerant to noise-induced errors. Figure 4.4 shows a
generic design flow of reliable PUF circuits.
The first stage is an optional pre-processing step in which the PUF chip is purposely aged in order to limit progressive changes in its output in the field [10, 11]. In the second stage, the least reliable bits of the PUF response are identified and discarded; this stage is also not mandatory, but it can help reduce the hardware cost of the third stage. The latter reduces the error rate to an acceptable level using generic fault tolerance schemes, such as temporal redundancy (e.g. using temporal majority voters), information redundancy (error correction codes) or hardware redundancy. The response obtained from the third stage does not typically have acceptable entropy from a security perspective because of error correction and the inherent biases of the PUF architecture. Therefore, a fourth stage may sometimes be required to increase the entropy of the PUF responses [12]. Examples of privacy amplifiers include hash functions and encryption cores. We will now give an in-depth insight into the existing approaches applied at the first three stages of the flow in Fig. 4.4.
The essence of these methods is to exploit the IC aging phenomenon (see Chap. 3 for more details) to reinforce desired PUF responses by permanently altering the electrical characteristics of the PUF circuits; such reinforcement can help improve the stability of the PUF responses and reduce their expected bit error rate. There are a number of aging mechanisms that can be exploited for this purpose, including bias temperature instability (BTI) and hot carrier injection (HCI).
We are going to use an SRAM PUF as an example to explain how this technique can be applied. In an SRAM cell (see Fig. 4.5), the pull-up transistors P1 and P2 suffer from NBTI, while the pull-down transistors (N1, N2) are affected by PBTI. The P1 and P2 transistors are part of a pair of cross-coupled inverters, which means only one of them is under NBTI stress at any moment in time.
Such asymmetric stress conditions result in unbalanced threshold voltage degradation of these two transistors. For example, if P1 is placed under greater stress than P2, its threshold voltage degradation will be bigger compared to P2, which will reduce its driving capability; consequently, node Q is less likely to be pulled up to a logic 1 once the cell is powered up. This effect can be exploited to reinforce a preferred value in the SRAM cell, since the polarity of the resolved bit is a strong function of the relative threshold voltages of the devices. This can be explained as follows: each SRAM cell has a preferred initial state that it assumes when powered up, wherein the PMOS device with the lower threshold voltage is turned on. If we store the opposite of this preferred value, the PMOS device with the higher Vth is turned on; this means its driving capability gets weaker and its Vth increases further due to NBTI stress, so the threshold voltage difference between the two PMOS devices increases. This effectively means the SRAM cell will be more likely to assume the same response every time it is powered up, which makes the SRAM PUF more reliable.
An example of such an analysis is [13], wherein the authors employed the static noise margin to quantify the impact of BTI on SRAM.
Aging acceleration is not only applicable to SRAM PUFs; in fact, the same technique can be used to enhance the reliability of other PUF designs, such as delay-based architectures (e.g. the ring oscillator PUF). This is because the rate of BTI degradation is typically larger for fresh circuits; as a circuit ages, its degradation slows down [14, 15], which means the response of a PUF circuit is unlikely to change after it has been aged. Therefore, one can deduce the following generic procedure for this approach:
The use of NBTI aging acceleration has been estimated to achieve a sizable reduction in the expected bit error rate of PUFs; the authors of [16] reported a 40% improvement in the reliability of an SRAM PUF using this approach. In a more recent study [11], Satpathy et al. reported a 22% decrease in the BER of an SRAM PUF as a result of directly accelerating BTI aging to reinforce the pre-existing bias in the PUF cells and injecting a bias into the clock delay-path by aging the buffers and the pre-charge transistors.
Despite its clear benefits, the BTI aging acceleration approach has a number of drawbacks. First of all, high temperature-based acceleration cannot be applied selectively; therefore, other circuits on the chip would also age. More importantly, BTI degradation is not permanent, as aging effects are partly reversed if the device is placed in a recovery state [14]. There are a number of methods to overcome such disadvantages. For example, the use of voltage-based aging acceleration can help avoid aging the whole chip by applying high voltage pulses selectively to the PUF circuit. In addition, increasing the stress time can help achieve a permanent shift in the threshold voltage [17].
HCI is another aging mechanism, wherein traps are formed in the gate oxide of a CMOS device. These traps are generated when channel carriers are accelerated to gain a sufficient energy level to allow them to penetrate into the oxide. HCI mainly affects NMOS devices, as their charge carriers (i.e. electrons) have a higher mobility than those of PMOS transistors (holes). Unlike BTI aging, HCI-induced degradation is more permanent [18]; therefore, it may produce more stable PUF responses. An example of the use of this approach is reported in [10], wherein the authors incorporated additional control circuitry into the PUF architecture, which is capable of inducing HCI stress conditions in the PUF device without affecting the surrounding circuitry. This is achieved by applying a series of short high voltage pulses to selected transistors, which leads to an increase in the flow of high energy carriers through the CMOS channels; some of these will form traps in the gate oxide of the associated device and cause a permanent HCI-induced threshold voltage shift. The authors reported a 20% decrease in the expected bit error rate of the PUF response; this comes at the expense of a tenfold increase in the area of the PUF device due to the additional control circuitry. The use of HCI to improve the reliability of PUF responses is an interesting concept but incurs significant area overheads; therefore, more work is still needed to establish its efficacy in practice.
The essence of this technique is to discard the least reliable bits in the PUF responses, which reduces the need for expensive error correction circuitry. There are many ways to implement such a technique; the choice primarily depends on the specific features of the PUF architecture and the type of errors to be reduced [12]. In general, this approach consists of two stages: first, the bit error rate associated with each response bit is estimated (i.e. Pe(ri)); then a subset of the total response bits is chosen such that their respective error rates are below a certain threshold value. A conceptual diagram of this method is shown in Fig. 4.6, where {r1, …, rn} represent the original output bits of the PUF device and {rq, …, rm} represent the selected stable bits.
We are going to discuss two examples of bit selection schemes, namely index bit masking (IBS) and stable-PUF-marking (SPM).
This approach was first proposed in [12] to reduce the impact of environmental noise (e.g. temperature and power supply variations) on ring oscillator PUFs. To explain this method, let us consider the generic structure of an RO PUF depicted in Fig. 4.7.
[Fig. 4.6: conceptual diagram of stable bits selection. The selection block (SEL) receives each PUF response bit ri together with its estimated error rate Pe(ri) for a given challenge and outputs the subset of stable bits.]
Chain              R1     R2     R3     R4
Frequency (MHz)    100    106    97     110
Solution:
1. The total number of challenge/response pairs is

$$CRP = \frac{k(k-1)}{2} = \frac{4(4-1)}{2} = 6$$
2. When IBS is applied, only ring oscillator pairs whose respective frequencies differ by more than 5 MHz can be compared; this ensures no error will occur even in the worst-case scenario, i.e. when the frequencies of two oscillators shift in opposite directions.
This gives us a set of four allowed pairs: {(R1,R2), (R1,R4), (R2,R3), (R3,R4)}.
The forbidden pairs are {(R1,R3), (R2,R4)}, which have frequency differences of 3 and 4 MHz, respectively.
Therefore, in this case, the number of challenge/response pairs is reduced to 4.
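The preselection rule is simple to express in code. The following is a hypothetical sketch of our own (the helper name and the assumed worst-case noise shift of ±2.5 MHz, which yields the 5 MHz threshold used above, are illustrative assumptions):

```python
from itertools import combinations

def select_pairs(freqs_mhz, max_noise_shift_mhz):
    """Keep only RO pairs whose frequency gap survives worst-case
    opposite-direction noise shifts of the two oscillators."""
    threshold = 2 * max_noise_shift_mhz
    return [(a, b) for a, b in combinations(sorted(freqs_mhz), 2)
            if abs(freqs_mhz[a] - freqs_mhz[b]) > threshold]

freqs = {"R1": 100, "R2": 106, "R3": 97, "R4": 110}
print(select_pairs(freqs, 2.5))
# -> [('R1', 'R2'), ('R1', 'R4'), ('R2', 'R3'), ('R3', 'R4')]
```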
This preselection can be performed on-chip using additional control logic. The authors of [12] presented a soft decision-based implementation of the selection block, where indices are used to select the bits that are less likely to be noisy. Such an approach does incur extra area overhead, and it is important that this increase in silicon costs does not exceed any potential area savings obtained by using a lighter error correction code.
4.7.2 Stable-PUF-Marking
This is another stable bits selection scheme, which was proposed for SRAM PUFs in [20]. This approach consists of two stages: the purpose of the first stage is to identify the most reliable cells in an SRAM PUF; in the second stage, these cells are marked as stable, and the responses of the remaining unreliable cells are discarded. This process takes place at the post-fabrication stage, where each PUF chip is characterised and marked.
The challenge for this approach is how to identify stable cells. The most intuitive approach is to perform multiple readings at different environmental conditions and choose the bits that are most stable; however, this requires extensive chip testing, which may not be feasible given the large volume of devices and the strict time constraints in industrial chip production facilities. The authors in [20] proposed an alternative approach to reduce the number of tests required to identify stable cells; their solution exploits the cross-coupled transistor architecture of the SRAM cell. When the latter is powered up, the cell will assume a value of "0" or "1" determined by the inherent difference in the driving capabilities of the cross-coupled transistors (see Fig. 4.5 for an SRAM cell diagram). Crucially, the larger the mismatch between these transistors, the faster the cell resolves to a stable value. Therefore, if one can measure the time each cell takes to resolve to a stable state, one can use this as an accurate indicator of the degree of mismatch. This fact can be exploited to determine the cells whose mismatch is larger than a certain threshold; indeed, the authors of [20] demonstrated that stable cells can be identified with only two readings per SRAM cell, which significantly reduces the overhead incurred by time-consuming exhaustive testing.
Although pre-processing techniques and reliable bit selection schemes can help improve the stability of PUF responses, a post-processing stage is typically needed to ensure the device operates correctly when deployed in the field. There are a number of approaches for the generation of reliable responses, which are based on conventional fault tolerance schemes: information redundancy, hardware redundancy and temporal redundancy. In the first approach, a designer would need to embed additional information into the device, normally referred to as "helper data", in order to help correct potential noise-induced bit flips. In the second approach, the PUF design area is extended to improve resiliency to errors. In the third approach, some form of majority voting scheme is employed. Examples of these techniques are explained in the following subsections.
A secure sketch scheme is a process whereby a noisy version of a PUF response (r′) can be recovered to its original value (r) using a helper data vector (h). It consists of two randomised procedures, "Sketch" for helper data generation and "Recover" for response reconstruction, and it is based on the use of error correction codes [21].
[Figure: enrolment stage of the secure sketch scheme. For each challenge, the helper data r + c is computed and placed in the helper data storage.]
The first stage (enrolment) proceeds as follows:
1. Identify a suitable error correction code C(n, k, t), such that n is equivalent to the length of the PUF response r.
2. For each PUF device, generate a set of challenge/response pairs.
3. For each response, add a random codeword c ∈ C to generate a helper data vector unique to each challenge/response pair: h = r + c.
4. Store the helper data vectors in the helper data storage.
The second stage reconstructs the PUF responses using helper data vectors as
follows:
1. The PUF generates a response r′, which normally carries an additive error: r′ = r + e′.
2. The helper data is then XORed with the erroneous response to generate:

$$h + r' = (r + c) + (r + e') = c + e'$$

3. If the Hamming distance between r and r′ (wtH(e′) = dH(r, r′)) is within the error correction capabilities of the code C(n, k, t), it will be possible to reproduce the original PUF response (r) by first decoding (c + e′) and then XORing the result with the helper data (h).
This is another response reproduction method based on the use of error correction codes [21]. It also consists of two phases: an enrolment phase followed by a response reconstruction stage. The enrolment phase takes place before field deployment of the PUF devices; for each response r, the syndrome s = r·Hᵀ is computed and stored as helper data, where H is the parity-check matrix of the chosen code C(n, k, t).
The second stage reconstructs the PUF responses using the helper data vectors as follows:
1. The PUF generates a noisy response r′ = c + e′ = r + e + e′, where r = c + e is the enrolled response and c its closest codeword.
2. The syndrome of the noisy response is computed:

$$s' = r'H^T = (c + e')H^T = e'H^T$$

3. The corresponding helper data vector s = e·Hᵀ is retrieved from the helper data storage.
4. Add the generated syndrome to the corresponding helper data to generate a new syndrome vector: s″ = s′ + s = e′·Hᵀ + e·Hᵀ = (e + e′)·Hᵀ.
5. If the Hamming distance between r and r′ (wtH(e + e′) = dH(r, r′)) is within the error correction capabilities of the code C(n, k, t), i.e. wtH(e + e′) ≤ t, it will be possible to reproduce the original PUF response (r) by identifying the error vector (e + e′) first and then XORing the result with the response r′.
[Figure: syndrome-based response reconstruction. The PUF response r′ is passed through the syndrome computation block (Hᵀ); the resulting syndrome e′·Hᵀ is combined with the stored helper data from the helper data storage, and the ECC block recovers the response r.]
This is another reliable response reproduction method, which is based on the use of time redundancy techniques wherein an odd number of responses are generated for the same challenge at different time intervals (t1, t2, …, tq); these are initially stored in an on-chip memory. After a sufficient number of responses are collected, they are applied to a self-checking majority voter to choose the most reliable response, as depicted in Fig. 4.10 [11, 22]. As can be seen from the circuit diagram, this technique requires additional hardware for response storage and control circuitry. This technique helps reduce errors caused by transient failure mechanisms such as temporal variations in the voltage supply or radiation hits; however, errors caused by permanent physical failure mechanisms such as aging cannot be corrected, therefore error correction may still be necessary. Equation (4.5) computes the error rate Pe(out) at the output of the voter given an error rate Pe(in) at the input:
$$P_e(out) = 1 - \mathrm{Bin}_q\!\left(\frac{q-1}{2},\ P_e(in)\right) \qquad (4.5)$$

where Bin_q is the cumulative distribution function of the binomial distribution with q trials.
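Equation (4.5) simply states that the voter output is wrong when more than (q − 1)/2 of the q samples are wrong. A hedged sketch of our own (the input error rate of 9% is an illustrative assumption, matching the figure used in Problem 4:11):

```python
from math import comb

def tmv_output_error(q, pe_in):
    """Eq. (4.5): error rate at the output of a q-input majority voter."""
    cdf = sum(comb(q, i) * pe_in**i * (1 - pe_in)**(q - i)
              for i in range((q - 1) // 2 + 1))
    return 1.0 - cdf

for q in (3, 5, 7, 9):                 # odd numbers of samples
    print(q, f"{tmv_output_error(q, 0.09):.5f}")
```

As expected, the output error rate drops rapidly as the number of voted samples q grows.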
[Fig. 4.10: temporal majority voting. The PUF responses r(t1), …, r(tq) to the same challenge are collected and applied to a majority voter (MV), which outputs a stable response.]
The essence of this approach is to include extra circuitry in the PUF architecture to improve its resilience against noise. One example of such an approach was proposed in [1], wherein the design of the RO PUF was considered. The authors proposed a scheme wherein each inverter chain is replaced by a configurable structure which allows for multiple paths through the chain; for example, if we have two configuration bits, we can obtain four possible chains, as depicted in Fig. 4.11. The frequencies of the configurable chains are then compared, and the configuration that produces the largest difference for each respective pair is stored as part of the PUF challenge. This ensures that the maximum possible frequency gap is obtained between each two inverter chains, which makes the PUF response more stable.
The costs of reliability enhancement techniques fall into two categories: those incurred by the extra processing steps at the post-fabrication stage and those associated with the extra circuitry needed for reliable response generation. The latter is the more critical component in resource-constrained devices such as RFID tags.
The cost of error correction-based schemes is proportional to the number of errors to be corrected. We are going to consider BCH codes as an example of such a trend, as these codes are among the most widely used schemes for PUF cryptographic key generation [23]. The area overhead of BCH decoding blocks with different error correction capabilities is summarised in Fig. 4.12, based on the area estimation reported in [24] using a 65 nm CMOS technology.
The trend shown in Fig. 4.12 indicates that the cost of on-chip reliable response generation increases significantly with the number of correctable errors (t). The data show that the relationship between (t) and the area cost is linear, which allows extrapolation of the area costs for t > 7.
Fig. 4.12 The area costs of the BCH decoders (n: the code length, t: correctable errors)
These requirements were calculated for the n = 256 BCH code needed in each case to reduce the error rate to zero; for example, if the raw PUF output is used, a BCH ECC capable of correcting 22 errors will be needed to ensure the reliable responses needed for cryptographic key generation.
The results in Table 4.3 show that the error correction requirement can be decreased by more than a factor of two using pre-processing approaches. This can be further reduced by more than a factor of four if on-chip temporal majority voting is employed. This decrease in (t) implies an equivalent reduction in the area associated with the BCH decoders (see Fig. 4.12).
Although the use of pre-processing approaches may incur additional costs at design and fabrication time, the analysis above shows that these techniques are very effective in making the use of PUF technology a viable option for resource-constrained systems.
The area and power costs of error correction schemes can be further decreased by using codes in parallel if the response of the PUF is long. This is achieved by dividing the response into non-overlapping sections and decoding each part separately. This partial coding approach has been adopted in [26] for low-cost reliable PUF response generation; it has also been proved cost-effective for other area-constrained reliable systems [27].
4.10 Conclusions
4.11 Problems
4:2. What is the basis of Cn = {00…0n, 11…1n} and what is the rate of this code?
Assuming n = 7, how many errors can this code correct?
4:3. You are given the irreducible polynomial r ð xÞ ¼ x3 þ x þ 1 defined over Z2
(a) Construct a Galois field GF (23) and find its elements using r ð xÞ.
(b) Is r(x) primitive and why?
4:4. A cyclic code C defined over a ring R6 = Z[X]/(X⁶ + 1) has the generator polynomial g(x) = 1 + x + x²:

r1(x) = x³ + x⁴ + x⁵
r2(x) = 1 + x⁴
4:5. Construct a syndrome computing circuit for a cyclic code with g(x) = 1 + x + x³. Then, using your circuit, verify whether or not the following received word is a valid codeword: r(x) = 1 + x² + x⁵ + x⁶.
4:6. Construct a Hamming code with length n = 3.
4:7. Use the primitive polynomial 1 + x + x⁴ to construct a parity-check matrix for the binary Hamming code H15.
4:8. Calculate the minimum number of raw PUF response bits needed to generate a 128-bit stable response using a code-offset scheme based on a BCH decoder. It is assumed that the estimated error rate of the PUF raw output is 8% and the decoding error of the BCH scheme should not exceed 10⁻⁶.
4:9. A code-offset scheme, based on the Hamming code C(n, k, t) = (15, 11, 1) from Example 4.7, is used for reliable response generation for a PUF whose response length is 15 bits and maximum bit error rate is 5%.
Table 4.5 List of reliable response generation schemes with their associated area overheads

Technique type              Area (µm²)    Energy dissipation (pJ)    Bit error rate reduction
Code-offset scheme          15,130        13.2                       6
Syndrome coding             16,320        15.1                       7
Temporal majority voting    9750          6.4                        5
Hardware redundancy         1120          4.5                        3
(a) Explain whether or not the above reliable response generation scheme is
suitable for cryptographic key generation.
(b) Compute a helper data vector for the following PUF response r =
(000110111011101).
4:10. Consider a ring oscillator PUF that has five inverter chains whose respective frequencies are shown in Table 4.4.
Assume that the maximum variation in frequency due to noise is ±4 MHz.
(a) Calculate the total number of challenge/response pairs for this PUF.
(b) The IBS approach is applied to this PUF such that the number of bit flips caused by noise is reduced to zero. Recompute the number of challenge/response pairs for this reliability-enhanced PUF.
4:11. Table 4.5 includes a list of reliable response generation schemes with their associated area overheads, power consumption and the reduction they can achieve in the bit error rate of a PUF response.
It is assumed that the expected bit error rate of the PUF output is 9%. Study the specifications of the following two designs carefully and propose a technique or a combination of techniques for reliable PUF response generation in each case.
References
1. A. Maiti, P. Schaumont, The impact of aging on a physical unclonable function. IEEE Trans.
Very Large Scale Integr. VLSI Syst. 22, 1854–1864 (2014)
2. M.S. Mispan, B. Halak, M. Zwolinski, NBTI aging evaluation of PUF-based differential
architectures, in 2016 IEEE 22nd International Symposium on On-Line Testing and Robust
System Design (IOLTS) (2016), pp. 103–108
3. D. Ganta, L. Nazhandali, Study of IC aging on ring oscillator physical unclonable functions,
in Fifteenth International Symposium on Quality Electronic Design (2014), pp. 461–466
4. C.R. Chaudhuri, F. Amsaad, M. Niamat, Impact of temporal variations on the performance
and reliability of configurable ring oscillator PUF, in 2016 IEEE National Aerospace and
Electronics Conference (NAECON) and Ohio Innovation Summit (OIS) (2016), pp. 458–463
5. B. Halak, A. Yakovlev, Statistical analysis of crosstalk-induced errors for on-chip
interconnects. IET Comput. Digital Tech. 5, 104–112 (2011)
6. G.I. Zebrev, A.M. Galimov, Compact modeling and simulation of heavy ion induced soft
error rate in space environment: principles and validation. IEEE Trans. Nucl. Sci. 1–1 (2017)
7. V. Vargas, P. Ramos, V. Ray, C. Jalier, R. Stevens, B.D.D. Dinechin et al., Radiation
experiments on a 28 nm single-chip many-core processor and SEU error-rate prediction. IEEE
Trans. Nucl. Sci. 64, 483–490 (2017)
8. C. Yanni, K.K. Parhi, Small area parallel Chien search architectures for long BCH codes.
IEEE Trans. Very Large Scale Integr. VLSI Syst. 12, 545–549 (2004)
9. S.Y. Wong, C. Chen, Q.M.J. Wu, Low power Chien search for BCH decoder using RT-level
power management. IEEE Trans. Very Large Scale Integr. VLSI Syst. 19, 338–341 (2011)
10. M. Bhargava, K. Mai, A high reliability PUF using hot carrier injection based response
reinforcement, in presented at the Proceedings of the 15th International Conference on
Cryptographic Hardware and Embedded Systems, Santa Barbara, CA (2013)
11. S. Satpathy, S. Mathew, V. Suresh, R. Krishnamurthy, Ultra-low energy security circuits for
IoT applications, in 2016 IEEE 34th International Conference on Computer Design (ICCD)
(2016), pp. 682–685
12. M.D. Yu, S. Devadas, Secure and robust error correction for physical unclonable functions.
IEEE Des. Test Comput. 27, 48–65 (2010)
13. M. Cortez, A. Dargar, S. Hamdioui, G.J. Schrijen, Modeling SRAM start-up behavior for
physical unclonable functions, in 2012 IEEE International Symposium on Defect and Fault
Tolerance in VLSI and Nanotechnology Systems (DFT) (2012), pp. 1–6
14. B.C. Paul, K. Kunhyuk, H. Kufluoglu, M.A. Alam, K. Roy, Impact of NBTI on the temporal
performance degradation of digital circuits. IEEE Electron Device Lett. 26, 560–562 (2005)
15. H.K.M.A. Alam, D. Varghese, S. Mahapatra, A comprehensive model for PMOS NBTI
degradation: recent progress. Microelectron. Reliab. 47, 853–862 (2007)
16. M. Bhargava, C. Cakir, K. Mai, Reliability enhancement of bi-stable PUFs in 65 nm bulk
CMOS, in 2012 IEEE International Symposium on Hardware-Oriented Security and Trust
(2012), pp. 25–30
17. G. Pobegen, T. Aichinger, M. Nelhiebel, T. Grasser, Understanding temperature acceleration
for NBTI, in 2011 International Electron Devices Meeting (2011), pp. 27.3.1–27.3.4
18. X. Li, J. Qin, J.B. Bernstein, Compact modeling of MOSFET wearout mechanisms for
circuit-reliability simulation. IEEE Trans. Device Mater. Reliab. 8, 98–121 (2008)
19. G.E. Suh, S. Devadas, Physical unclonable functions for device authentication and secret key
generation, in 2007 44th ACM/IEEE Design Automation Conference (2007), pp. 9–14
20. M. Hofer, C. Boehm, An alternative to error correction for SRAM-like PUFs, in Proceedings
of Cryptographic Hardware and Embedded Systems, CHES 2010: 12th International
Workshop, Santa Barbara, USA, 17–20 August 2010, ed. by S. Mangard, F.-X. Standaert
(Springer, Berlin, 2010), pp. 335–350
21. Y. Dodis, L. Reyzin, A. Smith, Fuzzy extractors: how to generate strong keys from biometrics
and other noisy data, in Proceedings of Advances in Cryptology - EUROCRYPT 2004:
5.1 Introduction
Let us take symmetric (shared key) ciphers as an example; in this case, there are a number of known attacks against these primitives, including but not limited to brute force key search, differential fault analysis and side channel attacks. In addition, a symmetric encryption scheme has a number of quality metrics that can be assessed, such as the size of its key space and its avalanche effect. The latter is a measure of the average number of output bits that change when one input bit changes.
The advanced encryption standard (AES) [5] is considered to be one of the most secure symmetric encryption algorithms due to its proven resilience to all known attacks. It also has a number of desirable security properties, such as a strong avalanche effect, which means a significant number of its ciphertext output bits change if one bit changes in its plaintext input or key; this makes it very hard to find a statistical correlation between the ciphertext and the plaintext. The AES algorithm also has a massive key space, which makes it infeasible to find the encryption key using brute force search.
The same evaluation approach can also be applied to assess the security of PUF designs; this means one needs to have an understanding of all known attacks on PUFs and to develop a number of metrics and benchmarks which can be used to evaluate and test the security-related features of a PUF design.
This chapter aims to:
It is hoped that this chapter will give the reader the necessary theoretical
background and skills to understand PUF security attacks, evaluate the suitability of
a PUF design with respect to these threats and develop appropriate
countermeasures.
The organisation of this chapter is as follows: Sect. 5.3 discusses the design qualities one should consider when evaluating the security of a PUF design. In Sect. 5.4, we look in detail at the metrics that quantify these qualities. Section 5.5 classifies the types of adversaries commonly encountered in the context of PUF security. Sections 5.6, 5.7 and 5.8 explain the principles of machine learning algorithms and how these can be employed to wage mathematical cloning attacks; this is followed by a detailed discussion of suitable countermeasures in Sect. 5.9. Sections 5.10 and 5.11 discuss existing side channel attacks and their related countermeasures, respectively. Section 5.12 explains the principles of physical attacks, followed by a discussion of possible defence mechanisms in Sect. 5.13. A comparative analysis of existing PUF attacks and associated countermeasures is presented in Sect. 5.14; this is followed by conclusions and lessons learned in Sect. 5.15. A list of problems and exercises is provided in Sect. 5.16.
5.3 What Is a Secure PUF?
Since its conception more than a decade ago [6], the PUF technology has seen a
dramatic increase in the number of new implementations as we have seen in
Chap. 2.
The question is how to choose a suitable PUF for a particular application; in other words, what criteria should a designer be looking for in a PUF when constructing a secure system?
In principle, there are four fundamental properties which allow for an objective comparison of the security of different PUF designs, namely:
• Randomness ensures the balance between '0's and '1's in the PUF responses; this makes it harder for an adversary to deduce information about the PUF by observing its output.
• Physical Unclonability ensures that an adversary is not able to create another
physical instance of a PUF which has identical challenge/response behaviour.
• Unpredictability ensures that an adversary is not able to predict the PUF
response to a new challenge, even if he had access to a large set of previous
challenge/response pairs. This property can also be referred to as Mathematical
Unclonability.
• Uniqueness ensures that each PUF instance has a unique behaviour, such that an
adversary, who manages to wage a successful attack on a PUF instance, will not
be able to predict the behaviour of other instances of the same PUF implemented
on different devices.
The relative importance of these qualities may differ from one application to another. For example, in key generation schemes, Uniqueness is the most important property, in order to ensure that different devices have distinct derived keys, whereas in authentication schemes, Unpredictability is more significant, in order to prevent the construction of a behavioural model by an adversary snooping on the communication between a PUF and an authenticating server. On the other hand, the use of a PUF as a true random number generator (TRNG) requires a design that has the maximum possible randomness.
In addition, a good understanding of the intended application is also important to help identify the most likely security threats. For example, in wireless sensor networks, devices may be deployed in the field without any physical protection; therefore, reverse engineering and side channel analysis are very likely risks. This may not be true for large data servers, which are normally located in well-protected locations.
From the above discussion, we can conclude that in order to answer the question 'how secure is a PUF', we first need to understand its intended application; this can help to identify the most important design qualities to consider and the most likely threat model. In addition, we need to evaluate its robustness against known attacks. Figure 5.1 depicts a visual illustration of these three elements.
5.4 Security Evaluation Metrics for PUF

This section gives formal definitions of the security properties discussed above and presents the metrics which can be used to quantify these qualities.
5.4.1 Notation
We will use the following notation in the rest of this chapter. A PUF implementation is referred to using the letter I. Challenges and responses are referred to using the letters C and R, respectively. Each PUF is considered to have n inputs and m outputs. Adversaries are referred to using the letter A.
5.4.2 Randomness
Consider a PUF device I which has a set of valid challenges CV and their corre-
sponding valid responses RV .
The PUF I is said to have a random response if the probability of an output ‘1’
is the same as that of an output ‘0’.
(a) Uniformity
Uniformity estimates the percentage of '1's and '0's in a PUF response [7]; it can be computed as follows:

$$\text{Uniformity} = \frac{1}{m\,|R_v|}\sum_{r_i \in R_v} HW(r_i) \times 100\% \qquad (5.1)$$

where |Rv| and m are the total number of responses and their bit length, respectively, and HW(ri) is the Hamming weight of the ith response.
A truly random variable will have an equal percentage of '1's and '0's, therefore a uniformity of 50%.
(c) Entropy
A more generic approach is the use of entropy as defined by Shannon [10, 11]; it is commonly used in cryptographic applications to quantify how unpredictable a given discrete random variable is.
Let X(r) be a random binary variable representing all responses. The entropy H(X) is computed as follows:

$$H(X) \triangleq -\sum_{r \in X} p(r)\log_2 p(r) \qquad (5.2)$$

where r ∈ {0, 1} and p(r) is the probability of a response r.
In this context, one can use the min-entropy, introduced by Renyi [12], which is a more pessimistic notion than Shannon entropy; it is given as follows:

$$H_{\infty}(X) = -\log_2\left(\max_{r} p(r)\right) \qquad (5.3)$$
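These randomness metrics are straightforward to compute from measured data. The sketch below is our own illustration (the data layout, a list of bit-list responses, and the toy values are assumptions):

```python
from collections import Counter
from math import log2

def uniformity(responses):
    """Eq. (5.1): percentage of '1's across all response bits."""
    total_bits = sum(len(r) for r in responses)
    return 100.0 * sum(sum(r) for r in responses) / total_bits

def entropy(responses):
    """Eq. (5.2): Shannon entropy of the response bit distribution."""
    counts = Counter(bit for r in responses for bit in r)
    n = sum(counts.values())
    return -sum((c / n) * log2(c / n) for c in counts.values())

def min_entropy(responses):
    """Eq. (5.3): -log2 of the probability of the most likely outcome."""
    counts = Counter(bit for r in responses for bit in r)
    return -log2(max(counts.values()) / sum(counts.values()))

responses = [[0, 1, 1, 0], [1, 0, 0, 0], [0, 1, 0, 1]]   # toy data
print(uniformity(responses), entropy(responses), min_entropy(responses))
```

An ideal PUF would score 50% uniformity and 1 bit of entropy per response bit; the min-entropy is always less than or equal to the Shannon entropy.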
5.4.3 Physical Unclonability
Consider a PUF device I which has a set of valid challenges CV and their corresponding valid responses RV.
Let IC be a physical clone of I, created by an adversary A. Assume IC can
generate a set of responses RC for the same set of challenges CV .
The PUF I is said to be physically unclonable if the average Hamming distance
between the elements of RV and their corresponding counterparts in RC is signif-
icantly larger than the average intra-Hamming distance between the responses to
the same challenge HDINTRA , taken at different environmental/noise conditions.
This can be mathematically expressed using the following equation:
$$\frac{1}{|C_v|}\sum_{c \in C_v} d(rc, rv) \gg HD_{INTRA} \quad \forall \{A\} \qquad (5.4)$$
where rv and rc are the responses generated by the device and its clone, respectively, for the same challenge c, and |Cv| is the total number of valid challenges.
In other words, the responses generated by a clone should be easily distinguishable from those created by an authentic device, so that the difference between the two cannot be mistakenly attributed to the errors caused by noise or environmental variations.
By definition, all PUFs have to be physically unclonable, but recent advances in
reverse engineering techniques have made physical cloning a possibility in some
cases [3, 13].
There is no single metric that can be used to evaluate how physically unclonable
a PUF is, but Eq. (5.4) can be used to examine how successful a physical cloning is.
5.4.4 Unpredictability
Consider a PUF device I that has a set of valid challenges CV and their corresponding valid responses RV. Consider an adversary A who has access to a large number of challenges CA and their associated responses RA.
Let Cp denote a set of valid challenges whose corresponding responses are
unknown to Adversary A.
Let rp denote the response predicted by Adversary A to a challenge (c 2 Cp )
based on his previous knowledge, i.e. {CA , RA }, and rv denotes the valid response
of the same challenge.
The PUF I is said to be unpredictable if the average Hamming distance between rp
and rv is significantly larger than the average intra-Hamming distance between the
responses to the same challenge HDINTRA , taken at different environmental/noise
conditions.
$$\frac{1}{|C_p|}\sum_{c \in C_p} d(rp, rv) \gg HD_{INTRA} \quad \forall \{A\} \qquad (5.5)$$
This means the average bit error in the predicted responses is significantly larger
than that caused by noise, in other words, it is not possible for any adversary to
construct an accurate mathematical model of the PUF even if he has access to large
number of challenge/response pairs.
There are a number of approaches to evaluate the Unpredictability as follows:
The above equations quantify the average and minimum number of bits of rp
that cannot be predicted by an adversary even if he has knowledge of ra.
The use of the conditional entropy metric as described above may not be feasible in practice due to the large size of the challenge/response space. In addition, the computation of the conditional probability p(rp|ra) is not straightforward; this is because both rp and ra are generated from the same PUF, so they cannot be considered independent. The dependency between rp and ra can only be established if detailed knowledge of the physical implementation is available and some simplifying assumptions are made. Examples of the use of this metric can be found in [14, 15].
In order to evaluate the Unpredictability of a PUF design, one can evaluate how 'learnable' the statistical relationship between the challenges and the responses is using machine learning algorithms [16]. To do this, one has to obtain a large number of challenge/response pairs; some of these data are used in the 'training'
stage to construct a software model of the PUF. The latter is then verified in the
‘validation’ stage using the remainder of the challenge/response pairs.
Once a model is created, the prediction error is computed by comparing the
average Hamming distance between the predicted responses and those given by the
model for the same set of challenges.
$$\mu(HDT(t,e)) = \frac{1}{|C|}\sum_{c \in C} HDT(t,e,c) \qquad (5.9)$$

$$\frac{\sigma(HDT(t,e))}{\mu(HDT(t,e))} = \frac{\sqrt{\sum_{c \in C}\left(HDT(t,e,c) - \mu\right)^2 / (|C|-1)}}{\mu(HDT(t,e))} \qquad (5.10)$$

where |C| is the total number of challenges being considered for this test. The ideal values are 0.5 for the mean and 0 for the normalised standard deviation.
In most PUF designs, the responses to two challenges with a small Hamming distance have some dependencies. For example, in an SRAM PUF, wherein an address is used as a challenge, two closely related challenges may point to two memory cells in close physical proximity. The dimensions of these cells are therefore likely to be less affected by intra-die process variations; hence their respective transistor threshold voltages can be very similar, and consequently their corresponding start-up values (i.e. responses) can be identical, which means the HDT metric in this case is likely to be 0.
Another example wherein two challenges with a small Hamming distance have dependencies is the arbiter PUF. In this case, such challenges may result in highly similar choices of signal paths, which may lead to identical responses.
Therefore, in such cases, designers are likely to obtain a more realistic evaluation of the Unpredictability of the PUF if they use t = 1 to compute the HDT(t, e) metric.
5.4.5 Uniqueness
Consider a PUF device I that has a set of valid challenges CV and their corresponding valid responses RV.
A PUF is considered to be unique if it can be easily distinguished from other PUFs with the same structure implemented on different chips in the same technology.
This metric is evaluated using the 'inter-chip Hamming distance'. If two chips i and j (i ≠ j) have n-bit responses Ri(n) and Rj(n), respectively, for the challenge c, the average inter-chip HD among k chips is defined as [7]

$$HD_{INTER} = \frac{2}{k(k-1)}\sum_{i=1}^{k-1}\sum_{j=i+1}^{k}\frac{HD\!\left(R_i(n), R_j(n)\right)}{n} \times 100\% \qquad (5.11)$$
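Equation (5.11) is a simple pairwise average and can be sketched as follows (our own illustration; the toy responses are assumptions):

```python
from itertools import combinations

def hd_inter(responses):
    """Eq. (5.11): average pairwise inter-chip Hamming distance (percent)
    of the n-bit responses produced by k chips for the same challenge."""
    n = len(responses[0])
    pairs = list(combinations(responses, 2))
    return 100.0 * sum(sum(a ^ b for a, b in zip(ri, rj)) / n
                       for ri, rj in pairs) / len(pairs)

chips = [[0, 1, 1, 0, 1], [1, 1, 0, 0, 0], [0, 0, 1, 1, 1]]   # k = 3 chips
print(f"{hd_inter(chips):.1f}%")    # the ideal value is 50%
```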
This section presents an exemplar case study of how to apply the above metrics to
evaluate the security of a PUF design for a specific application.
Example 5.1 Table 5.1 shows the challenge/response pairs of a PUF design (n = 5,
m = 1).
Solution:

$$\text{Uniformity} = \frac{1}{k}\sum_{i=1}^{k} r_i \times 100\% = \frac{1}{32}\sum_{i=1}^{32}(0 + 0 + 1 + \cdots) \times 100\% = 43\%$$
(c) The uniformity of this PUF is 43%, which indicates its responses are more likely to be '0'; in addition, the entropy of the responses is smaller than that of a truly random bit stream, which is 1.
The results from (a) and (b) both indicate that this design does not produce a random output; hence it cannot be used as a random number generator.
(d) To evaluate HDT(t = 1, e), we need to identify, for each challenge cj, the related challenges Ci such that d(cj, ci) = 1, ∀ci ∈ Ci.
Then, we calculate the Hamming distance between each two respective responses (rj ⊕ ri).
The HDT(1, e) for each challenge cj is then computed as the average Hamming weight of (rj ⊕ ri), where ci ∈ Ci. The results are summarised in Table 5.2.
Based on the results above, we can now compute the mean and standard deviation of HDT(1, e) using Eqs. (5.9) and (5.10) discussed earlier; in this case:

$$\mu(HDT(1,e)) = \frac{1}{|C|}\sum_{c \in C} HDT(1,e,c) = 0.43$$

$$\frac{\sigma(HDT(t,e))}{\mu(HDT(t,e))} = \frac{0.387}{0.43} = 0.9$$
Although the mean is close to the ideal value of 0.5, the standard deviation is too large, which indicates that there are overall strong correlations between the responses of challenges with a Hamming distance of 1, especially in the case of the second, third and fifth challenges, where the response to a neighbouring challenge can be predicted with 100% accuracy.
(e) The results from step (d) show that there is a dependency between the
responses of closely related challenges. An adversary can exploit this to
predict responses to unknown challenges using previously recorded responses
of known challenges. Therefore, this design is not suitable for a device au-
thentication scheme. This is because it is possible for an attacker to guess an
authentic response to a presented challenge with high probability if he has
recorded responses to a closely related challenge, which makes it possible for
this adversary to authenticate fake devices.
(f) The uniqueness of this PUF is very close to the ideal value, which makes it a
suitable candidate for a key generation scheme, as it allows generating distinct
keys for different devices.
The notion of the security of cryptographic systems can only be discussed with respect to a particular adversary; therefore, it is important to understand the assumptions made about the capabilities of the attackers in terms of skills, physical access and/or technical knowledge of the design.
In the context of PUF technology, there are three main types of attackers, classified according to their assumed knowledge and capabilities, as follows.
In this case, the attacker is assumed to be able to gain knowledge of the layout and logic structure of the silicon chip incorporating the PUF. This adversary is assumed to be capable of all known semi-invasive and invasive attacks, including reverse engineering, fault injection and micro-probing.
This threat level is a likely possibility due to the multinational nature of chip production chains these days. In fact, the vast majority of silicon chips are fabricated and tested in untrusted factories and testing facilities, which makes it easier for adversaries to acquire information about the physical architecture and functionality of the designed chips [18].
Fig. 5.2 A generic procedure for modelling physically unclonable functions using machine learning algorithms:
(1) Obtain a large number of CRPs, then divide the data into a training set and a test set.
(2) Construct an internal, parametric model of the PUF.
(3) Run the machine-learning algorithm on the training data set to obtain a mathematical clone.
(4) Cross-validate the model using the test data set.
(5) If the prediction error is not acceptable, repeat from step 3; otherwise, finish.
The SVM algorithm represents each data item as a point in multidimensional space, with the value of each feature being the value of a particular coordinate. Then, classification is performed by finding the hyperplane that differentiates the two classes of data. Let us study the problem of classifying feature vectors in the case of a two-dimensional real space, as shown in Fig. 5.3.
The aim of the SVM method is to design a hyperplane that classifies the two data sets correctly. For the data shown in Fig. 5.3, this can be done by introducing a linear hyperplane as follows:

$$y = ax + b \qquad (5.12)$$
Let us assume w1 and w2 are the corresponding distances from this hyperplane to the closest point in each data set.
To increase the accuracy of the classification, the term 1/(w1 + w2) must be minimised; in addition, the number of misclassifications must be minimised. These two conditions are equivalent to minimising the following cost function:
where
M is the number of misclassified points.
i is a positive integer.
In this section, we are going to learn the basic working principles of SVM using
illustrative examples.
Example 5.2 In Fig. 5.4, there are two sets of data (diamonds and hearts); in addition, there are three possible lines which can be used to separate these sets. Study the data carefully and specify which of these lines is the best option.
In this case, we use a rule of thumb to identify the right hyperplane, which intuitively states: 'the better solution achieves better data segregation'. In this scenario, solution 'B' is certainly the best option, as it can achieve a clear separation between the two sets.
This rule of thumb helps to reduce the number of misclassifications.
Example 5.3 Figure 5.5 shows three possible separating lines; which of these is the best option?
In this case, applying the previous rule of thumb does not help, as all lines segregate the data classes perfectly. We therefore need a second rule of thumb, which states: 'the best solution maximises the distances between the nearest data points (from either class) and the separating line'.
In this case, line B has larger distances from the points of both data sets compared to both A and C; therefore it is the best option.
This rule of thumb allows for an extra margin of error, which can help reduce the number of misclassifications.
Example 5.4 Figure 5.6 shows two possible separating lines which segregate two sets of data; which of these is the best option?
SVM selects the solution that classifies the classes accurately over the one that maximises the minimum distances to the data points. Here, line A has a classification error and B classifies all points correctly; therefore, we choose B.
Example 5.5 How can the two classes of data in Fig. 5.7 be separated?
In this case, it is not possible to find a solution using a straight line, as one of the heart points is located in very close proximity to the diamonds. Such points are typically referred to as outliers. The latter are normally ignored by SVM during the learning process; otherwise, a model cannot be extracted. In this example, the solution is shown in Fig. 5.8.
5.7.3 Kernels
In the previous examples, we have seen that the data could be easily separated by a linear hyperplane; however, this is not always the case. This section introduces the concept of kernels and explains how they can help solve more complex classification problems.
Example 5.6 How can the two classes of data in Fig. 5.9 be separated?
SVM typically resolves this problem by introducing additional features. First, we note from Fig. 5.9 that the hearts are closer to the origin of the x and y axes than the diamonds are; based on this observation, we add a new feature $z = x^2 + y^2$. Figure 5.10 shows a plot of the data points using the x and z axes.
The addition of this new feature makes it very easy to separate the two classes of data. This is typically referred to as the kernel trick: a low-dimensional input space is transformed into a higher dimensional space using functions called kernels, which are useful in nonlinear separation problems. The separating line in the original input space looks like a circle (Fig. 5.11).
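To make the kernel trick concrete, the following minimal sketch (in Python with scikit-learn, used here only as an illustrative stand-in for the mathematical tools mentioned in the text) builds synthetic data in the spirit of Fig. 5.9 and shows that adding the feature $z = x^2 + y^2$ turns a nonlinearly separable problem into a linearly separable one. All names and data in the sketch are hypothetical.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic data in the spirit of Fig. 5.9: "hearts" near the origin,
# "diamonds" on a surrounding ring (radially separated classes).
rng = np.random.default_rng(0)
hearts = rng.normal(0, 0.5, size=(50, 2))                  # class 0, near origin
angles = rng.uniform(0, 2 * np.pi, 50)
diamonds = np.c_[3 * np.cos(angles), 3 * np.sin(angles)]   # class 1, on a ring
X = np.vstack([hearts, diamonds])
y = np.r_[np.zeros(50), np.ones(50)]

# A linear SVM struggles in the original (x, y) space ...
linear = SVC(kernel='linear').fit(X, y)
print('accuracy in (x, y) space:', linear.score(X, y))

# ... but adding the feature z = x^2 + y^2 makes the classes linearly separable.
z = (X ** 2).sum(axis=1, keepdims=True)
X_lifted = np.hstack([X, z])
lifted = SVC(kernel='linear').fit(X_lifted, y)
print('accuracy in lifted space:', lifted.score(X_lifted, y))
```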
SVM functions are available in most mathematical simulation tools (e.g. MATLAB) as library functions; it is important to understand how to define the parameters of these models in order to obtain accurate results. There are three parameters which need to be set when we create an SVM classification object, namely: kernel, gamma and C. Table 5.3 summarises some available options for each of these parameters and their impact on the model.
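As a hedged illustration of how these three parameters appear in practice (the text's experiments use MATLAB; the sketch below uses Python's scikit-learn as an equivalent):

```python
from sklearn.svm import SVC

# The three parameters discussed in Table 5.3, in scikit-learn terms:
#  - kernel: 'linear' for linearly separable CRP data, 'rbf' or 'poly' otherwise
#  - gamma:  how far the influence of a single training CRP reaches (rbf/poly);
#            large values risk overfitting, small values risk underfitting
#  - C:      trade-off between a wide margin and misclassified training points
clf = SVC(kernel='rbf', gamma=0.1, C=1.0)
```

Calling the object's fit method on the training CRPs and its predict method on unseen challenges then completes the modelling loop of Fig. 5.2.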
5.7.5 Cross-Validation
Cross-validation estimates how well the learned model generalises to CRPs it has not seen. A typical k-fold procedure is as follows:
1. Divide the collected CRPs into k subsets; use k − 1 of them as a training set and reserve the remaining one as a test set.
2. Train the model on the training set.
3. Estimate the prediction errors by comparing the real outputs from the test set and the predicted outputs from the model.
4. Repeat the procedure k times, each time with a new training/test set.
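A minimal sketch of this k-fold procedure, assuming synthetic stand-in CRPs (random challenges labelled by a hypothetical linear model, mimicking an arbiter PUF):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in CRPs: 1000 random 16-bit challenges labelled by a
# random linear model, mimicking the additive delay model of an arbiter PUF.
rng = np.random.default_rng(0)
challenges = rng.integers(0, 2, (1000, 16))
responses = (challenges @ rng.normal(size=16) > 0).astype(int)

# 5-fold cross-validation: train on 4/5 of the CRPs, test on the remaining 1/5,
# repeated five times with a new split each time.
scores = cross_val_score(SVC(kernel='linear'), challenges, responses, cv=5)
print('mean prediction accuracy: %.3f' % scores.mean())
```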
Now that we have a better understanding of how SVM works, we are going to learn how to use it to model the behaviour of an arbiter PUF. A MATLAB toolbox is used to perform the analysis.
An arbiter PUF consists of k stages; each stage is composed of two 2-to-1
multiplexers as shown in Fig. 5.12. A rising pulse at the input propagates through
two nominally identical delay paths. The paths for the input pulse are controlled by
the switching elements, which are set by the bits of the challenge as shown in
Fig. 5.13. For c = 0, the paths go straight through, while for c = 1 they are crossed.
Because of manufacturing variations, however, there is a delay difference Δt
between the paths. An arbiter at the end generates a response ‘0’ or ‘1’ depending
on the difference in arrival times. In our simulation, a set–reset (SR) latch has been
used as the arbiter block. We will now explain how to implement the modelling
procedure described previously in Fig. 5.2.
(Figs. 5.12 and 5.13: schematic of the arbiter PUF — the challenge bits set each switching stage, and an SR latch at the end acts as the arbiter generating the response; for Ci = 0 the stage paths go straight through, while for Ci = 1 they are crossed.)
A 16-bit arbiter PUF has been implemented in a 65-nm technology node and simulated using the BSIM4 (V4.5) transistor model with a nominal supply voltage of 1.2 V at a room temperature of 25 °C. Intrinsic variations such as effective length, effective width, oxide thickness and threshold voltage are modelled in Monte Carlo simulations using the built-in fabrication-standard statistical variations (3σ variations) in the technology design kit. A total of 15000 CRPs have been generated for later analysis.
The functionality of the arbiter PUF can be described by an additive linear model
[6]. The total delays of both paths are modelled as the sum of the delays in each
stage (switch component) depending on the challenge C (c1, c2…ck). The final
delay difference Δt between the two paths in a k-bit arbiter PUF can be expressed as
$\Delta t = \vec{D}^{T}\vec{v}$ (5.14)

where the superscripts 1 and 0 denote the crossed (1) and uncrossed (0) paths, respectively. Hence, $d_i^1$ is the delay of stage $i$ when $c_i = 1$, and $d_i^0$ is the delay of stage $i$ when $c_i = 0$. Then

$\vec{D}^{T} = \left(D^1, D^2, \ldots, D^k, D^{k+1}\right)^{T}$ (5.15)

where

$D^1 = \dfrac{d_1^1 + d_1^0}{2}, \qquad D^i = \dfrac{d_{i-1}^1 + d_{i-1}^0 + d_i^1 + d_i^0}{2},\ i = 2 \ldots k, \qquad D^{k+1} = \dfrac{d_k^1 + d_k^0}{2}$

In addition, we can write the feature vector $\vec{v}$ as follows:

$\vec{v}(C) = \left(v_1(C), v_2(C), \ldots, v_k(C), 1\right)^{T}$ (5.16)

where

$v_j(C) = \prod_{i=j}^{k} (1 - 2C_i), \qquad j = 1 \ldots k$
From Eq. (5.14), the vector $\vec{D}$ encodes the delay in each stage of the arbiter PUF, and via $\vec{D}^{T}\vec{v} = 0$ we can determine the separating hyperplane in the space of all feature vectors $\vec{v}$. The delay difference $\Delta t$ is the inner product of $\vec{D}$ and $\vec{v}$. If $\Delta t > 0$, the response bit is '1'; otherwise, the response bit is '0'. Determination of this hyperplane allows building a mathematical clone of the arbiter PUF.
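A minimal sketch of this additive linear model in Python (the chapter's own experiments use MATLAB); the stage delay parameters here are random stand-ins for the manufacturing variations measured in the real attack:

```python
import numpy as np

def feature_vector(challenge):
    """Map a k-bit challenge (array of 0/1) to the (k+1)-dimensional feature
    vector of Eq. (5.16): v_j = prod_{i>=j} (1 - 2*c_i), plus a constant 1."""
    signs = 1 - 2 * np.asarray(challenge)        # bit 0 -> +1, bit 1 -> -1
    # cumulative product from the right yields all suffix products at once
    v = np.cumprod(signs[::-1])[::-1]
    return np.append(v, 1.0)

def arbiter_response(D, challenge):
    """Response of the additive delay model of Eq. (5.14):
    '1' if the inner product D^T v is positive, '0' otherwise."""
    return int(D @ feature_vector(challenge) > 0)

# Hypothetical 16-stage PUF: delay parameters drawn at random to mimic
# manufacturing variation (17 = k + 1 model parameters).
rng = np.random.default_rng(1)
D = rng.normal(0, 1, 17)
print(arbiter_response(D, rng.integers(0, 2, 16)))
```

Because the model is linear in the feature space, CRPs generated this way are exactly the kind of data a linear-kernel SVM can separate.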
Step 3: Running the support vector machine algorithm using a training CRP set
This section gives an example of how to use an artificial neural network (ANN) to model the behaviour of a PUF.
A typical ANN structure is a feedforward network, which can be constructed as a single-layer perceptron (SLP) or a multilayer perceptron (MLP). MLP has been
considered in this study as it can solve a nonlinear problem. The MLP network
consists of three layers of nodes which include an input layer, a hidden layer and an
output layer. Except for the input layer, in each neuron, all input vector values are
weighted, summed, biased and applied to an activation function to generate an
output. Tan-sigmoid and linear transfer functions have been used for the hidden and output layers, respectively. Using 32 neurons in the hidden layer was found to be an
optimum setting. During training, the error resulting from the difference between predicted and observed values is propagated back through the network, and the weights and biases of the neurons are adjusted and updated. The training process stops
when the prediction error reaches a predefined value or a predetermined number of
epochs are completed. Based on our experiments, resilient backpropagation has
been chosen as the best training algorithm considering the prediction accuracy and
fast convergence time.
The above ANN has been implemented in MATLAB. In this case, we consider a 32-bit TCO PUF [23]. A set of 20000 CRPs is obtained by simulating the TCO PUF using 65 nm CMOS technology; 18000 CRPs were used for training and 2000 CRPs were utilised for testing. The results are shown in Fig. 5.15. Readers are referred to Appendix C for an exemplar MATLAB script of an ANN modelling attack.
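A rough Python analogue of the MLP described above is sketched below, assuming synthetic stand-in CRPs: one hidden layer of 32 neurons with a tanh ("tan-sigmoid") activation. Note that scikit-learn offers no resilient backpropagation, so the Adam optimiser is used here instead; this is a stand-in, not the book's MATLAB implementation.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the 20000 simulated TCO PUF CRPs: random 32-bit
# challenges with a nonlinear toy labelling function.
rng = np.random.default_rng(0)
crps = rng.integers(0, 2, (20000, 32))
labels = (np.sin(crps @ rng.normal(size=32)) > 0).astype(int)

# One hidden layer of 32 tanh neurons, as in the text; 18000 training CRPs
# and 2000 test CRPs, matching the split used in the experiment.
mlp = MLPClassifier(hidden_layer_sizes=(32,), activation='tanh',
                    solver='adam', max_iter=300)
mlp.fit(crps[:18000], labels[:18000])
print('prediction accuracy:', mlp.score(crps[18000:], labels[18000:]))
```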
Defence mechanisms against modelling attacks can be classified into two types as
follows:
Fig. 5.15 Prediction accuracy of ANN model versus number of CRPs for a TCO PUF
(a) Prevention techniques that aim to stop an adversary from collecting enough
challenge/response pairs to build a mathematical model of the PUF.
(b) Mitigation methods that aim to enhance the resilience of PUF designs against
mathematical cloning using additional cryptographic primitives (e.g. Hash
functions, encryption cores, etc.).
This is another preventative approach; its core principle is to partially mask the challenges (or responses) applied to (or generated from) the PUF. This partial masking makes it harder for machine learning algorithms to build an accurate model of the PUF.
One example of partial masking of multibit responses is described in [24] in the context of PUF authentication protocols. In this example, only a subset of the PUF response string is sent to the verifier during the authentication process; the verifying server uses a substring matching technique to validate the responses.
Another approach has been implemented in [25], wherein a partial challenge is
sent by the verifier to the prover device embedding the PUF. The received short
challenge is subsequently padded with a random pattern generated by a random
number generator to make up a full-length challenge before being applied to the
PUF; the server uses a challenge recovery mechanism to generate an emulated
response to compare with the received response.
Both of the above approaches might increase the authentication time as well as consume more computing resources on the verifier side. One might argue, however, that this is not a big concern, since the verifier is usually assumed to be rich in resources.
This preventative approach was presented in [26]; its principles are as follows. The challenges are divided into two subsets, valid and invalid; the former are called the secret challenges (or s-challenges), and the number of valid challenges in this set should not be sufficient to allow building a mathematical model. If we consider the example from Fig. 5.15, the s-challenges set in this case should contain only around 200 vectors to prevent building a model with an accuracy of more than 80%.
The PUF operates normally for the s-challenges, but if a challenge that is not in this subset is used, the PUF will not produce an authentic response; instead, random data will be output.
A possible implementation of this technique is shown in Fig. 5.17, wherein challenge validation logic is used to check whether or not the applied challenge is part of the s-challenges; if it is, the challenge is applied to the PUF, otherwise it is replaced by a random bit string generated locally by a random number generator.
This is a mitigation technique wherein hash functions are used at the input/output of the PUF, so that an adversary cannot gain direct access to the challenge/response pairs, which in turn makes mathematical cloning of the PUF infeasible [27]. Although the use of hash functions is a very effective approach, it incurs a significant increase in chip area and energy costs, which may not be affordable in some applications such as low-end IoT devices [28].
All of the above techniques can protect the PUF from a modelling attack by a
snooping or black box adversary but do not provide any protection against white
box adversaries.
The notion of permutation consists of rearranging the order of the elements in a set. This concept was employed to design a number of historical ciphers, such as the Scytale of ancient Greece, wherein encryption is done using a device consisting of a rod with a polygonal base. To encrypt, one would write the message horizontally on a piece of paper wrapped around the device. When the paper is unwound, the letters on it appear randomly ordered and meaningless, as shown in Fig. 5.18. Only a legitimate receiver who has an identical device can decrypt the cipher.
Permutation techniques are also used in modern ciphers such as AES and
triple-DES [5] to dissipate the redundancy of plaintext by spreading it all out over
the cipher text, which makes it harder to establish a statistical correlation between
the two. The same concept can be applied in the context of PUF in order to obscure
the relationship between the challenges and their associated responses, hence make
machine learning harder or infeasible [20].
One way to implement this approach is to permute the challenges as shown in Fig. 5.19; in this case, the permutation block is placed before the PUF circuit. Each challenge is then replaced by another before being applied to the inputs of the PUF. This approach can be implemented on both single- and multibit-response designs.
The mechanism of permutation is as follows: each n-bit challenge is divided into k segments of length $l = \frac{n}{k}$. These segments are then permuted to generate a new challenge, as shown in Fig. 5.20.
In principle, the larger the number of segments k (or the smaller the number of bits in each segment), the better the results that can be achieved, as this increases the number of possible permutations k!, which makes it harder for an adversary to find the original order of bits. The optimum permutation can be found using an iterative algorithm which searches for a solution that maximises the unpredictability metric described in Sect. 5.4.2; a minimal sketch of such a permutation block follows.
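The sketch below (Python, with a hypothetical segment ordering) illustrates the segment-wise challenge permutation just described:

```python
import numpy as np

def permute_challenge(challenge, segment_order, k):
    """Split an n-bit challenge into k segments of length l = n // k and
    reorder the segments according to `segment_order`, a permutation of 0..k-1."""
    segments = np.array_split(np.asarray(challenge), k)
    return np.concatenate([segments[i] for i in segment_order])

# Example: a hypothetical 16-bit challenge divided into k = 4 segments.
c = np.array([0, 1, 1, 0,  1, 0, 0, 1,  1, 1, 0, 0,  0, 0, 1, 1])
print(permute_challenge(c, segment_order=[2, 0, 3, 1], k=4))
```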
Response permutation is only possible for multibit-response PUFs, but it may not be very useful on its own, as it only changes the order of the response bits; machine learning algorithms can still build a mathematical model for each response bit.
Example 5.7 A PUF design has the CRP behaviour described in Table 5.4. For this design, two challenge permutation approaches are applied as follows:
(a) 1 → 5, 2 → 1, 3 → 2, 4 → 3, 5 → 4
(b) 1 → 5, 2 → 4, 3 → 1, 4 → 2, 5 → 3
The resulting challenge/response pairs in each case are shown in Table 5.4.
(1) Use the Hamming distance metric HDT(t,e) to evaluate the unpredictability of
the original PUF using t = 1.
(2) Which of the above two permutations produces a less predictable design?
Solution:
(1) To evaluate $\mathrm{HDT}(1, e)$, we need to identify, for each challenge $c_j$, the related challenges $c_i$ such that $d(c_j, c_i) = 1$, then calculate the Hamming distance between their respective responses $(r_j \oplus r_i)$. The $\mathrm{HDT}(1, e)$ for each challenge is then computed as the average Hamming weight of $(r_j \oplus r_i)$ over those related challenges. The results are summarised in Table 5.5.
Based on the results in Table 5.5, we can now compute the mean and standard deviation percentage for $\mathrm{HDT}(1, e)$ using Eqs. (5.9) and (5.10) as follows.
$\mu(\mathrm{HDT}(1,e)) = \frac{1}{|C|}\sum_{c\in C} \mathrm{HDT}(1,e,c) = 0.2$

$\frac{\sigma(\mathrm{HDT}(t,e))}{\mu(\mathrm{HDT}(t,e))} = \frac{\sqrt{\sum_{c\in C}\frac{(\mathrm{HDT}(t,e,c)-\mu)^2}{|C|-1}}}{\mu(\mathrm{HDT}(t,e))} = \frac{0.245}{0.2} = 1.224$
The results show both measures are far from their ideal values, which indicate
the original design is predictable.
(2) To find the best permutation approach, we repeat the same computation for both permutations, as summarised in Tables 5.6 and 5.7.
Based on the results in Table 5.6, we can now compute the mean and standard deviation percentage for $\mathrm{HDT}(1,e)$ using Eqs. (5.9) and (5.10) as follows:
$\mu(\mathrm{HDT}(1,e)) = \frac{1}{|C|}\sum_{c\in C} \mathrm{HDT}(1,e,c) = 0.5$

$\frac{\sigma(\mathrm{HDT}(t,e))}{\mu(\mathrm{HDT}(t,e))} = \frac{\sqrt{\sum_{c\in C}\frac{(\mathrm{HDT}(t,e,c)-\mu)^2}{|C|-1}}}{\mu(\mathrm{HDT}(t,e))} = \frac{0.316}{0.5} = 0.63$
Based on the results from Table 5.7, we can now compute the mean and standard deviation percentage for $\mathrm{HDT}(1,e)$ using Eqs. (5.9) and (5.10) as follows:
$\mu(\mathrm{HDT}(1,e)) = \frac{1}{|C|}\sum_{c\in C} \mathrm{HDT}(1,e,c) = 0.6$

$\frac{\sigma(\mathrm{HDT}(t,e))}{\mu(\mathrm{HDT}(t,e))} = \frac{\sqrt{\sum_{c\in C}\frac{(\mathrm{HDT}(t,e,c)-\mu)^2}{|C|-1}}}{\mu(\mathrm{HDT}(t,e))} = \frac{0.37}{0.6} = 0.63$
The results show that permutation (a) improves the unpredictability of the design, bringing the mean of $\mathrm{HDT}(1,e)$ to the ideal value of 0.5, while permutation (b) moves it further away (0.6); therefore, permutation (a) is the best option in this case.
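The hand computation of Example 5.7 generalises directly; a minimal sketch (with a hypothetical toy CRP table, fractional Hamming distances) is given below:

```python
import numpy as np
from itertools import product

def hdt_stats(crp_table, t=1):
    """Mean and relative standard deviation of HDT(t, e) for a CRP table
    given as {challenge tuple: response tuple}. For each challenge, responses
    are compared with those of challenges at Hamming distance exactly t,
    following the procedure of Example 5.7."""
    per_challenge = []
    for cj, rj in crp_table.items():
        ds = [np.mean(np.bitwise_xor(rj, ri))        # fractional HD of responses
              for ci, ri in crp_table.items()
              if sum(a != b for a, b in zip(cj, ci)) == t]
        per_challenge.append(np.mean(ds))
    mu = np.mean(per_challenge)
    return mu, np.std(per_challenge, ddof=1) / mu

# Hypothetical toy CRP table: 3-bit challenges mapped to random 4-bit responses.
rng = np.random.default_rng(7)
crps = {c: tuple(rng.integers(0, 2, 4)) for c in product((0, 1), repeat=3)}
print(hdt_stats(crps, t=1))
```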
The notion of substitution consists of replacing the elements of a set with new elements from a different set according to a fixed system. The main difference between substitution and permutation is that in the latter, the elements of the original set are reordered but left unchanged, whereas substitution replaces the elements themselves with a new set of elements. For example, let us assume we have
For this case study, we have implemented an arbiter PUF which has 16-bit input challenges and a single-bit output response in a 65-nm technology node. The design has been simulated using the BSIM4 (V4.5) transistor model with a nominal supply voltage of 1.2 V.
(a) Permutation-Based Challenge Obfuscation
This design uses the generic architecture shown in Fig. 5.21, wherein a random mapping of challenges is used to design the permutation block: each challenge is divided into k sections, each containing L bits, and those sections are then permuted. We consider three cases here: k = 4, 8 and 16.
(b) Substitution-Based Challenge Obfuscation
In this case, each challenge is divided into two sections, each containing one byte (8 bits); these bytes are then substituted using the AES substitution function and applied to the PUF.
(c) Substitution-Based Response Obfuscation
Each challenge is used as a seed for a 16-bit LFSR, which generates 16 corresponding challenges. The obtained challenges are applied to the PUF sequentially to generate a 16-bit response in each case. Each response is then divided into two sections, each containing one byte (8 bits); these bytes are then substituted using the AES substitution function.
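To make the substitution mechanism of case (b) concrete, the sketch below uses a randomly generated bijective 8-bit S-box as a stand-in for the AES substitution function (the real AES S-box table could be dropped in unchanged); all names here are hypothetical:

```python
import numpy as np

# A randomly generated bijective 8-bit S-box standing in for the AES S-box.
rng = np.random.default_rng(2)
SBOX = rng.permutation(256)

def substitute_challenge(challenge_bits):
    """Split a 16-bit challenge into two bytes, pass each through the S-box,
    and return the substituted 16-bit challenge to be applied to the PUF."""
    bits = np.asarray(challenge_bits).reshape(2, 8)
    bytes_in = bits @ (1 << np.arange(7, -1, -1))              # pack bits -> bytes
    bytes_out = SBOX[bytes_in]                                 # substitute
    out = (bytes_out[:, None] >> np.arange(7, -1, -1)) & 1     # unpack bytes -> bits
    return out.reshape(-1)

print(substitute_challenge(rng.integers(0, 2, 16)))
```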
In each of the above cases (a, b and c), 18000 CRPs are generated; both SVM and ANN techniques are then used to develop a software model of the design. The machine learning results are shown in Figs. 5.22 and 5.23.
The results in Fig. 5.22 show that permutation can help reduce the prediction accuracy of the SVM and ANN machine learning models by up to 25 and 10%, respectively; to achieve the maximum reduction, the number of bits in each permuted block should be 1 (i.e. k = 16). The results in Fig. 5.23 show that substitution is a more effective approach than permutation, especially when applied to the response bits.
(Figure: substitution-based response obfuscation architecture — the challenge c drives a multi-bit-response PUF whose response bits r1 … rm pass through a substitution block to produce the substituted responses R.)
Side channel attacks exploit information leaked from the physical implementation of a cryptographic primitive, such as power consumption, timing delays and electromagnetic emissions. These attacks can be classified, based on their implementation strategies, into two main categories: the first requires information on the internal structure and operation of the hardware implementation of the system under attack (e.g. fault injection methods); the second does not necessarily need such information (e.g. differential power analysis).
In this section, we will explain the principles of the main side channel attacks on PUFs.
An example of such an attack is described in [29], where the authors introduced the concept of 'repeatability', which refers to the short-term reliability of a PUF as affected by temporal noise sources (e.g. power supply fluctuations). This attack consists of two stages.
(a) An experimental stage, which consists of injecting noise into the PUF circuit
and measuring the repeatability of its responses.
(b) A statistical analysis stage, which infers the internal structure of the device and produces a mathematical clone.
We are going to briefly explain how this attack works using the arbiter PUF
depicted in Fig. 5.12.
Let c be the minimum resolution time of the arbiter (i.e. the minimum difference
in the arrival times of the two input signals that can be detected by the arbiter).
The difference of the arrival times $\Delta t$ is the sum of two independent components:

$\Delta t = \Delta t_V + \Delta t_N$

where
$\Delta t_V$ is caused by the variation-induced delay difference of the two signal paths associated with each challenge; this component depends only on manufacturing variations.
$\Delta t_N$ is caused by random temporal noise, such as power supply fluctuations and changes in the ambient temperature.
Both of these components can be considered to have Gaussian distributions
according to the central limit theorem because they are generated by complex
processes (i.e. random variations or random noise).
The core principle of this attack is to systematically inject noise and then
measure its impact on the response bits. To achieve this, a Gaussian noise com-
ponent is injected into the power supply, and the response of the arbiter PUF is
measured repeatedly for the same challenge.
The experiment is repeated for different mean values of the noise signal; in each case, the repeatability metric (REP) is estimated as the fraction of the responses which evaluate to '1' for a specific challenge, so that $\mathrm{REP} \in [0, 1]$.
REP can be computed by integrating the probability distribution function of $\Delta t$ as follows:

$\mathrm{REP} = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{c - \overline{\Delta t}}{\sqrt{2}\,\sigma_N}\right)$ (5.19)

where $\overline{\Delta t}$ is the mean value of the time difference and $\sigma_N$ is the standard deviation of the noise signal. This can be rewritten as

$\mathrm{REP} = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{c - \Delta t_V + \overline{\Delta t_N}}{\sqrt{2}\,\sigma_N}\right)$ (5.20)

This means

$\Delta t_V = c + \overline{\Delta t_N} - \sqrt{2}\,\sigma_N\, \mathrm{erfc}^{-1}(2\,\mathrm{REP})$ (5.21)
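A minimal numerical sketch of Eq. (5.21), inverting a measured repeatability value to recover the variation-induced delay difference (the numbers are hypothetical):

```python
import numpy as np
from scipy.special import erfcinv

def delta_t_v(rep, c, delta_t_n, sigma_n):
    """Variation-induced delay difference recovered from a measured
    repeatability value, following Eq. (5.21):
    dt_v = c + dt_n - sqrt(2) * sigma_n * erfcinv(2 * REP)."""
    return c + delta_t_n - np.sqrt(2) * sigma_n * erfcinv(2 * rep)

# Hypothetical numbers: arbiter resolution c = 1 ps, injected noise with
# zero mean and sigma_n = 5 ps, measured repeatability of 0.9.
print(delta_t_v(rep=0.9, c=1e-12, delta_t_n=0.0, sigma_n=5e-12))
```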
There are a number of examples in the literature wherein power consumption information is exploited to mount more effective mathematical cloning attacks [30, 31].
In this section, we are going to explain the generic procedure of these attacks using the arbiter PUF (Fig. 5.12). The arbiter circuit consumes more power when generating a logic '1' response than when generating a logic '0'; therefore, by applying power tracing to this circuit, one can predict the output of the PUF.
The presence of noise makes it difficult to obtain specific response bits from the power measurements; therefore, machine learning algorithms that require a set of challenge/response pairs, such as ANN and SVM, cannot be employed in this case.
Other techniques, which can be applicable in this context, include evolution strategies (ES) techniques, which start with a random model of the PUF and then iteratively refine this model using the available data (e.g. power consumption information) until an accurate model is derived. A generic procedure of this attack based on the use of an ES learning algorithm is shown in Fig. 5.24.
Fig. 5.24 Power analysis attack using an evolution strategies learning algorithm
The authors of [31] have shown that it is possible to build a model of a 128-bit arbiter PUF, implemented using 45 nm CMOS technology, with a prediction accuracy of more than 90% using 15000 challenges and their associated power traces; their experiment also assumed a noise level equivalent to that generated by one hundred switching registers in the same technology node.
It should be noted here that the number of challenges needed for this attack depends on the noise level; therefore, if the latter can be reduced, the number of power measurements required can be significantly reduced as well [31].
There are other forms of power analysis attacks which use the Hamming distance as a correlation metric; these approaches exploit the fact that the power consumed by a memory element (e.g. a shift register) when its content changes depends on the Hamming distance between the new content and the old one. Let us give an example. Consider an 8-bit register that contains the binary string (00001000): if we write a new word (11111000), we need to draw current from the power supply to change the four most significant bits from '0' to '1'; on the other hand, if we write (11001000), we draw less current, as we only need to make two transitions. This principle has been successfully exploited to deduce information from secure-sketch-based PUF designs [32].
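The Hamming-distance power model behind this attack reduces to a popcount of the XOR of old and new register contents, as the short sketch below shows, reproducing the example just given:

```python
def hd_power_model(old_word, new_word):
    """Number of bit transitions when a register is overwritten; in the
    Hamming-distance power model, the current drawn is proportional to this."""
    return bin(old_word ^ new_word).count('1')

# The example from the text: writing 11111000 over 00001000 flips four bits,
# while writing 11001000 over it flips only two.
print(hd_power_model(0b00001000, 0b11111000))   # 4
print(hd_power_model(0b00001000, 0b11001000))   # 2
```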
Solution:
Table 5.8 can be used to generate 12 linear equations which relate the different response bits; in this case, we only need four independent equations to find a solution.
From these equations, one can deduce two possible solutions, $r_1r_2r_3r_4 = 0011$ or $1100$, one of which is the key.
So far, we have discussed attacks which aim to create a mathematical model of the PUF. Such attacks may not be very useful in highly secure applications such as smart cards and e-passports, wherein a clone can only be practically used if it has the same form factor (i.e. physical size and shape) as the original device. In such cases, creating a physical clone of the device may be the only practical threat; such a clone should have the same functionality, capabilities and physical properties as the original device.
Physical cloning aims to clone physically unclonable functions; although this may seem contradictory, the advancement of reverse engineering techniques has made physical cloning a possibility [3, 13].
This type of attack generally consists of two stages: characterisation and emulation. The aim of the first stage is to deduce the challenge/response behaviour of a PUF circuit from its electrical and physical characteristics (delay, radiation, output consistency, etc.). The goal of the second stage is to construct a physical implementation which replicates the learned behaviour. We will now discuss existing techniques used in each stage.
There are a number of methods with which the integrated circuits of a PUF can be characterised. These include side channel analyses such as electromagnetic radiation and power consumption measurements. For example, the authors of [3] de-capsulated an RO PUF implemented on an FPGA board and performed on-die EM measurements. These were then used to deduce the frequency ranges of each ring oscillator in the design, which allowed full characterisation of the design.
Photonic emission (PE) in CMOS is another technique that has been used to physically characterise an arbiter PUF [2]. The PE phenomenon occurs during the switching of CMOS devices. For example, consider the NMOS transistor in Fig. 5.26. Electrons travelling through the conducting channel are accelerated by the electric field between the source and the drain; when they arrive at the pinch-off region at the drain edge of the channel, their kinetic energy can be released in the form of photons (illustrated as a red arrow in Fig. 5.26). The wavelength of the generated photons depends on the energy of the accelerated carriers; it can range from the visible to the near-infrared in some CMOS technologies. This phenomenon is also present in PMOS transistors, but the generated photons have less energy, due to the lower mobility of holes compared to electrons, which means the emitted photons are much harder to observe.
The authors of [2] measured the time difference between enabling the arbiter PUF and the photonic emission at the last stage (detected from the back side of the chip); this allowed them to estimate the delay of each individual path in the design. The experiment was repeated for different challenges, and they showed that it is possible to estimate the delays of all the stages of a 128-bit arbiter PUF with 1161 challenges, which is significantly fewer than the number of CRPs required to carry out a mathematical cloning attack.
Photonic emission analysis has also proven effective in capturing the start-up values of an SRAM PUF [13]. Data remanence decay in volatile memories is another mechanism which has proven effective in characterising the SRAM PUF [40]. This method consists of two phases. In the first phase, an attacker overwrites the start-up values of the SRAM memory with known contents, then gradually reduces the supply voltage and captures the memory content iteratively. In the second phase, he analyses the results to deduce the PUF responses; in principle, the SRAM cells that remain stable have a start-up value equivalent to the stored known content, while those which change their state have a start-up value equal to the inverse of the forcibly stored contents.
The advancement of reverse engineering techniques, such as PEA and FIB, means that it is feasible for an adversary to acquire detailed information on the layout of a silicon chip and potentially deduce its intended functionality. This means he may be able to reproduce a physical replica of the original device if he has access to the same or a closely related implementation technology. This makes it hard to devise countermeasures that achieve 100% protection against physical emulation (i.e. the second stage of a physical attack).
Therefore, a more effective approach in this case is to develop solutions that prevent physical characterisation, i.e. the first stage of the attack; such methods may include the countermeasures for side channel analysis described in Sect. 5.11 and also industry-standard techniques such as memory scrambling [41, 42].
Table 5.9 shows a summary of the attacks discussed in this chapter, which includes
the adversary type and the compromised security metric in each case.
Mathematical cloning attacks are the most likely threat, as they can be carried out by all three types of adversaries. They mainly compromise the unpredictability property of a PUF by constructing a model which can predict responses to all future challenges; they can also affect its uniqueness, if such a model is used to construct another device with the same challenge/response behaviour.
These attacks have been successfully carried out on most existing PUFs with large challenge/response spaces, such as delay-based designs. Other architectures, such as memory-based PUFs, are inherently resilient to mathematical modelling, as the total number of their CRPs is typically less than that required by most machine learning algorithms.
5.14
• There are four metrics which need to be used collectively to assess the security
of a PUF design, namely randomness, unpredictability, uniqueness and physical
unclonability.
• There are three types of adversaries which can be considered when formally
reasoning about the security of PUFs, namely: a snooping, a black box and a
white box adversary. The first can only listen to communication from/to a PUF
circuit, the second has physical access to the chip incorporating the PUF and can carry out non-invasive attacks, and the third can carry out both semi- and fully invasive procedures such as micro-probing and reverse engineering.
• There are a number of attacks on physically unclonable functions, which pose a
serious risk to their potential use as a new root of trust in security protocols.
These attacks can generally be classified into three categories according to the means by which the attack is carried out: mathematical modelling using machine learning algorithms, side channel analysis and physical cloning.
5.16 Problems
5.1. Table 5.10 shows the challenge/response behaviour of two PUF designs, each of which has 3-bit challenges and 6-bit responses.
(a) Calculate the uniformity and the output entropy of each design.
(b) Which of these two PUFs, if any, can be used as a true random bit generator?
5.2. Table 5.11 shows the challenge/response behaviour of a PUF that has 3-bit challenges and 4-bit responses.
(a) Use the Hamming distance metric HDT(t, e) to evaluate the unpredictability of this PUF using t = 2.
(b) A permutation approach is implemented on Design 1 to enhance its unpredictability; it consists of a left circular shift of each challenge before applying it to the PUF. Evaluate the unpredictability of the resulting design using t = 2.
(a) How long will it take him to carry out this attack, assuming he needs one microsecond to obtain one challenge/response pair?
(b) One method to implement MTR is to insert a logic delay at the output of the PUF; estimate the approximate value of this delay required to increase the time needed for this attack to 30 days.
(c) Assume that the above delay block needs to be implemented in a 65 nm technology, in which the delay of an inverter is 1.1 ns; estimate how many of these inverters will be needed.
5.4. A helper data leakage attack has been carried out on a key generation scheme based on a 4-bit RO PUF; the results of the experimental phase of the attack are summarised in Table 5.12. Analyse the data and deduce the possible keys, assuming the key used is the same as the response of the PUF.
5.5. The AES S-box is used to enhance the security of a PUF design against mathematical cloning. It is assumed that the PUF has a reliability metric of 98.875%, which is estimated based on the expected number of errors due to environmental variations and other sources of noise. It is also assumed that the PUF has 8-bit challenges and 8-bit responses.
(a) Estimate the reliability of the resulting enhanced PUF design if the S-box is placed at its input.
(b) Estimate the reliability of the resulting enhanced PUF design if the S-box is placed at its output.
(c) Based on your answers to the above, explain whether or not the position of the S-box has any effect on the reliability of the resulting design.
5.6. Table 5.13 includes a summary of the security metrics of three PUF designs. Study these data, then choose a suitable PUF design for each of the following applications:
References
1. J. Delvaux, I. Verbauwhede, Fault injection modeling attacks on 65 nm Arbiter and RO sum
PUFs via environmental changes. IEEE Trans. Circ. Syst. I Regul. Pap. 61, 1701–1713 (2014)
2. S. Tajik, E. Dietz, S. Frohmann, J.-P. Seifert, D. Nedospasov, C. Helfmeier, et al., Physical
Characterization of Arbiter PUFs, in ed. by L. Batina, M. Robshaw. Cryptographic Hardware
and Embedded Systems—CHES 2014: 16th International Workshop, Busan, South Korea,
September 23–26, 2014, Proceedings (Springer, Berlin, 2014), pp. 493–509
44. M. Backes, A. Kate, A. Patra, Computational verifiable secret sharing revisited, in ed. by D.H. Lee, X. Wang, Proceedings on Advances in Cryptology—ASIACRYPT 2011: 17th International Conference on the Theory and Application of Cryptology and Information Security, Seoul, South Korea, December 4–8, 2011 (Springer, Berlin, 2011), pp. 590–609
45. C. Alexander, G. Roy, A. Asenov, Random-dopant-induced drain current variation in nano-MOSFETs: a three-dimensional self-consistent Monte Carlo simulation study using ab initio ionized impurity scattering. IEEE Trans. Electron Devices 55, 3251–3258 (2008)
6 Hardware-Based Security Applications of Physically Unclonable Functions
6.1 Introduction
(1) Explain how PUF technology can be used to securely generate and store
cryptographic keys.
(2) Discuss the principles of PUF-based entity authentication schemes.
(3) Explain how PUF technology can be employed to construct hardware-assisted
security protocols.
(4) Outline the principles of PUF-based secure sensors design.
(5) Explain how PUFs can be used to develop anti-counterfeit solutions and
anti-tamper integrated circuits.
It is hoped that this chapter will give the reader an in-depth understanding of the existing PUF applications, their design requirements and outstanding challenges.
The remainder of this chapter is organised as follows. Section 6.3 illustrates, with a
detailed case study, how to design and implement a PUF-based key generation
scheme. Section 6.4 explains the principles of existing PUF-based authentication
protocols and discusses the motivation behind their usage in this type of applica-
tions. Section 6.5 outlines the principles of hardware-assisted security protocols and
explains with numerical examples how PUFs can be used in this context. Sec-
tion 6.6 examines the use of this technology to construct secure sensors and gives
6.3.1 Motivation
Security is one of the fundamental requirements of electronic systems that deal with sensitive information, especially those used in smart cards, mobile phones and secure data centres. Therefore, these systems should be capable of protecting data confidentiality, verifying information integrity and running other necessary security-related functions. Such requirements are typically achieved using cryptographic primitives such as encryption cores and hash functions. These cryptographic blocks normally require a secret key that should be known only to authorised persons or devices.
Typically, secret keys are generated on the device from a root key, which must
be unpredictable (i.e. cannot be easily guessed) and securely stored (i.e. cannot be
accessed by an adversary).
Nowadays, root keys are typically generated outside the chip during the manufacturing stage of silicon devices and subsequently stored in an on-chip memory. This process is usually referred to as key provisioning; once it is completed, the on-chip cryptographic block can provide security services (e.g. data encryption, device authentication and IP protection) to the applications running on the device.
There is typically an application key provisioning function, which allows the operating system and other software to derive their own application keys from the root key.
There are generally two types of storage medium used for the secret keys. The first is one-time programmable (OTP) memory, wherein a fuse or an anti-fuse locks each stored bit; in this case, the data need to be written during chip manufacturing and cannot be changed later. Another approach for storing the key is the use of non-volatile memories such as Flash, FRAM and NRAM.
The above described key provision approach suffers from two problems:
(a) First, the use of OTP memories for root key storage is handled by the device
manufacturers, which adds extra costs to the fabrication process, it can also
pose a security risk as the vast majority of devices are fabricated in third-party
facilities, which cannot always be trusted.
(b) Second, storing the keys in a non-volatile memory makes it vulnerable to
read-outs attacks by malicious software, an example of such attacks is
described in [1], which demonstrates how a malware can gain an unauthorised
This section examines more closely the requirements a PUF design should satisfy in order to be used as the basis of a key generation scheme; these can be summarised as follows:
(a) High reliability: the PUF response should remain the same regardless of noise and environmental variations; this ensures stable key generation, without which decryption would be very difficult, if not impossible. PUF responses are typically not perfectly reproducible, so additional circuitry for error correction may be needed.
(b) Uniqueness: the key should be unique to each electronic system, so that if one device is compromised, the others remain secure; ideally, the uniqueness should be 50%.
(c) Randomness: the key should be random, so that it is impossible for an adversary to guess or compute it. PUF responses are typically not uniformly distributed, so additional circuitry may be needed in order to condense enough entropy into a PUF-generated key; ideally, a random response should have a uniformity of 50%.
The process of generating a cryptographic key from a PUF consists of two main
stages shown in Fig. 6.1:
This is carried out only once by the developer/seller of a device; it includes the following procedure:
(1) Pre-processing: the aim of this stage is to estimate the maximum bit error rate of the PUF response when the same challenge is applied under different environmental and noise conditions. It can also include the use of a number of approaches to reduce this error rate, such as ageing acceleration and reliable bit selection (see Chap. 4 for more details). This process takes place at the design stage; it requires the fabrication of test chips and an evaluation of the reliability of the PUF design under realistic operating conditions.
(2) Helper Data Generation: helper data are generated for each PUF implementation using one of the fuzzy extractor schemes explained in Chap. 4, and these data are then stored in each PUF-carrying chip. In principle, the helper data should leak no information on the key, so they can be stored on chip without any additional protection measures. This step takes place after the chip fabrication process is completed.
(3) Device Enrolment: the root key stored on each device is read out and stored securely by an authentication authority, to be used for secure communication with the device in question at a later stage.
(1) Stable Response Construction: the PUF is provided with a specific challenge or simply powered on (e.g. SRAM PUF); its output is then fed into an error correction block to generate a reliable response, as shown in Chap. 4.
(2) Privacy Amplification: the generated response is applied to an entropy compression block (e.g. a hash function) to enhance its randomness.
(3) Key Derivation: the output of the last step is used to derive one or multiple keys for different security tasks (encryption, identification, etc.).
(4) Finally, the PUF is powered off, so that its output is no longer accessible.
This technique has a number of advantages over existing approaches. First, it ensures that the root key is only available temporarily, when it is needed (e.g. during an encryption/decryption operation); this makes it harder for an adversary to exploit side channel information to deduce the key. Second, there is no need for the chip manufacturer to handle key provisioning, which reduces fabrication costs and improves security. Third, the use of PUF technology makes it easier to obtain a unique key for each device, which makes the key provisioning process significantly easier and less prone to errors.
This section illustrates the design process of a PUF-based 256-bit key generation scheme. A memory-based PUF design is adopted due to the stability of its responses in the presence of environmental variations [7]. A 2 MB embedded SRAM on an Altera DE2-115 FPGA board is used in this experiment [8]; it has a 20-bit address input line and a 16-bit data output, which are used as challenge/response pairs. To simplify the design process, only one source of environmental noise is considered, namely power supply variations. The design process is explained in detail below:
1. Reliability Characterisation:
The purpose of this step is to estimate the worst-case bit error rate in a given PUF response, in order to identify a suitable error correction scheme. Ideally, a comprehensive assessment should be carried out that includes all possible sources of noise, as explained in Chap. 3. In this example, for simplicity, power supply fluctuation is the only source of noise considered. The experimental procedure is outlined below:
The results indicated that the responses can be categorised into two types according to their maximum bit error rates: 64 responses had a maximum bit error rate of 5×10⁻⁴, while the remaining responses had a higher error rate (> 10⁻³).
$r = \frac{k}{CR \times SR}$ (6.1)

In this case study, a Hamming code (15, 11, 1) is chosen, which meets the requirement of a decoding error rate < 10⁻⁶. To demonstrate this, we recall Eq. (4.4) from Chap. 4. In this example, we have:

$CR = \frac{m}{n} = \frac{11}{15}$
(Fig. 6.2: helper data generation — a random number generator produces x (308 bits), Hamming encoders expand x into the codeword c (420 bits), and c is XORed with the SRAM PUF response r (420 bits) to produce the helper data w (420 bits).)
The stable ratio (SR) values over multiple readings of the selected memory space are estimated; the results indicated an average SR of 0.85.
The minimum number of raw PUF response bits needed to generate a 256-bit cryptographic key can now be computed using Eq. (6.1) as follows:

$r = \frac{k}{CR \times SR} = \frac{256}{\frac{11}{15} \times 0.85} \approx 410\ \text{bits}$
In this case, the PUF responses will be generated from the start-up values of the memory addresses identified in step a.2, each of which holds a 16-bit value; only the first 15 bits of each response will be used (to allow the use of the chosen Hamming code). This means the total number of addresses that need to be read (i.e. the number of challenges that need to be applied to the PUF) is $\frac{410}{15} = 27.3$. This effectively means the contents of 28 addresses need to be read out, which gives 420 bits.
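This sizing computation is easy to script; a minimal sketch (hypothetical helper function names) is given below:

```python
import math

def raw_response_bits(key_len, code_rate, stable_ratio):
    """Minimum number of raw PUF response bits needed for a key of
    `key_len` bits, per Eq. (6.1): r = k / (CR * SR)."""
    return key_len / (code_rate * stable_ratio)

r = raw_response_bits(key_len=256, code_rate=11 / 15, stable_ratio=0.85)
addresses = math.ceil(r / 15)                # 15 usable bits per 16-bit word
print(r, addresses, addresses * 15)          # ~410.7 bits -> 28 addresses -> 420 bits
```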
4. Helper Data Generation:
In this step, a helper data vector is generated using the code-offset scheme described in Chap. 4 [11]; the detailed process is as follows (see Fig. 6.2):
(1) A 420-bit raw response r is generated by reading out the start-up values of the 28 memory addresses.
(2) A 308-bit random binary string x is generated using a random number generator (RNG).
(3) x is divided into 28 data vectors, each 11 bits long. These are applied in parallel to Hamming encoders to generate 28 valid codewords, which are concatenated into a 420-bit vector c.
(4) c is XORed with the PUF response r to generate the helper data vector w.
(Figure: key reconstruction — the SRAM PUF produces a noisy response r′ (420 bits); Hamming decoding recovers the stable response, which is fed into SHA-256 to derive the 256-bit key K.)
(a) A 420-bit response vector r′ is generated by reading out the start-up values of the same 28 memory locations used in the previous stage.
(b) Each 15-bit response is XORed with its corresponding helper data w to obtain a noisy code vector $c' = r' \oplus w$.
(c) The 28 obtained noisy codewords are applied to Hamming decoders to generate 28 11-bit data vectors.
(d) The outputs from all decoders are concatenated, then XORed with the corresponding helper data to generate a 308-bit stable response.
(e) Finally, the stable response bits are input into a SHA-256 hash function block to generate a 256-bit key.
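A minimal sketch of the code-offset flow just described, using a 3-fold repetition code with majority decoding in place of the Hamming(15, 11) code for brevity; the structure (encode, XOR with the PUF response, decode, hash) is the same, and all names and sizes here are illustrative:

```python
import hashlib
import secrets
import numpy as np

def encode(bits):                   # 1 data bit -> 3 code bits
    return np.repeat(bits, 3)

def decode(bits):                   # majority vote per 3-bit group
    return (bits.reshape(-1, 3).sum(axis=1) > 1).astype(np.uint8)

rng = np.random.default_rng(3)
r = rng.integers(0, 2, 420, dtype=np.uint8)                 # enrolment response

# Enrolment: random string x, helper data w = encode(x) XOR r
x = np.frombuffer(secrets.token_bytes(140), np.uint8) % 2   # random 140-bit string
w = encode(x) ^ r

# Reconstruction from a noisy re-reading r' of the same PUF (5 bit errors)
r_noisy = r.copy()
r_noisy[[0, 50, 100, 200, 300]] ^= 1
x_rec = decode(r_noisy ^ w)
key = hashlib.sha256(np.packbits(x_rec).tobytes()).hexdigest()
print((x_rec == x).all(), key[:16])
```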
The above scheme was implemented on an Altera DE2-115 FPGA board; it was tested by carrying out repeated readings of the PUF responses and estimating their respective Hamming distances from the original response. The design successfully regenerated the original response in every case in the presence of supply voltage variations. The obtained key has a uniformity of 49.6%, which is very close to the ideal value of 50%. It is worth noting here that, in practical applications, more rigorous analyses are needed to identify the realistic bit error rate at the output of the PUF, according to the operating conditions and expected sources of noise. In extreme scenarios, if the error rate is large, the cost of error correction may be prohibitive. In such cases, one solution is to adopt a multi-level optimisation approach to improve the reliability of the PUF; this may include using robust circuit design techniques and preprocessing approaches (e.g. ageing acceleration), in addition to error correction circuitry.
6.4.1 Motivation
(a) Corroborative evidence of its claimed identity which could have only been
generated by the entity itself.
(b) A proof that the entity was actively involved in generating this evidence at the time of authentication.
To achieve this, the entity needs to convince the authenticating authority that it
has exclusive access to secret information (e.g. a binary code), and demonstrate it
can generate a proof of this secret as and when requested.
Generally, there are two stages of an entity authentication scheme:
(a) Identity Provisioning: wherein each device obtains a unique identity such as a
serial number or a binary code.
(b) Verification Phase: wherein the verifier validates the identity of each entity.
(1) The verifier (V) or a trusted third party gives each entity (E) a unique secret key (k) and a unique identifier (ID).
(2) The verifier stores this information in its database.
(1) The entity starts the process by sending its (ID) to the verifier.
(2) The verifier looks up the key (k) corresponding to the specific E with the unique (ID).
(3) The verifier sends a random number (nonce) to the entity.
(4) The entity encrypts the nonce using the shared key (k), generating a response (ne), and sends it back to the verifier.
(5) The verifier decrypts the received response (ne) using the shared key (k).
(6) If the decryption process recovers the same nonce, then the entity is authenticated, because it has proved its possession of the key (k); otherwise, the authentication process fails.
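A minimal sketch of this nonce-based exchange, with HMAC standing in for the encryption step (any keyed primitive that proves possession of k would serve the same purpose):

```python
import hmac
import hashlib
import secrets

# Shared secret k, distributed during identity provisioning
k = secrets.token_bytes(16)

# Verifier side: issue a fresh nonce
nonce = secrets.token_bytes(16)

# Entity side: prove possession of k by keying the nonce
ne = hmac.new(k, nonce, hashlib.sha256).digest()

# Verifier side: recompute the expected value and compare
ok = hmac.compare_digest(ne, hmac.new(k, nonce, hashlib.sha256).digest())
print('authenticated' if ok else 'rejected')
```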
(1) It requires assigning each entity a unique binary key that satisfies strict randomness conditions (it should not be easy for an adversary to guess). This process, referred to as key provisioning, should take place before devices are deployed to the field or given to customers; this poses technical and logistical challenges and leads to increased production costs. For example, the vast majority of silicon chips are manufactured by third parties, who cannot necessarily be trusted with key provisioning or may not have the technical skills to generate a unique random key for each device.
(2) It requires the implementation of an encryption algorithm or a keyed hash function on each entity; this may be unaffordable in resource-constrained systems such as low-end IoT devices and RFID [12, 13].
It is worth noting that the above qualities of the PUF can only be evaluated with respect to the requirements of a specific authentication protocol. In other words, some protocols limit the number of challenge/response pairs used, which means it will not be possible for a snooping adversary to construct a model of the PUF using machine learning algorithms; this in turn relaxes the requirement of mathematical unclonability. Other applications can tolerate a higher rate of authentication failure, which means the reliability requirement can be relaxed.
This protocol can be used for a unilateral authentication, wherein a central authority
acts as the verifier and PUF-embedded devices as the entities. There are a number of
variations of this protocol that have been proposed, but they all have the following
basic procedure shown in Fig. 6.4 [14]:
(a) Enrolment
(1) The verifier or a trusted third party embeds a PUF circuit in each entity and gives it a unique identifier (ID).
(2) The verifier applies a large number of challenges to each PUF instance and records the corresponding responses.
(3) The verifier creates a secure database, in which it stores the IDs of all entities with their corresponding PUF challenge/response pairs.
(b) Verification
(1) False Acceptance Rate (FAR): the probability of the PUF of one physical entity generating a response identical to that of a PUF implemented on another entity; this leads to false identification.
(2) False Rejection Rate (FRR): the probability of a genuine physical entity generating an invalid response due to temporal noise.
Fig. 6.5 Intra- and inter-Hamming distance distributions of a 32-bit arbiter PUF
FRR and FAR can be estimated using $HD_{\mathrm{Inter}}$ and $HD_{\mathrm{Intra}}$ as follows [15]:

$\mathrm{FRR} = 1 - \sum_{i=0}^{t} \binom{M}{i}\, \hat{p}_{\mathrm{HD\,Intra}}^{\;i} \left(1-\hat{p}_{\mathrm{HD\,Intra}}\right)^{M-i}$ (6.2)

$\mathrm{FAR} = \sum_{i=0}^{t} \binom{M}{i}\, \hat{p}_{\mathrm{HD\,Inter}}^{\;i} \left(1-\hat{p}_{\mathrm{HD\,Inter}}\right)^{M-i}$ (6.3)

where $M$ is the number of PUF output bits (i.e. the bit length of its response per challenge), $t$ is the identification threshold, i.e. the maximum number of bit errors in the PUF response that can be tolerated without causing a false rejection, and $\hat{p}_{\mathrm{HD\,Intra}}$, $\hat{p}_{\mathrm{HD\,Inter}}$ are the binomial probability estimators of $HD_{\mathrm{Intra}}$ and $HD_{\mathrm{Inter}}$, respectively.
It is worth noting here that the above equations are only applicable if $HD_{\mathrm{Intra}}$ and $HD_{\mathrm{Inter}}$ have binomial distributions.
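Under this binomial assumption, Eqs. (6.2) and (6.3) are simply complementary binomial cumulative distribution values; a minimal sketch with hypothetical parameter values:

```python
from scipy.stats import binom

def frr_far(M, t, p_intra, p_inter):
    """FRR and FAR from Eqs. (6.2) and (6.3), assuming binomially
    distributed intra- and inter-chip Hamming distances."""
    frr = 1 - binom.cdf(t, M, p_intra)   # reject genuine response: > t bit errors
    far = binom.cdf(t, M, p_inter)       # accept foreign response: <= t differing bits
    return frr, far

# Hypothetical 64-bit response with 5% intra-chip and 50% inter-chip
# bit-difference probabilities, and a threshold of t = 10.
print(frr_far(M=64, t=10, p_intra=0.05, p_inter=0.5))
```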
Ideally, both FRR and FAR should be minimised; however, Eqs. (6.2) and (6.3) indicate that in order to minimise FRR, t needs to be as large as possible, so that bit flips caused by temporal noise do not lead to rejecting a genuine response. On the other hand, increasing t aggravates FAR, as it increases the probability of false identification. Therefore, in practice, t is chosen to balance FRR and FAR, in other words, to make them equal; this value of t is referred to as the equal error threshold ($t_{EER}$). For discrete distributions, it may not be possible to find such a value; instead, $t_{EER}$ is chosen as the threshold that minimises the difference between FRR and FAR.
To illustrate the impact that the choice of threshold value has on the admission and rejection rates, the following numerical examples are given:
Example 6.1 The challenge/response behaviour of a PUF is given in Table 6.1. This design is used in the authentication protocol depicted in Fig. 6.4. The PUF is expected to operate under four different environmental conditions (i.e. combinations of power supply and ambient temperature). The nominal conditions at which the PUF was enrolled are (T = 25 °C, Vdd = 1 V). It is assumed that all challenges listed in Table 6.1 have equal probabilities, and that all the listed environmental conditions are also equally probable.
(1) Compute the false rejection and admission rates assuming the verifier cannot
tolerate any errors, i.e. the threshold t = 0.
(2) Repeat the same computations above but with a threshold t = 1.
(3) Which of the threshold values produces the minimum equal error rate (EER)?
Solution:
(1) The PUF has four valid responses, listed next to the nominal conditions (T = 25 °C, Vdd = 1 V). To compute the false rejection rate, one needs to look for cases wherein the Hamming distance between the PUF output and any of the four valid responses is more than 0. There are eight such cases, marked in bold font in Table 6.2. As all table entries are assumed equally probable, the false rejection rate can be computed as follows:

$\mathrm{FRR} = \frac{8}{16} = 0.5$
To compute the false admission rate, one needs to look for cases wherein the PUF output has deviated from its expected response and generated another valid response. There are two such cases, marked in italic underlined font in Table 6.2; therefore

$\mathrm{FAR} = \frac{2}{16} = 0.125$

Table 6.2 False rejection (bold font) and false admission (italic underlined font) cases (t = 0)

Environment conditions    | c0 = 00  | c1 = 01  | c2 = 10  | c3 = 11
T = 25 °C, Vdd = 1.0 V    | 00000000 | 00000111 | 00111000 | 11111111
T = 75 °C, Vdd = 1.0 V    | 00000000 | 00000111 | 00110000 | 11111000
T = 25 °C, Vdd = 1.2 V    | 00000001 | 10000111 | 00100000 | 01111000
T = 75 °C, Vdd = 1.2 V    | 00000001 | 11000111 | 00000000 | 00111000

Table 6.3 False rejection (bold font) and false admission (italic underlined font) cases (t = 1)

Environment conditions    | c0 = 00  | c1 = 01  | c2 = 10  | c3 = 11
T = 25 °C, Vdd = 1.0 V    | 00000000 | 00000111 | 00111000 | 11111111
T = 75 °C, Vdd = 1.0 V    | 00000000 | 00000111 | 00110000 | 11111000
T = 25 °C, Vdd = 1.2 V    | 00000001 | 10000111 | 00100000 | 01111000
T = 75 °C, Vdd = 1.2 V    | 00000001 | 11000111 | 00000000 | 00111000
(2) In the case of a threshold value t = 1, the verifier can tolerate up to one bit error per response; therefore, to compute the false rejection rate, one needs to look for cases wherein the Hamming distance between the PUF output and any of the four valid responses is more than 1. There are two such cases, marked in bold font in Table 6.3; therefore

$\mathrm{FRR} = \frac{2}{16} = 0.125$

To compute the false admission rate, one needs to look for cases wherein the PUF output has deviated from its expected response by more than 1 bit but has a Hamming distance of less than 2 from another valid response. There are four such cases, marked in italic underlined font in Table 6.3; therefore

$\mathrm{FAR} = \frac{4}{16} = 0.25$
$\mathrm{EER}(t = 0) = 0.5, \qquad \mathrm{EER}(t = 1) = 0.25$

Hence, the threshold t = 1 produces the minimum equal error rate.
(1) What is the best threshold value that minimises the probability of a denial of
service?
(2) What is the best threshold value that minimises the probability of a forged
PUF being authenticated?
(3) What is the best threshold value that reduces both of the above risks?
Solution:
The basic PUF authentication scheme described above does not require the use of encryption, which significantly reduces the implementation costs of the prover hardware compared to classic schemes; however, it has two main drawbacks, namely:
The Slender PUF protocol was proposed in [16] to overcome some of the disadvantages of the basic PUF authentication protocol, in particular the need for the verifier to store a large number of challenge/response pairs for each device. It works as follows:
(1) The verifier or a trusted third party embeds a PUF circuit in each entity and gives it a unique identifier (ID).
(2) The verifier applies a large number of challenges to each PUF instance and records the corresponding responses; it then constructs a mathematical model of each PUF using machine learning algorithms.
(3) The verifier stores the IDs of all entities with their corresponding PUF software models.
(1) The entity starts the process by sending its ID and a random binary vector (nonce e) to the verifier.
(2) The verifier checks the ID and sends back another random binary vector (nonce v). Then, both the entity and the verifier concatenate the two nonce vectors to generate the seed (e, v).
(3) The entity uses a previously agreed-upon pseudorandom function (G) to generate a challenge based on the seed; it then applies the challenge c to its PUF instance and generates a response r of length m, where $r = \{b_0, b_1, \ldots, b_m\}$.
(4) The verifier uses the same pseudorandom function to generate the same challenge based on the seed, and then applies it to the PUF model corresponding to the entity's identifier (ID); this generates a response $r' = \{b'_0, b'_1, \ldots, b'_m\}$.
(5) The entity selects a substring of the response of length s, $\mathrm{sub}(r) = \{b_i, b_j, \ldots, b_s\}$, and sends it back to the verifier along with the indexes of the chosen bits, i.e. $\{i, j, \ldots, s\}$.
(6) The verifier computes the Hamming distance between the received substring $\mathrm{sub}(r)$ and the corresponding substring of its generated response $\mathrm{sub}(r')$. If the Hamming distance is smaller than the authentication threshold (t), the entity is authenticated; otherwise, the authentication request is denied.
The message flow is illustrated in the corresponding figure: the entity sends {ID, e}; the verifier replies with v; the entity computes c = G(e, v), generates r = {b0, b1, …, bm} and returns {bi, bj, …, bs} together with the chosen indexes; the verifier accepts if HD(sub(r), sub(r′)) < t and rejects otherwise.
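A minimal sketch of both sides of this exchange is given below. The pseudorandom function G, the toy PUF stand-in and all parameter values are assumptions made for illustration only; in particular, SHA-256 merely plays the role of the agreed function G.

import hashlib, random

def G(e: bytes, v: bytes, m: int) -> list:
    """Expands the seed (e, v) into an m-bit challenge (SHA-256 as a stand-in)."""
    seed, bits, counter = e + v, [], 0
    while len(bits) < m:
        block = hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        bits += [int(b) for byte in block for b in f"{byte:08b}"]
        counter += 1
    return bits[:m]

def prove(puf, e: bytes, v: bytes, s: int = 16):
    """Entity side: reveal a random substring of the response plus its indexes."""
    r = puf(G(e, v, m=64))
    idx = sorted(random.sample(range(len(r)), s))
    return [r[i] for i in idx], idx

def verify(puf_model, e: bytes, v: bytes, sub_bits, idx, t: int = 4) -> bool:
    """Verifier side: regenerate r' from the stored model, compare substrings."""
    r_model = puf_model(G(e, v, m=64))
    hd = sum(b != r_model[i] for b, i in zip(sub_bits, idx))
    return hd < t

# Demo with an idealised, noise-free PUF whose software model is exact.
puf = lambda c: [c[i] ^ c[-1 - i] for i in range(len(c))]   # toy stand-in
sub_bits, idx = prove(puf, b"nonce-e", b"nonce-v")
print(verify(puf, b"nonce-e", b"nonce-v", sub_bits, idx))   # True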
Both of the above approaches might increase the authentication time and consume significant computing resources on the verifier side. One might argue, however, that this is not a problem, since the verifier is typically assumed to be rich in resources.
A slightly different approach is discussed in [18], wherein the challenges are divided into two subsets, valid and invalid; the former are called the secret challenges (or s-challenges), and the number of valid challenges in this set should not be sufficient to allow an adversary to build a mathematical model.
6.5 Hardware-Assisted Cryptographic Protocols

6.5.1 Motivation
Secure multiparty computation protocols should satisfy the following requirements:

(1) No single party can learn about the inputs of the other parties from the output of the protocol.
(2) The input of each party should be independent of those of the other parties.
(3) The output is accessible only to the authorised parties, as described by the protocol.
However, protocols that meet such requirements do not typically have efficient implementations in practice [19]. This has given rise to hardware-assisted cryptographic protocols, which rely on the use of tamper-proof hardware tokens to help achieve the strong security guarantees set out in Canetti's universal composition (UC) framework [20]. In this type of protocol, the trust between communicating parties is established through the exchange of trusted hardware tokens. One of the first examples of such schemes was presented in [21], wherein government-issued signature cards are used to generate public/private key pairs for digital signature schemes. A second example is the use of smart card-based schemes in private data mining applications [19]. A third example is the use of a secure memory device that restricts the number of times its contents can be accessed; this functionality is particularly useful in applications such as intellectual property protection of software [22].
In this context, physically unclonable functions have been proposed as a suitable
technology for the construction of hardware-assisted secure protocols due to their
complex challenge/response behaviours, which make them intrinsically more
tamper-resistant than other hardware tokens that rely on digitally stored information
(e.g. smart cards or secure memory devices) [23].
This section describes three hardware-assisted PUF-based cryptographic protocols, namely: key exchange, oblivious transfer and bit commitment.
6.5.2 Key Exchange (KE) Schemes

These protocols are used to allow the sharing of secret keys among two or more authorised parties for subsequent use in secure communication sessions. One of the earliest PUF-based KE protocols was proposed in [24]; it assumes the existence of a secure physical transfer mechanism for the PUF. A similar approach was presented in [25]; it is based on the universal composition framework [20], wherein a key-carrying PUF is physically transferred from a sender to a receiver; in this case, one-sided authentication is required in order to prevent an adversary who gains temporary access to the PUF from impersonating the sender. A more elaborate scheme was presented in [26]; it is explained in more detail below and illustrated in Fig. 6.7.
(1) Bob applies two challenges c1, c2 to a PUF and obtains the corresponding responses r1, r2.
(2) The PUF is then physically transferred to Alice.
(3) Alice acknowledges the reception of the PUF using an insecure binary channel.
(4) Bob sends (c1, r1) and c2 to Alice.
(5) Alice applies c1 to the received PUF and gets r1′.
(6) Alice compares r1′ and r1. If they are different, she terminates the communication; otherwise, she proceeds to step 7.
(7) Alice applies the challenge c2 to generate r2.
(8) Bob and Alice use r2 to derive a shared secret.
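Alice's side of steps 5-8 can be sketched in a few lines of Python. The key derivation function is not specified by the protocol above, so SHA-256 is used here purely as an illustrative stand-in, and the lookup-table PUF is an idealisation:

import hashlib

def alice_derive_key(alice_puf, c1, r1, c2):
    """Verify the received PUF against (c1, r1), then derive the key from r2."""
    if alice_puf(c1) != r1:              # step 6: r1' differs, possible tampering
        return None                      # terminate the communication
    r2 = alice_puf(c2)                   # step 7
    return hashlib.sha256(r2).digest()   # step 8 (KDF choice is an assumption)

# Demo: the transferred PUF is modelled as a fixed lookup table.
puf_table = {b"c1": b"r1-bits", b"c2": b"r2-bits"}
puf = puf_table.get
bob_key = hashlib.sha256(puf(b"c2")).digest()            # Bob derives from his r2
alice_key = alice_derive_key(puf, b"c1", b"r1-bits", b"c2")
assert alice_key == bob_key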
6.5.3 Oblivious Transfer (OT) Schemes

An oblivious transfer protocol in its simplest form enables a sender to transfer one or multiple data items to a receiver while remaining oblivious to which pieces of information have been sent (if any). The OT protocol was first proposed in [27], wherein a sender transfers a message to a receiver with 50% probability, without knowing whether or not the message was delivered. Another form of this scheme, called 1-of-2 oblivious transfer, was proposed later in [28]. The latter allows one party (Bob) to retrieve one of two possible pieces of information from another party (Alice), such that Bob gains no knowledge of the piece of data he has not retrieved, and Alice does not learn which of the two data items she holds has been transferred to Bob. The 1-of-2 OT was later generalised to k-of-n OT [29]. Oblivious transfer protocols can help realise a number of important cryptographic applications, including zero-knowledge proofs and bit commitment schemes.
This section explains how PUFs can be used to construct an oblivious transfer protocol using the 1-of-2 OT scheme proposed in [25]. The protocol described below requires a PUF with a large number of CRPs and an authenticated channel; it is run between two players: a sender (Alice) and a receiver (Bob).
Fig. 6.7 The PUF-based key exchange protocol presented in [26]
At the start, Alice holds two secrets b0, b1 ∈ {0, 1}^c, and Bob makes a secret choice s ∈ {0, 1}. After the protocol is executed, Bob will have learned one of Alice's two secrets (bs) without gaining any knowledge of the other item, and Alice will have learned nothing about Bob's choice (s). The protocol consists of two stages, as explained in detail below:
(a) Setup stage:

(1) Bob applies a set of challenges {c1, c2, …, ck} to the PUF and stores the resulting challenge/response pairs in a database (DB); the PUF is then physically transferred to Alice.

(b) Transfer stage:

(1) Alice generates two random values (x0, x1) and sends them to Bob.
(2) Bob picks a challenge/response pair (c, r) from the database (DB) and sends the value v = c ⊕ xs to Alice. For simplicity, let us assume that Bob's secret choice was (s = 0).
(3) Alice applies the following two challenges to the PUF, {c0 = v ⊕ x0, c1 = v ⊕ x1}, and records the corresponding responses {r0, r1}. In this case

c0 = v ⊕ x0 = c ⊕ xs ⊕ x0 = c ⊕ x0 ⊕ x0 = c
c1 = v ⊕ x1 = c ⊕ x1 ⊕ x0

(4) Alice sends the two masked values {r0 ⊕ b0, r1 ⊕ b1} to Bob, who unmasks his chosen secret using his recorded response:

b(s=0) = r0 ⊕ b0 ⊕ r = b0

(Note that r0 = r, as they are both responses to the same challenge c.) The protocol is illustrated in Fig. 6.8.
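To make the XOR bookkeeping of these steps concrete, here is a minimal, self-contained sketch. The bit width, the secret values and the idealised noise-free lookup PUF are all illustrative assumptions:

import secrets

W = 32                                    # challenge/response width (illustrative)

def make_puf():
    """Idealised, perfectly reliable PUF: a lazily populated random lookup."""
    table = {}
    def puf(c: int) -> int:
        if c not in table:
            table[c] = secrets.randbits(W)
        return table[c]
    return puf

puf = make_puf()

# Setup stage: Bob measures one pair (c, r) and stores it in his database.
c = secrets.randbits(W)
r = puf(c)

# Transfer stage (the PUF is now physically with Alice).
b0, b1 = 0xAAAA, 0x5555                   # Alice's secrets
s = 0                                     # Bob's secret choice
x0, x1 = secrets.randbits(W), secrets.randbits(W)    # step 1 (Alice)
v = c ^ (x0 if s == 0 else x1)            # step 2 (Bob): v = c XOR x_s
r0, r1 = puf(v ^ x0), puf(v ^ x1)         # step 3 (Alice): c0 = v XOR x0, etc.
m0, m1 = r0 ^ b0, r1 ^ b1                 # step 4 (Alice): masked secrets
recovered = (m0 if s == 0 else m1) ^ r    # Bob unmasks with his stored r
assert recovered == (b0 if s == 0 else b1)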
It should be noted here that the protocol assumes that the responses generated by Alice in step b-3 are identical to those that would have been measured in step a-1. This means the reliability of the PUF responses should be guaranteed; ideally, the PUF should have 100% reliability (please refer to Chap. 4 for more information on reliability enhancement techniques for PUFs).
The above protocol also assumes that it is very unlikely that Bob can learn anything about the other secret (b1), because, with an overwhelming probability, he would not have measured (c1, r1) during the setup stage.
Fig. 6.8 The PUF-based 1-of-2 oblivious transfer protocol: in the setup stage, Bob records k challenge/response pairs {(c1, r1), …, (ck, rk)} in a database; in the transfer stage, Alice sends {x0, x1}, Bob replies with v = c ⊕ xs, Alice queries the PUF with c0 = v ⊕ x0 and c1 = v ⊕ x1 and returns the masked values r0 ⊕ b0 and r1 ⊕ b1
To demonstrate this point, let us assume the PUF has (n) possible challenge/response pairs, out of which Bob collects a subset of size (k) in step a-1. The probability of Bob obtaining (c1), denoted as P, is the product of two probabilities: the first is the probability of Bob having recorded (c1, r1) in step a-1, denoted as P1, and the second is the probability of Bob guessing c1, assuming it exists in his database (DB), denoted as P2 = 1/k.
To compute P1, the total number of possible subsets of size (k) in a set of size (n) is calculated using the binomial coefficient as follows:

C(n, k) = n!/[(n − k)! k!]    (6.6)

In the same manner, it is possible to compute the number of subsets which can be formed using all the challenges excluding c1, as follows (assuming all challenges are equally probable):

C(n − 1, k) = (n − 1)!/[(n − 1 − k)! k!]    (6.7)

The above two equations allow the computation of the probability that Bob may have recorded c1 by chance, as given below:

P1 = C(n − 1, k)/C(n, k) = (n − k)/n    (6.8)

P = P1 × P2 = (n − k)/(nk) = (1 − k/n)/k    (6.9)
Let us give a numerical example of how to compute P.

Example 6.3 Calculate the probability of Bob guessing c1, hence computing b1, as a result of the execution of the 1-of-2 OT protocol described above. It is assumed that the PUF used has a total of n = 2^32 challenge/response pairs, of which k = 2^20 are recorded by Bob in the setup stage of the protocol.

Solution:
Using Eq. (6.9), this probability can be computed as follows:

P = (n − k)/(nk) = (2^32 − 2^20)/2^(32+20) ≈ 9.53 × 10^−7
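This value is easy to confirm numerically; the snippet below simply evaluates Eq. (6.9):

n, k = 2**32, 2**20
P = (n - k) / (n * k)        # (n - k)/(n k) = (1 - k/n)/k
print(f"{P:.3g}")            # ~9.53e-07, matching the value above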
6.5.4 Bit Commitment (BC) Schemes

A bit commitment scheme can be constructed on top of the PUF-based OT protocol described above, as follows:

(a) Commit phase:

(1) The BC-sender (Bob) acts as an OT-receiver and uses his secret choice (s), where s ∈ {0, 1}, as an input to the PUF-OT protocol described above.
(2) The BC-receiver (Alice) acts as an OT-sender and uses her secret values (b0, b1), where b0, b1 ∈ {0, 1}^c, as inputs to the OT protocol.
(3) The OT protocol is then run, which allows Bob to learn one of Alice's secrets (i.e. bs).

(b) Reveal phase:

(1) Bob sends Alice the binary string (s, bs); the fact that Bob was able to compute bs proves his previous commitment to the secret choice s.
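The commit/reveal flow reduces to a thin layer over the OT primitive. In the sketch below, ot() is an idealised stand-in for a full PUF-OT run (in a real execution it would hide b_{1-s} from Bob and s from Alice); everything here is illustrative:

import secrets

def ot(b0: int, b1: int, s: int) -> int:
    """Idealised 1-of-2 OT: the receiver with choice s learns only b_s."""
    return b0 if s == 0 else b1

# Commit phase: Bob (BC-sender) inputs his choice s as OT-receiver;
# Alice (BC-receiver) inputs two fresh random values as OT-sender.
s = 1                                   # Bob's secret bit
b0, b1 = secrets.randbits(128), secrets.randbits(128)
bs = ot(b0, b1, s)                      # Bob now holds exactly one of them

# Reveal phase: Bob discloses (s, bs); Alice checks it against her values.
assert bs == (b0, b1)[s]                # proves Bob's earlier commitment to s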
6.6 Remote Secure Sensors

6.6.1 Motivation
The general concept of a wireless sensor network is that of deploying many small, inconspicuous, self-contained sensor nodes into an environment to collect and transmit information, and possibly provide localized actuation. Potential uses for such networks include medical applications; structural monitoring of buildings; status monitoring of machinery; environmental monitoring; military tracking; security; wearable computing; aircraft engine monitoring; and personal tracking and recovery systems [33–35].
Security is an important requirement in some of these applications, such as sensors used in remote health monitoring systems that handle patients' private data. Other security-sensitive applications include nuclear and chemical material tracking systems. Existing solutions for secure remote sensing rely on the use of a cryptographic block that performs data encryption/authentication tasks, as shown in Fig. 6.9 [36].
This solution, however, suffers from two shortcomings. First, the data output of the sensing element is not protected before it enters the cryptographic block, so in principle, an adversary who can gain physical access to the sensor can read/manipulate the sensor measurements using invasive physical attacks (e.g. directly probing the internal signals) or non-invasive side channel analysis [37]. The second disadvantage of the existing design is its reliance on the use of classic cryptographic primitives such as symmetric ciphers or hash functions, which may be prohibitively expensive or not even feasible in the case of resource-constrained sensors [12].
The technology of physically unclonable functions provides an alternative
approach for designing secure sensors, which does not require a separate crypto-
graphic module. Several PUF-based secure sensing schemes have already been
proposed in the literature [15, 38, 39]. The basic working principles of a PUF-based
secure sensor will be explained in the following section.
where

v_min is the minimum supply voltage at which the PUF circuit can still generate an output;
v_min + (k − 1)Δv is the maximum supply voltage that can be applied to the PUF circuit without damaging it;
Δv is the minimum step change in the supply voltage needed to cause a one-bit change in the PUF response.

It is worth noting here that the minimum/maximum values of the voltage supply are a function of the implementation technology; therefore, one should consider technologies which allow a large range of operating voltages. For simplicity, we assume that changes in the PUF responses due to ambient temperature are negligible. The implementation procedure can now be outlined as follows:
(a) Enrolment

(1) For each PUF, a number of challenges are applied and their corresponding responses are measured.
(2) The above experiment is repeated for (k) different supply voltages.
(3) The server also needs to establish the mapping between the physical quantity to be measured and the supply voltage applied to the PUF.
(4) For each PUF sensor, the central server creates a database that includes the challenges, their responses at the different supply voltages and the values of the physical quantity corresponding to each voltage level. Exemplar characterization data are shown in Table 6.5, which includes the entries for two challenges (c0, c1) applied to a PUF instance.
(b) Sensing

(1) The server transmits a challenge (c) from its database to a PUF sensor.
(2) The PUF generates a response (r), which is a function of the received challenge and the physical quantity being measured, and sends it back to the server.
(3) The server uses the received response to find the corresponding supply voltage, which is then mapped to the physical quantity being measured (air pressure, chemical substance, etc.).
(4) The server may delete the used challenge in order to protect against replay attacks.

This scheme assumes that only the server has access to the PUF characterization data, so responses can be transmitted without encryption.
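The server-side lookup in steps 3-4 can be sketched as follows. The database layout, the example values and the use of a small Hamming-distance tolerance t (to absorb noise-induced bit flips) are illustrative assumptions:

def sense(db, challenge, received, t=1):
    """Server side of the sensing stage: match the received response against
    the enrolled responses at each supply voltage, then map voltage to PQ."""
    def hd(a, b):
        return sum(x != y for x, y in zip(a, b))
    for voltage, (enrolled, quantity) in db[challenge].items():
        if hd(enrolled, received) <= t:   # tolerate t noise-induced bit flips
            db.pop(challenge, None)       # step 4: delete the used challenge
            return voltage, quantity
    return None                           # no voltage level matches: reject

# db[c][v] = (response enrolled at supply voltage v, mapped physical quantity)
db = {"01": {0.8: ("110010", 10.0), 0.9: ("100110", 12.5), 1.0: ("001110", 15.0)}}
print(sense(db, "01", "100111"))          # one bit of noise -> (0.9, 12.5)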
This section discusses the qualities a PUF design should have to be suitable for
secure sensing applications.
Let us consider the architecture shown in Fig. 6.10; it assumes that the change of the PUF response to a certain challenge reflects only the purposely induced fluctuations of the supply voltage (Δv), hence the physical quantity being measured (PQ). Such an assumption, however, ignores the fact that the PUF circuitry may still be susceptible to natural variations in ambient parameters (e.g. temperature, supply voltage) and to other sources of temporal noise, as discussed in Chap. 3. Therefore, in order to ensure a PUF design is suitable, it needs to be made sensitive to the purposely induced supply voltage change (Δv) but resilient to all other forms of noise. In order to evaluate the suitability of a PUF design for a secure sensor application, the following two metrics will be used:
1. The InterPQ distance (HD_InterPQ), which refers to the Hamming distance between two PUF responses to the same challenge evaluated at two supply voltages which differ by (Δv), but under the same noise conditions.
2. The IntraPQ distance (HD_IntraPQ), which refers to the Hamming distance between two PUF responses to the same challenge evaluated at the same nominal supply voltage, but under different noise conditions.
Based on the above metrics, one can deduce the minimum requirement a PUF needs to satisfy to be suitable for a sensor application as follows:

∀ r_v1, r_v2 ∈ R_v:  min(HD_InterPQ) > 1    (6.11)
This condition basically means that the minimum Hamming distance between PUF responses to the same challenge under different supply voltages should be more than one; it should also be larger than the maximum number of response bit flips caused by temporal noise.
Another desirable quality in a PUF sensor is uniqueness, so that if one device is compromised (e.g. its behaviour is learned by an adversary), the others remain secure. Ideally, the uniqueness should be 50%.
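The usability condition can be checked mechanically on measured data. The sketch below is illustrative: the data layout (one response per voltage level, all for the same challenge) and the sample bit strings are assumptions:

def hd(a: str, b: str) -> int:
    return sum(x != y for x, y in zip(a, b))

def usable(responses: dict, dv: float, max_noise_flips: int) -> bool:
    """True if responses at voltages dv apart always differ by more than one
    bit AND by more than the worst-case number of noise-induced flips."""
    volts = sorted(responses)
    pairs = [(v1, v2) for v1, v2 in zip(volts, volts[1:])
             if abs((v2 - v1) - dv) < 1e-9]          # adjacent levels, dv apart
    min_inter = min(hd(responses[v1], responses[v2]) for v1, v2 in pairs)
    return min_inter > max(1, max_noise_flips)

responses = {0.8: "110010", 0.9: "001101", 1.0: "110011"}
print(usable(responses, dv=0.1, max_noise_flips=2))   # True for these data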
Let us now give a numerical example to illustrate how the sensing usability condition can be verified.

Example 6.4 Consider a PUF circuit that has two inputs and six outputs, whose challenge/response pairs at various supply voltages are shown in Table 6.6. Explain whether or not this circuit can be employed as a sensor capable of differentiating between five distinct levels of the supply voltage. It is assumed here that Δv = 0.1 V, and that the maximum change of response bits due to temporal noise is 2.
Solution:
To establish the suitability of the PUF for sensor applications, we refer to Definition 6.1. First, we need to compute the Hamming distance between each pair of responses for each challenge.

For c = 00, we have

min(HD_InterPQ(r_0.8, r_0.9)) = 0 < 1

which means the PUF is not sensitive to the minimum step change of the supply voltage for this challenge; therefore, we need to try other challenges.

For c = 01, we have

HD_InterPQ(r_0.8, r_1.0) = 2
HD_InterPQ(r_0.9, r_1.0) = 4
This section provides an exemplar case study of a PUF sensor design using a 32-bit arbiter PUF. There are five main requirements that need to be defined at this stage:

(1) The minimum step change in the supply voltage (Δv), which is a function of the resolution of the transcoder output.
(2) The number of distinct levels of the physical quantity to be measured (k).
(3) The upper bound on implementation costs (area, energy, etc.).
(4) The false acceptance rate (FAR): the probability of generating a valid response for the wrong PQ value; an example of such a case is when the PUF generates the same response for two different PQ values.
(5) The false rejection rate (FRR): the probability of generating an invalid response for a genuine PQ value; this is typically due to external noise.
FRR and FAR can be estimated using HD_IntraPQ and HD_InterPQ as follows [15]:

FRR = 1 − Σ_{i=0..m} C(M, i) · p̂_IntraPQ^i · (1 − p̂_IntraPQ)^(M−i)    (6.12)

FAR = Σ_{i=0..m} C(M, i) · p̂_InterPQ^i · (1 − p̂_InterPQ)^(M−i)    (6.13)
wherein

M is the number of PUF output bits (i.e. the bit length of its responses per challenge);
m is the maximum number of noise-induced errors in a PUF response which does not cause a false rejection, in other words, the allowed error margin;
p̂_IntraPQ, p̂_InterPQ are the binomial probability estimators of HD_IntraPQ and HD_InterPQ, respectively.

It is worth noting here that the above equations assume that HD_IntraPQ and HD_InterPQ have binomial distributions.
Ideally, both FRR and FAR need to be minimised; however, Eqs. (6.12) and (6.13) indicate that, in order to minimise the FRR, m needs to be as large as possible, so that the bit flips caused by temporal noise do not lead to rejecting a genuine response. On the other hand, increasing m can aggravate the FAR, because it increases the probability of obtaining the same response from different physical quantities. Therefore, in practice, m is chosen to balance FRR and FAR, in other words, to make them equal. This value of m is referred to as the equal error threshold (m_EER). For discrete distributions, it may not be possible to find such a value; instead, m_EER is chosen as the value of m at which the difference between FRR and FAR is smallest, as sketched below.
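The following sketch implements Eqs. (6.12)-(6.13) and the equal-error search for m; the probability estimates used in the demo are illustrative placeholders, not values from the book's experiments:

from math import comb

def frr(M, m, p_intra):
    return 1 - sum(comb(M, i) * p_intra**i * (1 - p_intra)**(M - i)
                   for i in range(m + 1))

def far(M, m, p_inter):
    return sum(comb(M, i) * p_inter**i * (1 - p_inter)**(M - i)
               for i in range(m + 1))

def m_eer(M, p_intra, p_inter):
    """Error margin that brings FRR and FAR closest together (the
    distributions are discrete, so exact equality may be impossible)."""
    return min(range(M + 1), key=lambda m: abs(frr(M, m, p_intra) -
                                               far(M, m, p_inter)))

M, p_intra, p_inter = 64, 0.03, 0.45       # illustrative estimates
m = m_eer(M, p_intra, p_inter)
print(m, frr(M, m, p_intra), far(M, m, p_inter))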
The goal of this stage is to find the best implementation of the PUF such that all design specifications are met. The arbiter PUF has been chosen due to the relative ease of its design. Figure 6.11 shows an illustration of an arbiter PUF with 32 inputs (N = 32) and one output (M = 1); it consists of two delay paths that have the same layout, hence the same nominal latency. The selection inputs (i.e. the challenges) create different configurations of these two paths; for each challenge, the delays of the signal (X) through the two chosen paths are compared to produce a response (Y).
The results outlined in Table 6.7 indicate that the best solution out of the three considered options is N = 32, M = 64.

Table 6.7 The impact of response bit length on the FAR/FRR of an arbiter PUF-based sensor

N     M      m_EER    FAR            FRR
32    32     7        5.3 × 10⁻⁴     5.5 × 10⁻⁴
32    64     7        1.01 × 10⁻⁵    1.05 × 10⁻⁵
32    128    8        8.6 × 10⁻⁶     9.7 × 10⁻⁶
1. The design house embeds a lock mechanism at the register transfer level (RTL). There are a number of approaches to designing such a lock; in the case of sequential digital circuits, this can be done by adding non-functional states to the original state machines. Take, for example, the state machine shown in Fig. 6.12: the grey states represent the actual design, and the black states represent the redundant non-functional states. These additional states serve two purposes: first, they obfuscate the original functionality of the design, hence making it harder to understand and copy; second, they can be used to lock the design in a non-functional state, such that only the designer or an authorised party has the knowledge to bring the design into a functional state by applying the correct sequence of inputs.
2. The design house embeds a PUF in each chip as shown in Fig. 6.13; the outputs of the PUF are used to initialise the internal flip-flops of the design (i.e. to place the design in an initial state). The output of the PUF is random, so all the states in the design can become the initial state with equal probability; therefore, in order to increase the probability that a design starts in a non-functional state (e.g. a black state in Fig. 6.12), the number of redundant states should be significantly larger than the number of functional states (e.g. the grey states in Fig. 6.12). This does mean there will be a large area overhead, but this extra cost is the price to pay to ensure post-fabrication control and limit illegal chip overproduction.
In addition to the extra flip-flops incurred by adding the extra states, the increase in area is caused by the additional multiplexer and the PUF circuits. However, this can be reduced by reusing some existing functional blocks, such as the multiplexers used for on-chip testing purposes (e.g. the scan path), which each chip normally has.

Fig. 6.13 Circuit architecture of a PUF-initialised synchronous sequential logic based on a Mealy state machine
3. The design house sends the post-layout files (e.g. GDSII files) to the fabrication facility, along with a specific challenge for the PUF; these files are typically in a non-readable format.
4. Once the chips are fabricated, the manufacturer reads out the response of the PUF to the designer's challenge from each device and sends these back to the design house.
5. The design house computes a key to unlock each chip based on its PUF response and sends it back to the manufacturer.
6. For each chip (see Fig. 6.13), the manufacturer powers up the device, applies the received challenge to the PUF, sets (M = 1) and applies one clock cycle. This will set the design in the PUF-generated initial state.
7. The manufacturer then sets (M = 0), applies the key to the primary inputs and clocks the design. This should drive the design into one of the functional states.
8. The chip is now ready for testing.
The above unlocking procedure needs to be applied only once to provide protection against chip overproduction; therefore, the designer needs to ensure that the design can never go back to a locked state after it has been unlocked. One way to achieve this is to store the unlocking key for each device in a non-volatile memory on chip, such that each time the device is powered up, the internal state is automatically initialised by the PUF to lock the design; the stored key is then automatically applied to unlock it. In this case, there is no need to encrypt or otherwise protect the unlocking key for each chip, because each device has its own key, which is useless on other devices.
It is worth noting here that the above approach assumes the PUF is able to produce the same response consistently, which implies there may be a need for additional reliability-enhancement techniques, as discussed in Chap. 4, such as error correction codes and stable-bit selection.
We are now going to give an illustrative example using a sequential circuit.
Example 6.5 The state machine in Fig. 6.14 represents a synchronous sequence detector; it has one output and one input in addition to the clock input. It works as follows: the output generates a pulse every time the 4-bit sequence (0111) is applied serially on its input.

(1) How many flip-flops are needed to represent all the states in this design?
(2) Devise a new state machine which obfuscates the original functionality of this design, such that the total number of additional flip-flops needed does not exceed 2.
Solution:
(1) This design has four states; therefore, two flip-flops are needed to represent all the states.
(2) Figure 6.15 shows an exemplar obfuscated design for the same circuit; we have added a total of 12 redundant states, so in this case four flip-flops are needed to represent the internal states. It is worth noting that in this solution the output of the circuit remains low if the design is in one of the non-functional states (S4–S15). In this case, the probability of the design accidentally starting in a functional state is 25%, which is still very high. To make the design more secure, one needs to add more flip-flops.
Example 6.6 The obfuscated state machine in Fig. 6.15 is adopted to develop a circuit architecture for a PUF-initialised synchronous sequence detector, as shown in Fig. 6.13. Binary coding is used to represent each state (e.g. S1 is coded as 0001, S2 as 0010, etc.).
(1) The PUF needs to have a 4-bit response length in order to initialise the state-holding flip-flops.
(2) For Device 1: the PUF response is 1111, which means the design is going to be initialised to the non-functional state (S15). Looking at the state machine in Fig. 6.15, the following sequence of inputs needs to be applied (read from left to right): 1010; this can be used as the unlocking key. To clarify: the first bit of this sequence brings the design from S15 to S11, the second bit from S11 to S7, the third bit from S7 to S6, and the last bit from S6 to S0.
The unlocking keys for the other two devices can be computed in the same manner, as sketched below.
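Deriving such a key amounts to searching the obfuscated state graph for an input sequence that reaches a functional state. The sketch below encodes only the transitions quoted above; the rest of Fig. 6.15 is omitted, so the table is a partial, illustrative one:

transitions = {                      # (state, input bit) -> next state
    ("S15", 1): "S11", ("S11", 0): "S7",
    ("S7", 1): "S6",   ("S6", 0): "S0",
}
functional = {"S0", "S1", "S2", "S3"}    # the original (grey) states

def unlock_key(start: str) -> str:
    """Search input sequences that drive the design into a functional state."""
    from itertools import product
    for length in range(1, 8):
        for bits in product((0, 1), repeat=length):
            state = start
            for b in bits:
                state = transitions.get((state, b))
                if state is None:
                    break
            if state in functional:
                return "".join(map(str, bits))
    return ""

print(unlock_key("S15"))   # -> "1010", the key computed for Device 1 above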
6.9 Conclusions
• PUF technology can be used to construct secure remote sensors, wherein the physical quantity to be measured is converted into a voltage; the latter modulates the power supply of the PUF circuit. Such usage requires the PUF design to be capable of functioning correctly over a range of supply voltages and demonstrating distinctively measurable challenge/response behaviour in each case. Ongoing work in this area focuses on developing PUF architectures which can provide correct measurements in spite of temporal noise and normal fluctuations in operating conditions; in other words: how can one design a PUF circuit that is sensitive to purposely induced power supply fluctuations yet resilient to temperature and voltage variations?
• PUF technology is increasingly adopted in advanced security protocols such as key exchange schemes and oblivious transfer techniques.
• Other PUF applications include anti-counterfeiting of ICs, intellectual property protection and software licensing.
• In summary, PUF technology has great commercial potential, but there are still numerous design challenges that need to be overcome before such potential can be converted into actual products.
6.10 Problems
(1) A single PUF is used to generate a 128-bit encryption key, wherein a BCH
code (n, k, t) = (15, 7, 2) is employed to generate the helper data.
(a) How many raw PUF response bits are needed assuming the required
secrecy rate is 0.9?
(b) How many challenges would you need to apply to generate those raw bits
given the PUF has a 32-bit response length?
(2) The challenge/response behaviour of a PUF is given in Table 6.8. The design is used in the authentication protocol depicted in Fig. 6.4. The PUF is expected to operate under six different environment conditions (i.e. degrees of power supply fluctuation and ranges of ambient temperature). The nominal conditions at which the PUF was enrolled are (25 °C, V = 1 V). It is assumed that all challenges listed in Table 6.8 have equal probabilities, and that all the listed environment conditions are equally probable.
(a) What is the best threshold value that minimises the probability of a denial
of service?
(b) What is the best threshold value that minimises the probability of a
forged PUF being authenticated?
(c) What is the authentication threshold value (t) that produces the minimum equal error rate (EER)?
(a) What is the minimum number of raw PUF bits needed to obtain this key
assuming the secrecy rate is 0.85?
(b) Table 6.10 gives two possible error correction codes with their associated area. Compute the amount of helper data needed by each code if it is to be used as part of the key generation scheme.
(c) Which of the codes listed in Table 6.10 leads to a smaller area overhead of the overall key generation scheme?
(6) Which of the following attacks is the most likely threat to the PUF-based oblivious transfer protocol?
(7) The false rejection/admission rates of a basic PUF authentication protocol (as shown in Sect. 6.4.3) are 1.2 × 10⁻⁶ and 2.3 × 10⁻⁶, respectively. A code-offset scheme is employed to improve the reliability of the used PUF design. How does the use of this scheme affect the false rejection/admission rates?
(8) A ring oscillator PUF is used in a secure sensing application (as shown in Sect. 6.6.2), wherein the false rejection/admission rates are 3.5 × 10⁻⁶ and 4.9 × 10⁻⁶, respectively.
(9) The state machine in Fig. 6.14 represents a synchronous sequence detector; it has one output and one input in addition to the clock input. It works as follows: the output generates a pulse every time the 4-bit sequence (0111) is applied serially on its input.

(a) Devise a new state machine which obfuscates the original functionality of this design, such that the total number of additional state-holding flip-flops needed is 3.
(b) What is the probability of the design accidentally starting in a functional state?
(c) A PUF with a 5-bit response length is used to initialise the design. Compute the unlocking keys of the following devices based on your solution for question 9-a.
References
1. P. Stewin, I. Bystrov, Understanding DMA Malware, ed. by U. Flegel, E. Markatos,
W. Robertson. Revised Selected Papers Detection of Intrusions and Malware, and
Vulnerability Assessment: 9th International Conference, DIMVA 2012, Heraklion, Crete,
Greece, July 26–27, 2012, (Berlin: Springer, 2013), pp. 21–41
2. S. Skorobogatov, in Data remanence in flash memory devices. Presented at the Proceedings of
the 7th International Conference on Cryptographic Hardware and Embedded Systems,
Edinburgh, UK, 2005
3. J.A. Halderman, S.D. Schoen, N. Heninger, W. Clarkson, W. Paul, J.A. Calandrino et al., Lest
we remember: cold-boot attacks on encryption keys. Commun. ACM 52, 91–98 (2009)
4. S.K. Mathew, S.K. Satpathy, M.A. Anders, H. Kaul, S.K. Hsu, A. Agarwal, et al., 16.2 A
0.19pJ/b PVT-variation-tolerant hybrid physically unclonable function circuit for 100% stable
secure key generation in 22 nm CMOS in 2014 IEEE International Solid-State Circuits
Conference Digest of Technical Papers (ISSCC), (2014), pp. 278–279
5. M.T. Rahman, F. Rahman, D. Forte, M. Tehranipoor, An aging-resistant RO-PUF for reliable
key generation. IEEE Trans. Emerg. Top. Comput. 4, 335–348 (2016)
6. Z. Paral, S. Devadas, in Reliable and efficient PUF-based key generation using pattern
matching. 2011 IEEE International Symposium on Hardware-Oriented Security and Trust,
2011, pp. 128–133
7. R. Maes, Physically unclonable functions: constructions, properties and applications
(Springer, Berlin, 2013)
8. Development and Education Board (2017), available: https://www.altera.com/solutions/
partners/partner-profile/terasic-inc-/board/altera-de2-115-development-and-education-board.
html
9. J. Guajardo, S.S. Kumar, G.-J. Schrijen, P. Tuyls, in FPGA intrinsic PUFs and their use for IP protection. International Conference on Cryptographic Hardware and Embedded Systems, (2007), pp. 63–80
10. T. Ignatenko, G.J. Schrijen, B. Skoric, P. Tuyls, F. Willems, in Estimating the secrecy-rate of
physical unclonable functions with the context-tree weighting method. 2006 IEEE Interna-
tional Symposium on Information Theory, 2006, pp. 499–503
11. Y. Dodis, L. Reyzin, A. Smith, in Fuzzy extractors: how to generate strong keys from
biometrics and other noisy data, ed. by C. Cachin and J. L. Camenisch. Proceedings on
Advances in Cryptology—EUROCRYPT 2004: International Conference on the Theory and
Applications of Cryptographic Techniques, Interlaken, Switzerland, May 2–6, 2004, (Berlin:
Springer, 2004), pp. 523–540
12. S. Satpathy, S. Mathew, V. Suresh, R. Krishnamurthy, in Ultra-low energy security circuits
for IoT applications. 2016 IEEE 34th International Conference on Computer Design (ICCD),
(2016), pp. 682–685
13. T. Xu, J. B. Wendt, M. Potkonjak, in Security of IoT systems: design challenges and
opportunities. 2014 IEEE/ACM International Conference on Computer-Aided Design
(ICCAD), (2014), pp. 417–423
14. B. Halak, M. Zwolinski and M. S. Mispan, Overview of PUF-based hardware security
solutions for the internet of things, 2016 IEEE 59th International Midwest Symposium on
Circuits and Systems (MWSCAS), Abu Dhabi, 2016, pp. 1–4. doi: 10.1109/MWSCAS.2016.
7870046
15. Y. Gao, H. Ma, D. Abbott, S.F. Al-Sarawi, PUF sensor: exploiting PUF unreliability for secure wireless sensing. IEEE Trans. Circuits Syst. I Regul. Pap. 64, 2532–2543 (2017)
16. M. Majzoobi, M. Rostami, F. Koushanfar, D. S. Wallach, S. Devadas, in Slender PUF
protocol: a lightweight, robust, and secure authentication by substring matching. 2012 IEEE
Symposium on Security and Privacy Workshops, (2012), pp. 33–44
17. Y. Gao, G. Li, H. Ma, S. F. Al-Sarawi, O. Kavehei, D. Abbott, et al., in Obfuscated
challenge-response: a secure lightweight authentication mechanism for PUF-based pervasive
36. M. Conti, in Secure Wireless Sensor Networks: Threats and Solutions. Springer Publishing
Company, Incorporated, 2015
37. T.F.E. Diehl, in Copy watermark: closing the analog hole. Proceedings of IEEE International
Conference on Consumer Electronics (2003), pp. 52–53
38. H. Ma, Y. Gao, O. Kavehei, D.C. Ranasinghe, in A PUF sensor: securing physical measurements. IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), (Kona, HI, 2017), pp. 648–653
39. K. Rosenfeld, E. Gavas, R. Karri, in Sensor physical unclonable functions. IEEE International Symposium on Hardware-Oriented Security and Trust (HOST), (Anaheim, CA, 2010), pp. 112–117
40. U. Guin, K. Huang, D. DiMase, J.M. Carulli, M. Tehranipoor, Y. Makris, Counterfeit
integrated circuits: a rising threat in the global semiconductor supply chain. Proc. IEEE 102,
1207–1228 (2014)
41. A. Yousra, K. Farinaz, P. Miodrag, Remote activation of ICs for piracy prevention and digital
right management. IEEE/ACM Int. Conf. Comput. Aided Design 2007, 674–677 (2007)
42. P. Tuyls, L. Batina, in RFID-Tags for Anti-counterfeiting, ed. by D. Pointcheval. Proceedings
on Topics in Cryptology—CT-RSA 2006: The Cryptographers’ Track at the RSA Conference
2006, San Jose, CA, USA, February 13–17, 2005 (Berlin: Springer, 2006), pp. 115–131
Appendix A
The design described in this section is based on the arbiter PUF architecture shown in Chap. 2, Fig. 2.8. An exemplar SystemVerilog description of this design is provided in Fig. A.1, followed by the description of the design of the NAND latch in Fig. A.2.
Appendix B

B.1 Introduction
The following script creates an ideal instance of a 16-bit arbiter PUF. This model is constructed using uniformly distributed random numbers as the delay parameters of the switching components.
Several functions are needed to run the above script successfully, as below:
To create multiple instances (e.g. 30) of the design, one can set the parameter inst = 30 in the TopLevel_ArbiterPUF.m script. This can help emulate silicon implementations of this PUF on different devices.
The uniqueness of a PUF is evaluated using the average inter-chip Hamming distance: if chips i and j (i ≠ j) produce n-bit responses Ri(n) and Rj(n) to the same challenge, the average inter-chip HD for k chips is defined as

HD_INTER = [2/(k(k − 1))] × Σ_{i=1..k−1} Σ_{j=i+1..k} HD(Ri(n), Rj(n))/n × 100%
The reliability of a PUF indicates how consistently it reproduces the response to the same challenge under different ambient temperatures and/or supply voltage fluctuations. The Hamming distance (HD) is used to evaluate the reliability performance and is called the 'intra-chip HD'. If a single chip i has an n-bit reference response Ri(n) taken at normal operating conditions (at room temperature using the nominal supply voltage), and the same n-bit response obtained at different conditions is R′i(n) for the same challenge C, the average intra-chip HD for k samples/chips is defined as

HD_INTRA = (1/k) × Σ_{i=1..k} HD(Ri(n), R′i(n))/n × 100%
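Both metrics are straightforward to compute from measured response sets. The sketch below implements the two formulas above; the sample bit strings are illustrative only:

from itertools import combinations

def hd(a, b):
    return sum(x != y for x, y in zip(a, b))

def hd_inter(R):                      # uniqueness over k chips
    k, n = len(R), len(R[0])
    return (2 / (k * (k - 1))) * sum(hd(R[i], R[j]) / n
                                     for i, j in combinations(range(k), 2)) * 100

def hd_intra(R, R_noisy):             # reliability over k chips
    k, n = len(R), len(R[0])
    return (1 / k) * sum(hd(r, rn) / n for r, rn in zip(R, R_noisy)) * 100

R       = ["10010110", "01100101", "11010001"]     # nominal conditions
R_noisy = ["10010111", "01100101", "11010011"]     # elevated temperature
print(f"{hd_inter(R):.1f}% uniqueness, {hd_intra(R, R_noisy):.1f}% intra-chip HD")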
This is an exemplar script which can be used for developing support vector machine
modelling attacks on an arbiter PUF. It uses the feature transformation function
arbiterFT below.
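The MATLAB listing itself appears as a figure in the book and is not reproduced here; the following Python sketch mirrors its role under stated assumptions: a linear additive delay model of an ideal (noise-free) arbiter PUF, the standard parity feature transform, and scikit-learn's LinearSVC as the learner.

import numpy as np
from sklearn.svm import LinearSVC

N, TRAIN, TEST = 16, 4000, 1000
rng = np.random.default_rng(1)
w = rng.standard_normal(N + 1)          # delay-difference weights plus bias

def features(C):
    # Parity transform: phi_i = prod_{j >= i} (1 - 2 c_j), plus a bias term.
    phi = np.cumprod((1 - 2 * C)[:, ::-1], axis=1)[:, ::-1]
    return np.hstack([phi, np.ones((len(C), 1))])

def puf(C):
    # Response is the sign of the accumulated delay difference.
    return (features(C) @ w > 0).astype(int)

C = rng.integers(0, 2, size=(TRAIN + TEST, N))
r = puf(C)
model = LinearSVC(max_iter=20000).fit(features(C[:TRAIN]), r[:TRAIN])
print(f"accuracy: {model.score(features(C[TRAIN:]), r[TRAIN:]):.3f}")  # ~0.99+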
This is an exemplar script which can be used for developing artificial neural network modelling attacks on a PUF. It does not use feature transformation.
Index

C
Countermeasures, 9, 132, 171, 174
Crosstalk noise, 62
Cryptographic keys, 62, 179, 183

E
Electromigration, 54, 58

K
Key Exchange (KE) Schemes, 202, 203

M
Machine learning algorithms, 131, 132, 137, 144, 145, 157, 160, 163, 167, 177, 199, 224