Design and implementation of power and area optimized AES architecture on FPGA for IoT application
Rajasekar P.
Narayana Engineering College, Gudur, Nellore, India, and
Mangalam H.
Department of Electronics and Communication Engineering, Sri Ramakrishna Institute of Technology, Coimbatore, India

Abstract
Purpose – The growing use of handheld devices calls for designs with low power consumption and small area. Besides, information security is gaining enormous importance in information transmission and data storage technology. In addition, today's technology world is connected, communicated and controlled via the Internet of Things (IoT). In many applications, the most standard and widely used cryptography algorithm for providing security is the Advanced Encryption Standard (AES). This paper aims to design an efficient model of AES cryptography for low power and small area.
Design/methodology/approach – First, the main issues related to area and power consumption in the AES encryption core are addressed. To implement an optimized AES core, the authors propose optimized multiplicative inverse, affine transform and Xtime multiplier functions, which are the core functions of AES. In addition, to achieve high throughput, the design uses multistage pipeline and resource reuse architectures for the SBox and MixColumn of AES.
Findings – The results of the optimized AES architecture reveal that multistage pipelining and resource sharing form an optimal design model for Field Programmable Gate Array (FPGA) implementation. The design can provide high security with low power and area for IoT and wireless sensor networks.
Originality/value – The proposed architecture has been implemented in FPGA to calculate the power, area and delay parameters. The multistage pipeline and resource sharing minimize the area and power.
Keywords AES encryption/decryption, Low power Architecture, IoT, FPGA implementation, AES SBox, AES Mixcolumn
Paper type Research paper

1. Introduction

There has been extensive research into the construction of secure Wireless Sensor Networks and Internet of Things (IoT) devices. These network structures are becoming more popular, as they are operated with battery power and can be used in remote locations and risky environments. This draws attention to the power consumption of IoT devices. In recent years, the number of IoT devices has increased rapidly compared to personal computers and mobile devices. According to industry estimates, roughly 50 billion devices have been connected via IoT. Because this huge number of devices is connected via an open network, data could be stolen. Moreover, a device may be controlled or hacked by third parties if it does not have any security algorithm (Huang et al., 2007). Under these circumstances, security plays a vital role in IoT transmission. Besides that, IoT devices have area restrictions and power constraints in the field; hence, they require low power and area optimized designs of security algorithms (Gunathilake et al., 2019).

Cryptography is the art and science of converting readable plain data into unreadable data based on a key, namely, a symmetric key (private key) or an asymmetric key (public key). Several cryptography algorithms have been developed that could be used for protecting highly sensitive data. In the meantime, most of the cryptography algorithms have been broken because of advanced computation and cryptanalysis methods. DES (Data Encryption Standard), once regarded as the most secure algorithm, was broken in June 1997 by the DESCHALL (DES Challenge) project (Curtin, 2005; Verser, 2020). In 1998, the National Institute of Standards and Technology, USA called on researchers to submit a new efficient algorithm that could replace DES. After various scrutiny processes, the Rijndael algorithm was chosen as the best security algorithm and named AES, which replaced DES/Triple DES in October 2000 (FIPS PUB, 2020).

The current issue and full text archive of this journal is available on Emerald Insight at: https://www.emerald.com/insight/0305-6120.htm

The authors would like to acknowledge their colleges Narayana Engineering College, Gudur and Sri Ramakrishna Institute of Technology, Coimbatore for the support.

Circuit World
© Emerald Publishing Limited [ISSN 0305-6120]
[DOI 10.1108/CW-04-2019-0039]

Received 26 April 2019
Revised 20 December 2019
14 March 2020
Accepted 19 May 2020
Area-optimized AES architecture Circuit World
Rajasekar P. and Mangalam H.

Thus, this paper focuses on an optimized AES implementation for IoT devices, which increases the lifetime of IoT devices and resolves the security issues in data transmission (Gunathilake et al., 2019).

Current IoT protocols such as IEEE 802.15.4, LoRaWAN and Sigfox adopt the AES algorithm for secure data transmission. So, this paper discusses an optimized AES encryption architecture for an IoT application. To validate the area and power optimization of the AES-SBox architecture, Xilinx Spartan 3E, Spartan 6, Virtex 6 and Virtex 7 devices are used. Power consumption has been measured by the XPower analyzer tool in the ISE 14.1 design suite. The proposed design has resulted in low power and less area with fewer hardware redundancies.

This paper is organized as follows. Section 2 describes the literature survey. Section 3 explains the AES operation. Section 4 discusses the proposed AES hardware model. Section 5 gives the results and discussion. Section 6 provides the conclusion.

2. Literature survey

AES can be implemented by using various architectural models that optimize power, area and speed. As noted in Farhan et al. (2004), Hamalainen et al. (2005), Good and Benaissa (2005) and Rejeb et al. (2006), memory-based architecture designs take less design time and need a less complex design strategy. However, they occupy more area, consume more power and require more access time. Good and Benaissa (2005), Tillich et al. (2006) and Fischer and Drutarovský (2001) discussed the memory-based design in detail.

Different memory-based designs were discussed using the Lookup Table (LUT) method by Farhan et al. (2004), Hamalainen et al. (2005), Good and Benaissa (2005), Verbauwhede et al. (2003) and Hodjat and Verbauwhede (2004), and using embedded design by Rejeb et al. (2006). Further, basic digital circuits like encoder and decoder structures were discussed by Fischer and Drutarovský (2001), and reconfigurable design by Bertoni et al. (2004) and Zhang et al. (2007), to reduce the complexity of the design.

In addition to that, a data path scheme (Huang et al., 2007), content addressable memory (Fan and Hwang, 2008), prime number-based design (Rais and Qasim, 2009) and Linear Feedback Shift Register (LFSR)-based design (Das, 2014) were discussed to minimize the area and reduce the critical path delay. General architectures can be used to implement the SBox and MCT (Satoh et al., 2001; Verbauwhede et al., 2003).

Besides, composite field arithmetic (Satoh et al., 2001; Yu, 2005) was discussed as a way of reducing the circuit complexity, which yielded a reduction in power. That design took the trade-off between speed and area. High-performance AES has been implemented by Chen et al. (2019) using a fully pipelined and completely unrolled architecture that used the BRAM and distributed RAM model. A MixColumn parallelism architecture has been elaborated by Neelima and Brindha (2018) to reduce the delay of execution. A composite field arithmetic SBox was discussed by Gaded and Deshpande (2019) to improve the area and delay performance. Composite field arithmetic AES SBox, pipelined SBox, direct compute SBox and LFSR-based SBox were discussed by Wong et al. (2018). Combined AES encryption and decryption were discussed by Rao and Sharma (2017) to reduce the area of the crypto core.

The literature survey has revealed that memory-based architecture and combinational logic models consume more power and occupy more area, whereas a composite field model based on Galois field arithmetic consumes very little power. Hence, the proposed AES architecture model has been implemented by using the composite field with modified pipeline stages and resource sharing.

3. Advanced Encryption Standard encryption & decryption

AES is a private key cipher algorithm that operates on 128-bit blocks and supports key lengths of 128, 192 and 256 bits, with 10, 12 and 14 rounds, respectively. Each round is divided into four functional modules: AddRoundKey (ARK), SubByte Transformation (SBOX), MixColumn Transformation (MCT) and ShiftRow (SR). During decryption, AddRoundKey, Inverse SubByte transformation, Inverse MixColumn transformation and Inverse ShiftRow are used. The flow diagram of AES encryption and decryption is shown in Figure 1. For both operations, key expansion units are used to generate different keys for different rounds (Stallings, 2017; FIPS PUB, 2020).

Figure 1 Flow diagram of AES encryption and decryption

SBox and MixColumn are the most power-consuming modules because of their nonlinear transformation and matrix multiplication. To perform the SBox transformation, an 8-bit input is mapped into another 8-bit value based on the affine transform and multiplicative inverse model. In MixColumn, the 4 × 4 byte matrix input is multiplied by the constant multipliers 01, 01, 02 and 03 for forward MixColumn and by 09, 0B, 0D and 0E for inverse MixColumn. ShiftRow and AddRoundKey are the additional operations that shuffle the data. The ShiftRow operation is carried out row by row, by changing the position of each byte in each row, and AddRoundKey applies the Exclusive OR (XOR) operation to the MixColumn output and the output of the key expansion unit.

For each round of operation, the key expansion unit separately generates the key value by using the SubWord(), RotWord() and Rcon(i) functions, where i is the round number. In the key expansion unit, the pre-assigned Rcon(i) unit computes some intermediate results, whereas the SubWord() operation is the same as SBox and RotWord() uses a cyclic permutation (FIPS PUB, 2020; Stallings, 2017). To strengthen the security, different key lengths and different modes of operation can be adopted. The choice of AES mode is based on complexity, authentication factors and implementation issues. AES can be implemented with modes ranging from the simple Electronic Code Book (ECB) to the improved security Counter mode (CTR). Some other modes are CBC, CFB and OFB, which enhance the security and minimize the cryptanalysis attack (Stallings, 2017). In ECB, the whole input plaintext is divided into 128-bit data blocks and the operation is carried out separately on each block. It does not have any data dependence between consecutive blocks, so the blocks can be processed in parallel. There is no error propagation to the subsequent block of data (FIPS PUB, 2020; Stallings, 2017).

To implement other modes of operation, such as CBC, CFB and OFB, an Initial Vector (IV) is used as input to the encryption of block m0, then the cipher block c0 of m0 becomes the IV to the encryption of the m1 block and so on. This method suffers throughput degradation because of the data dependency between consecutive blocks.

The proposed AES cipher supports the common ECB mode, which restricts the error propagation from one block to another. Since it has error restriction, it can be used in financial transaction-based applications over the channel.

4. Proposed hardware implementation of Advanced Encryption Standard

A combined AES encoder and decoder core architecture is shown in Figure 2, which is used to implement the overall crypto processor. As mentioned earlier, the proposed design mainly focuses on power reduction as well as area minimization. For this, a modified SBox and MixColumn are used in encryption, and inverse SBox and inverse MixColumn in decryption.

Figure 2 AES encryption and decryption core module

The operation of AES encryption or decryption is controlled by the single-bit mode selection key, Encr/Decr. The operation is selected by assigning "0" for encryption and "1" for decryption. Meanwhile, the key expansion module generates the keys for 10 rounds, which can be used for both encryption and decryption. To minimize the power and area in the proposed AES architecture, a modified SBox and MixColumn are used that reduce the nonlinear power dissipation by adopting resource sharing and multistage pipeline techniques. This paper analyses the 128-bit AES encryption in ECB mode. In an IoT device, this modified AES structure can be implemented at the security layer. Figure 3 shows the flow diagram of ECB mode encryption. The subsequent sections elaborate on the proposed architecture modules of SBox and MixColumn.

Figure 3 Flow diagram of ECB mode encryption

4.1 Proposed SBox and inverse SBox architecture

The non-linear operation of the SBox operates independently on each byte of the state using the SBox table. The SBox can be constructed using either a lookup table or composite field arithmetic. As mentioned earlier, the composite field is the optimized model. Hence, in the proposed AES-ECB, the composite field arithmetic method is used for the affine and multiplicative inverse functions. In composite fields, the bits of a byte are represented as the coefficients of a polynomial reduced modulo an irreducible polynomial of degree 8. For example, the bit pattern {10001011}2 is represented as the polynomial q^7 + q^3 + q + 1 in Galois Field GF(2^8). To perform the simplification that adopts resource reuse as well as resource sharing, the given polynomial is sliced into the Most Significant Bit (MSB) nibble and the Least Significant Bit (LSB) nibble. Thus, an element in GF(2^8) may be represented as (bx + c), where b is the MSB nibble and c is the LSB nibble. Hence, the multiplicative inverse can be constructed using equation (1), which is discussed in Verbauwhede et al. (2003), Chodowiec and Gaj (2003) and Rijmen (2000), by substituting A = 1 and B = λ. The multiplicative inverse is sliced into MSB and LSB, and the SBox is implemented using the inner round pipeline and resource sharing as shown in Figure 4. These methods minimize the power as well as the area by utilizing resource sharing (Shanthini et al., 2014). Here, some intermediate results have been reused in succeeding stages:

$$(bx + c)^{-1} = b\,(b^{2}B + bcA + c^{2})^{-1}\,x + (c + bA)\,(b^{2}B + bcA + c^{2})^{-1} \tag{1}$$

where A = 1, B = λ so that equation (1) becomes:

$$(bx + c)^{-1} = b\,(b^{2}\lambda + bc + c^{2})^{-1}\,x + (c + b)\,(b^{2}\lambda + bc + c^{2})^{-1} \tag{2}$$

Affine and inverse affine transforms are used before and after the multiplicative inverse operation of the SBox. This transformation is needed because, to perform the composite field arithmetic, the input data has to be properly mapped into the field; the affine and inverse affine transforms are therefore introduced in the multiplicative inverse model. Besides, to reduce the complexity as well as power and area, the iterative method has been used (Rajasekar and Mangalam, 2016a, 2016b):

$$AT = \begin{bmatrix} 1&0&0&0&1&1&1&1\\ 1&1&0&0&0&1&1&1\\ 1&1&1&0&0&0&1&1\\ 1&1&1&1&0&0&0&1\\ 1&1&1&1&1&0&0&0\\ 0&1&1&1&1&1&0&0\\ 0&0&1&1&1&1&1&0\\ 0&0&0&1&1&1&1&1 \end{bmatrix} \begin{bmatrix} b_0\\ b_1\\ b_2\\ b_3\\ b_4\\ b_5\\ b_6\\ b_7 \end{bmatrix} \oplus \begin{bmatrix} 1\\ 1\\ 0\\ 0\\ 0\\ 1\\ 1\\ 0 \end{bmatrix} \tag{3}$$

The existing affine transform is shown in equation (3) and the simplified model of the proposed affine transform in equation (4).

$$AT = b_i \oplus b_{(i+4) \bmod 8} \oplus b_{(i+5) \bmod 8} \oplus b_{(i+6) \bmod 8} \oplus b_{(i+7) \bmod 8} \oplus d_i \tag{4}$$

where d = {01100011}, i = 0 to 7, and b is the input of the SBox.

Similarly, the inverse affine transform of the existing model and of the simplified model are shown in equations (5) and (6), where d = {00000101} for the inverse transform:

$$AT^{-1} = \begin{bmatrix} 0&0&1&0&0&1&0&1\\ 1&0&0&1&0&0&1&0\\ 0&1&0&0&1&0&0&1\\ 1&0&1&0&0&1&0&0\\ 0&1&0&1&0&0&1&0\\ 0&0&1&0&1&0&0&1\\ 1&0&0&1&0&1&0&0\\ 0&1&0&0&1&0&1&0 \end{bmatrix} \begin{bmatrix} b_0\\ b_1\\ b_2\\ b_3\\ b_4\\ b_5\\ b_6\\ b_7 \end{bmatrix} \oplus \begin{bmatrix} 1\\ 0\\ 1\\ 0\\ 0\\ 0\\ 0\\ 0 \end{bmatrix} \tag{5}$$

$$AT^{-1} = b_{(i+2) \bmod 8} \oplus b_{(i+5) \bmod 8} \oplus b_{(i+7) \bmod 8} \oplus d_i \tag{6}$$

After the affine transformation, the isomorphic function is used as the mapping model of GF(2^8). The elements of GF(2^8) are mapped into the composite field representation by using an isomorphic function (δ). After performing the multiplicative inversion, the result must be mapped back to its equivalent in GF(2^8) by using the inverse isomorphic function δ^{-1}. Let "b" be an element in GF(2^8); the isomorphic function can be represented as an 8 × 8 matrix, where b7 is the most significant bit and b0 is the least significant bit. The isomorphic function δ in existing methods is given by equation (7) (Rashidi and Rashidi, 2013; Rajasekar and Mangalam, 2015); the proposed method uses the form of δ given by equation (8).

$$\delta = \begin{pmatrix} b_7 \oplus b_5\\ b_7 \oplus b_6 \oplus b_4 \oplus b_3 \oplus b_2 \oplus b_1\\ b_7 \oplus b_5 \oplus b_3 \oplus b_2\\ b_7 \oplus b_5 \oplus b_3 \oplus b_2 \oplus b_1\\ b_7 \oplus b_6 \oplus b_2 \oplus b_1\\ b_7 \oplus b_4 \oplus b_3 \oplus b_2 \oplus b_1\\ b_6 \oplus b_4 \oplus b_3\\ b_6 \oplus b_4 \oplus b_0 \end{pmatrix} \tag{7}$$
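The bitwise affine forms of equations (4) and (6) can be checked in software. The following is a minimal Python sketch, not the authors' hardware description: it assumes the standard AES field polynomial x^8 + x^4 + x^3 + x + 1 and the FIPS-197 inverse constant d = {00000101}, and it computes the multiplicative inverse by plain exponentiation rather than by the composite field method. All function names are illustrative.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:          # degree 8 reached: reduce
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse as a^254; 0 maps to 0 by AES convention."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result if a else 0


def affine(b: int, d: int = 0x63) -> int:
    """Equation (4): out_i = b_i ^ b_(i+4) ^ b_(i+5) ^ b_(i+6) ^ b_(i+7) ^ d_i."""
    out = 0
    for i in range(8):
        bit = (b >> i) & 1
        for offset in (4, 5, 6, 7):
            bit ^= (b >> ((i + offset) % 8)) & 1
        bit ^= (d >> i) & 1
        out |= bit << i
    return out


def inv_affine(s: int, d: int = 0x05) -> int:
    """Equation (6): out_i = s_(i+2) ^ s_(i+5) ^ s_(i+7) ^ d_i (d assumed 0x05)."""
    out = 0
    for i in range(8):
        bit = (d >> i) & 1
        for offset in (2, 5, 7):
            bit ^= (s >> ((i + offset) % 8)) & 1
        out |= bit << i
    return out


def sbox(x: int) -> int:
    """Forward SBox: inverse in GF(2^8), then the affine transform."""
    return affine(gf_inv(x))


def inv_sbox(y: int) -> int:
    """Inverse SBox: inverse affine transform, then inverse in GF(2^8)."""
    return gf_inv(inv_affine(y))
```

Under these assumptions, `sbox(0x53)` should reproduce the FIPS-197 example value `0xED`, and `inv_sbox(sbox(x))` should return `x` for every byte.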

Figure 4 SBox implementation using inner round pipeline and resource sharing (stages S1 to S4: isomorphic mapping, GF operations on the MSB and LSB nibbles with composite arithmetic, inverse isomorphic mapping, affine transform)
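The composite field inversion of equation (2) can be verified exhaustively in a few lines. The sketch below is illustrative only: the GF(2^4) polynomial x^4 + x + 1 and the way λ is chosen (the first value for which x^2 + x + λ has no root in GF(2^4)) are assumptions, since the text does not state which field polynomials the hardware uses. An element is written as the pair (b, c), standing for bx + c.

```python
def gf16_mul(a: int, b: int) -> int:
    """Multiply two nibbles in GF(2^4) modulo x^4 + x + 1 (assumed polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x10:           # degree 4 reached: reduce
            a ^= 0x13
        b >>= 1
    return result


def gf16_inv(a: int) -> int:
    """Inverse in GF(2^4) as a^14; 0 maps to 0."""
    result = 1
    for _ in range(14):
        result = gf16_mul(result, a)
    return result if a else 0


# Assumed choice of lambda: x^2 + x + lambda must be irreducible over GF(2^4),
# i.e. t^2 + t + lambda is non-zero for every nibble t.
LAMBDA = next(
    lam for lam in range(1, 16)
    if all(gf16_mul(t, t) ^ t ^ lam for t in range(16))
)


def comp_mul(p, q):
    """(b x + c)(d x + e) reduced with x^2 = x + LAMBDA."""
    b, c = p
    d, e = q
    bd = gf16_mul(b, d)
    high = bd ^ gf16_mul(b, e) ^ gf16_mul(c, d)
    low = gf16_mul(bd, LAMBDA) ^ gf16_mul(c, e)
    return high, low


def comp_inv(p):
    """Equation (2): the term (b^2*lambda + b*c + c^2)^-1 is computed once
    and shared by both output nibbles, which is the resource reuse idea."""
    b, c = p
    delta = gf16_mul(gf16_mul(b, b), LAMBDA) ^ gf16_mul(b, c) ^ gf16_mul(c, c)
    delta_inv = gf16_inv(delta)
    return gf16_mul(b, delta_inv), gf16_mul(b ^ c, delta_inv)
```

Multiplying any non-zero pair by its `comp_inv` result with `comp_mul` should give `(0, 1)`, the composite field representation of 1.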
$$\delta = \begin{pmatrix} ism_1\\ ism_2 \oplus ism_3 \oplus b_4 \oplus b_3\\ ism_1 \oplus b_3 \oplus b_2\\ ism_1 \oplus b_3 \oplus ism_3\\ ism_2 \oplus ism_3\\ b_7 \oplus b_4 \oplus b_3 \oplus ism_3\\ ism_4 \oplus b_3\\ ism_4 \oplus b_0 \end{pmatrix} \tag{8}$$

The modified isomorphic and inverse isomorphic functions use the partial products in a recursive model, which reuses the multiplication module to minimize the number of XOR operations. Here ism_i denotes the intermediate results that are reused in the recursive model. Matching equation (8) against equation (7) row by row implies ism_1 = b7 ⊕ b5, ism_2 = b7 ⊕ b6, ism_3 = b2 ⊕ b1 and ism_4 = b6 ⊕ b4.

As explained earlier, the SBox module function is redesigned in the proposed architecture, which leads to a reduction in area as well as power because of resource sharing, pipelining and iterative operation. The power, area and delay of the proposed design model have been compared with those of various other architecture models. The detailed discussion is in Section 5.

4.2 Mixcolumn operation and inverse MixColumn architecture

In GF(2^8), addition is the XOR operation, so the MixColumn matrix multiplication reduces to constant multiplications combined by XOR. Generally, the left shift operation is used for multiplication by 2, which consumes more area and power in combinational circuit logic. As an alternative, Xtime multiplication is introduced to reduce the computation of multiplication by 2. This method uses four XOR gates to perform the GF Xtime multiplication. Further, to minimize the number of gates, a modified GF operation is proposed and implemented by using only three XOR gates. The logic diagram of the Xtime02 multiplier is shown in Figure 5.

Figure 5 Xtime 02 multiplier by using the GF operation (inputs b0 to b7, outputs b'0 to b'7)

$$\begin{bmatrix} w_{c0}\\ w_{c1}\\ w_{c2}\\ w_{c3} \end{bmatrix} = \begin{bmatrix} 02&03&01&01\\ 01&02&03&01\\ 01&01&02&03\\ 03&01&01&02 \end{bmatrix} \begin{bmatrix} x_{c0}\\ x_{c1}\\ x_{c2}\\ x_{c3} \end{bmatrix} \tag{9}$$

Equation (9) is the general format of the forward MixColumn operation in matrix form, where:

$$w_{c0} = 02 \cdot x_{c0} \oplus 03 \cdot x_{c1} \oplus 01 \cdot x_{c2} \oplus 01 \cdot x_{c3} = 02 \cdot x_{c0} \oplus (02 \cdot x_{c1} \oplus 01 \cdot x_{c1}) \oplus 01 \cdot x_{c2} \oplus 01 \cdot x_{c3} \tag{10}$$

The simplified format of the resource reuse equation is as follows:

$$w_{c0} = 02 \cdot [x_{c0} \oplus x_{c1}] \oplus x_{c0} \oplus tmp \tag{11}$$

where tmp is x_c1 ⊕ x_c2 ⊕ x_c3 ⊕ x_c0, and "⊕" is the XOR operation. The overall forward MCT operation is shown in Figure 6. Similarly, w_c1, w_c2 and w_c3 can be obtained by using equation (11). To minimize resource utilization and power consumption, the design groups common operations and applies a resource sharing method.

Figure 6 Forward Mixcolumn transformation using simplified function (inputs Xc0 to Xc3, 8 bits each; four 02 multipliers; outputs Wc0 to Wc3)

Further, to optimize the inverse MixColumn, let us consider the general equation (12):

$$\begin{bmatrix} w'_{c0}\\ w'_{c1}\\ w'_{c2}\\ w'_{c3} \end{bmatrix} = \begin{bmatrix} 0E&0B&0D&09\\ 09&0E&0B&0D\\ 0D&09&0E&0B\\ 0B&0D&09&0E \end{bmatrix} \begin{bmatrix} x_{c0}\\ x_{c1}\\ x_{c2}\\ x_{c3} \end{bmatrix} \tag{12}$$

$$w'_{c0} = 0E \cdot x_{c0} \oplus 0B \cdot x_{c1} \oplus 0D \cdot x_{c2} \oplus 09 \cdot x_{c3} \tag{13}$$

Here 0E, 0B, 0D and 09 are constant coefficients in hexadecimal format; the 8-bit data w_c1, w_c2, w_c3, w_c4 are generally called state values. The operation uses multiplication and modulo 2 (XOR) addition.

The expanded operation of equation (13) is as follows:

$$w'_{c0} = 08 \cdot (x_{c1} \oplus x_{c2} \oplus x_{c3} \oplus x_{c0}) \oplus 04 \cdot (x_{c0} \oplus x_{c2}) \oplus 02 \cdot (x_{c0} \oplus x_{c1}) \oplus x_{c0} \oplus x_{c2} \oplus x_{c3} \oplus x_{c1} \oplus x_{c0} \tag{14}$$
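The resource reuse forms of equations (11) and (14) can be checked against the plain matrix products of equations (9) and (12). The following Python sketch is a software model only, assuming the standard AES reduction polynomial 0x11B; it says nothing about gate count, and the function names are illustrative. The inverse transform uses the grouping op1 = x_i ⊕ x_(i+1) ⊕ x_(i+2) ⊕ x_(i+3), op2 = x_i ⊕ x_(i+2), op3 = x_i ⊕ x_(i+1), which for i = 0 is exactly the grouping visible in equation (14).

```python
def xtime(b: int) -> int:
    """Multiply a byte by 02 in GF(2^8): shift left, reduce by 0x11B on overflow."""
    b <<= 1
    return b ^ 0x11B if b & 0x100 else b


def gf_mul(a: int, k: int) -> int:
    """Reference multiply by a constant k, built from xtime, used for checking."""
    result = 0
    while k:
        if k & 1:
            result ^= a
        a = xtime(a)
        k >>= 1
    return result


def mix_column(x):
    """Forward MixColumn via equation (11): w_i = 02*(x_i ^ x_(i+1)) ^ x_i ^ tmp."""
    tmp = x[0] ^ x[1] ^ x[2] ^ x[3]              # shared by all four outputs
    return [xtime(x[i] ^ x[(i + 1) % 4]) ^ x[i] ^ tmp for i in range(4)]


def inv_mix_column(x):
    """Inverse MixColumn using only xtime and XOR, grouped as in equation (14)."""
    op1 = x[0] ^ x[1] ^ x[2] ^ x[3]              # same for every output byte
    op1_x8 = xtime(xtime(xtime(op1)))            # 08 * op1, computed once
    out = []
    for i in range(4):
        op2 = x[i] ^ x[(i + 2) % 4]
        op3 = x[i] ^ x[(i + 1) % 4]
        out.append(op1_x8 ^ xtime(xtime(op2)) ^ xtime(op3) ^ op1 ^ x[i])
    return out


def matrix_column(first_row, x):
    """Direct circulant matrix product, as in equations (9) and (12)."""
    return [
        gf_mul(x[0], first_row[(0 - i) % 4]) ^ gf_mul(x[1], first_row[(1 - i) % 4])
        ^ gf_mul(x[2], first_row[(2 - i) % 4]) ^ gf_mul(x[3], first_row[(3 - i) % 4])
        for i in range(4)
    ]
```

For the widely used test column `[0xDB, 0x13, 0x53, 0x45]`, `mix_column` should return `[0x8E, 0x4D, 0xA1, 0xBC]`, and `inv_mix_column` should map that result back to the input. `matrix_column([2, 3, 1, 1], x)` and `matrix_column([0x0E, 0x0B, 0x0D, 0x09], x)` give the unoptimized products for comparison.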

In general,

$$w'_{ci} = 08 \cdot (op_1) \oplus 04 \cdot (op_2) \oplus 02 \cdot (op_3) \oplus op_1 \oplus x_{ci} \tag{15}$$

$$op_1 = x_{ci} \oplus x_{c(i+1)} \oplus x_{c(i+2)} \oplus x_{c(i+3)}, \quad \text{indices taken mod } 4 \tag{15a}$$

$$op_2 = x_{ci} \oplus x_{c(i+2)} \tag{15b}$$

$$op_3 = x_{ci} \oplus x_{c(i+1)} \tag{15c}$$

where i = 0, 1, 2, 3.

Equations (15a), (15b) and (15c) are used in the iterative module architecture, and the overall logic structure is shown in Figure 7.

Figure 7 Inverse MixColumn transformation function for Wc0 cell (inputs xc3 to xc0; 08, 04 and 02 Xtime multiplication stages; output wc0)

5. Results and discussion

The results of the proposed AES architectures in various Field Programmable Gate Array (FPGA) devices are discussed in Sections 5.1 and 5.2, respectively.

5.1 Proposed SBox design – delay and power analysis

Table 1 compares the power and delay of various existing architectures and the proposed architecture on the Spartan 3E device. Spartan 3E is chosen only for comparing the effectiveness of the proposed architecture against conventional logic architectures. These results demonstrate the effectiveness of the proposed SBox in terms of area, power and delay.

Table 1 shows that, compared to the non-pipeline architecture, there is a maximum power saving of 39.22% and a delay reduction of 68.41% in the proposed design. The reduction in power and delay is because of the replacement of the combinational logic SBox with the composite field SBox. With these advantages, the proposed architecture uses a four-stage pipeline with a resource sharing and resource reuse model. This technique has been used in AES ECB mode for further analysis.

5.2 Proposed Advanced Encryption Standard Electronic Code Book mode

5.2.1 Simulation results

The proposed 128-bit AES ECB mode architecture is simulated using the Xilinx ISE 14.1 tool and synthesised for the Xilinx Spartan 6 xc6slx150 fgg900-1 device as well as a Virtex 7 device to compare with existing models. The parameters power, area, LUT, delay and operating frequency are measured and compared with reference designs. The detailed functional verification is done by simulation and verified by using the waveforms in Figure 8 and Figure 9. The XPower analysis results of the proposed ECB mode encryption and decryption are shown in Figure 10.

5.2.2 Slice and logic cell utilization analysis

The utilization of slice registers and logic cells of the proposed design in the Spartan device was compared with the values of different reference architectures. Table 2 shows the utilization of slice registers and logic cells in the Spartan device xc6slx150 fgg900-1. It can be observed that the percentage of slice register minimization varies from 45.63% to 90.98%. This has been achieved with the help of the resource sharing and iteration model architecture. Similarly, the logic cell utilization has been minimized by 8.77% to 80.19%. Table 3 lists the area, power,

Table 1 Comparison of power and delay of the different architecture models of SBox

| Type of the architecture | Type of functional operation | Dynamic power (W) | % of power reduction | Delay (ns) | % delay reduction |
|---|---|---|---|---|---|
| Non pipeline | Combinational logic SBox | 8.278 | 39.22 | 19.866 | 68.41 |
| | Composite field SBox | 8.278 | 39.22 | 18.986 | 66.95 |
| | Operand based logic SBox | 8.277 | 39.22 | 18.863 | 66.73 |
| | Multiplexer logic SBox | 8.278 | 39.22 | 18.366 | 65.83 |
| Two stages pipeline | Combinational logic SBox | 5.072 | 0.81 | 15.760 | 60.18 |
| | Composite field SBox | 5.076 | 0.89 | 14.412 | 56.46 |
| | Operand based logic SBox | 5.160 | 2.50 | 14.627 | 57.10 |
| | Multiplexer logic SBox | 5.012 | 0.38 | 18.608 | 66.28 |
| Four stages pipeline | Composite field SBox | 5.064 | 0.65 | 6.275 | 0.00 |
| | Operand based logic SBox | 5.360 | 6.14 | 6.275 | 0.00 |
| | Multiplexer logic SBox | 5.061 | 0.59 | 6.318 | 0.68 |
| Proposed design | Four Stage with Operand reuse and Iterative | 5.031 | – | 6.275 | – |
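The percentage columns of Table 1 appear to be relative savings of the proposed design against each reference row. A minimal Python sketch of that arithmetic, assuming the formula (reference − proposed) / reference, which the text does not state explicitly:

```python
def percent_reduction(reference: float, proposed: float) -> float:
    """Relative saving of the proposed value against a reference value, in percent."""
    return (reference - proposed) / reference * 100.0


# Table 1 values: non-pipelined combinational logic SBox versus the proposed design
power_saving = percent_reduction(8.278, 5.031)    # dynamic power in W
delay_saving = percent_reduction(19.866, 6.275)   # delay in ns
print(f"power saving: {power_saving:.2f} %, delay saving: {delay_saving:.2f} %")
```

With these inputs the assumed formula reproduces the 39.22% power saving and 68.41% delay reduction quoted for the non-pipeline architecture.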

Figure 8 AES Encryption simulation waveform for result verification

Figure 9 AES ECB mode Encryption operation verified by simulation waveform

Figure 10 Power analysis report using Xpower analyser

operating frequency and throughput of different reference and proposed designs on the Virtex 6 device. From Table 3, it can be observed that the area and power were considerably reduced, but at the expense of throughput and operating frequency. Table 4 shows the detailed comparison of slice logic utilisation in the proposed design and the reference design (Srinivas and Akramuddin, 2016). It compares logic elements, FF usage, LUTs, buffer control and input/output pins. From Table 4, it is observed that the proposed architecture model has achieved a large reduction of area in all resources except the bonded IOBs. Table 5 shows the slice logic distribution in the proposed and reference designs. It indicates that the area and

Table 2 Utilization of slice register and logic cell – Spartan device xc6slx150 fgg900-1

| Ref design | No of slice register | % of slice register minimized | No of logic cell | % of logic minimized |
|---|---|---|---|---|
| Hodjat and Verbauwhede (2004) | 5177 | 76.3 | 13352 | 79.89 |
| Saggese et al. (2003) | 5810 | 79.23 | 10799 | 72.74 |
| Arrag et al. (2013) | 2432 | 50.4 | 13552 | 80.19 |
| McLoone and McCanny (2001) | 2432 | 45.63 | – | – |
| Chodowiec and Gaj (2003) | 12600 | 90.98 | – | – |
| Thulasimani and Madheswaran (2010) | – | – | 2943 | 8.77 |
| Proposed design | 1207 | – | 2685 | – |

Table 3 Performance comparison of AES on different architectures – Virtex-6 XC6VLX240T device

| Design | Device | No of Slice | BRAM | Throughput (Gbps) | Frequency (MHz) | Power (W) | Throughput/Slice (Mbps/slice) |
|---|---|---|---|---|---|---|---|
| Chellam and Natarajan (2018) | Virtex-6 XC6VLX240T | 2537 | 0 | 94.81 | 740.7 | – | 37.37 |
| Wang and Ha (2013) | Virtex-6 XC6VLX240T | 9071 | 400 | 40.86 | 319.29 | 11.02 (3.67) | 4.51 |
| Oukili and Bri (2017) | Virtex-6 XC6VLX240T | 5759 | 0 | 108.69 | 849.18 | | 17.08 |
| Shanthi Rekha and Saravanan (2019) | Virtex-6 XC6VLX240T | 1626 | 0 | 0.24 | 166.66 | 3.87 (3.43) | 0.15 |
| Proposed design | Virtex-6 XC6VLX240T | 1551 | 0 | 0.56 | 190.658 | 0.815 (0.722) | 0.361 |

Note:  – leakage power
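The last column of Table 3 is an efficiency figure. As a hedged illustration, the sketch below assumes it is simply the throughput converted to Mbps and divided by the slice count; this reading is inferred from the column unit and is not stated in the text.

```python
def throughput_per_slice(throughput_gbps: float, slices: int) -> float:
    """Efficiency in Mbps per slice (1 Gbps taken as 1000 Mbps)."""
    return throughput_gbps * 1000.0 / slices


# Two rows of Table 3
proposed = throughput_per_slice(0.56, 1551)     # proposed design
reference = throughput_per_slice(94.81, 2537)   # Chellam and Natarajan (2018)
print(f"proposed: {proposed:.3f} Mbps/slice, reference: {reference:.2f} Mbps/slice")
```

Under this assumption the two rows evaluate to about 0.361 and 37.37 Mbps/slice, matching the tabulated values and making the trade-off explicit: the proposed design gives up throughput per slice in exchange for lower power.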

Table 4 Slice logic utilization (Virtex 7 device): AES 128 bit

| Parameters | Ref (Srinivas and Akramuddin, 2016) | Proposed | % of Reduction |
|---|---|---|---|
| Number of Slice Registers | 3760 | 1150 | 69.41 |
| Number of Slice LUTs | 10773 | 3481 | 67.68 |
| Number of fully used LUT-FF pairs | 1465 | 577 | 60.61 |
| Number of bonded IOBs | 385 | 391 | -1.5 |
| Number of BUFG/BUFGCTRLs | 11 | 01 | 90.9 |
| Number of LUT Flip Flop pairs used | 13068 | 3784 | 71.04 |

Table 5 Slice logic distribution (Virtex 7 device)

| Parameters | Srinivas and Akramuddin (2016): No of cell | Srinivas and Akramuddin (2016): % used | Proposed: No of cell | Proposed: % used |
|---|---|---|---|---|
| Number of occupied Slices | 1,624 out of 108,300 | 1 | 1,624 out of 108,300 | 1 |
| Number of LUT Flip Flop pairs used | 13,068 | | 3,784 | |
| Number with an unused Flip Flop | 9,308 out of 13,068 | 71 | 2,644 out of 3,784 | 69 |
| Number with an unused LUT | 2,295 out of 13,068 | 18 | 303 out of 3,784 | 8 |
| Number of fully used LUT-FF pairs | 1,465 out of 13,068 | 11 | 837 out of 3,784 | 22 |

cell utilization have been dramatically reduced. This reduction also lowers the power consumption of the AES algorithm, so that it can be used in handheld devices.

5.2.3 Timing analysis

Timing parameters include the delay of input and output data as well as the minimum time period required to compute the calculation. The setup time and hold time are also compared to validate the output parameters. Table 6 and Table 7 show the comparison of various clock parameters.

Thus, the proposed design model occupies a minimum number of slice registers because of the resource sharing and resource reuse architecture model. In the SBox design model, the resource reuse and the pipeline concept have led to a reduction in

Table 6 Timing summary

| Parameters | Ref design (Srinivas and Akramuddin, 2016) | Proposed Design | % of Reduction |
|---|---|---|---|
| Minimum time period | 4.806 ns | 5.245 ns | -8.3 |
| Maximum frequency | 208.073 MHz | 190.658 MHz | -8.3 |
| Minimum input arrival time before the clock | 4.623 ns | 1.601 ns | 65.36 |
| Maximum output time required after clock | 4.458 ns | 4.4658 ns | 0.16 |
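The first two rows of Table 6 are linked by the usual reciprocal relation between minimum clock period and maximum clock frequency. A short worked check with the tabulated periods, added here as an illustration:

```latex
f_{\max} = \frac{1}{T_{\min}}, \qquad
\frac{1}{5.245\ \text{ns}} \approx 190.658\ \text{MHz} \ \text{(proposed)}, \qquad
\frac{1}{4.806\ \text{ns}} \approx 208.073\ \text{MHz} \ \text{(reference)}
```

Both results agree with the maximum frequency row, so the two rows carry the same information in different units.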

Table 7 Timing constraints for AES 128 encryption

| Parameter | Worst case slack: Ref (Srinivas and Akramuddin, 2016) design | Worst case slack: Proposed design | Best case achievable: Ref (Srinivas and Akramuddin, 2016) design | Best case achievable: Proposed design |
|---|---|---|---|---|
| Autotimespec constraint for clock net CLK_BUFGP | | | | |
| Setup time | – | – | 4.846 ns | 5.061 ns |
| Hold time | 0.026 ns | 0.047 ns | | |

resource utilization. MCT, another power and area consuming block, has been implemented with a modified Xtime multiplier unit. This has reduced the total area utilization and power consumption. The performance of the proposed design architecture is enhanced compared to the reference designs.

6. Conclusion

A low cost multistage pipeline AES-ECB architecture with resource sharing is proposed to achieve optimized area, power and delay. The proposed model utilizes resource sharing and resource reuse concepts to achieve a reduction in area and power. The proposed architecture model works at a nominal operating frequency of 190.658 MHz. The power consumption of the proposed design is very low, and hence it can be adopted for IoT devices.

References

Arrag, S., Hamdoun, A., Tragha, A. and Khamlich, S.E. (2013), "Implementation of stronger AES by using dynamic s-box dependent of master key", Journal of Theoretical & Applied Information Technology, Vol. 53 No. 2.
Bertoni, G., Macchetti, M., Negri, L. and Fragneto, P. (2004), "Power-efficient ASIC synthesis of cryptographic sboxes", Proceedings of the 14th ACM Great Lakes symposium on VLSI, ACM, pp. 277-281.
Chellam, M.B. and Natarajan, R. (2018), "AES hardware accelerator on FPGA with improved throughput and resource efficiency", Arabian Journal for Science and Engineering, Vol. 43 No. 12, pp. 6873-6890.
Chen, S., Hu, W. and Li, Z. (2019), "High performance data encryption with AES implementation on FPGA", 5th International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE, pp. 149-153.
Chodowiec, P. and Gaj, K. (2003), "Very compact FPGA implementation of the AES algorithm", International Workshop on Cryptographic Hardware and Embedded Systems, Springer, pp. 319-333.
Curtin, M. (2005), "Organizing DESCHALL", Brute Force: Cracking the Data Encryption Standard, New York: Springer.
Das, S. (2014), "Halka: a lightweight, software friendly block cipher using ultra-lightweight 8-bit S-box", IACR Cryptology ePrint Archive, 2014, p. 110.
Fan, C.-P. and Hwang, J.-K. (2008), "FPGA implementations of high throughput sequential and fully pipelined AES algorithm", International Journal of Electrical Engineering, Vol. 15, pp. 447-455.
Farhan, S.M., Khan, S.A. and Jamal, H. (2004), "Mapping of high-bit algorithm to low-bit for optimized hardware implementation", Proceedings of the 16th International Conference on Microelectronics, ICM 2004, IEEE, pp. 148-151.
Fips Pub, N.I. (2020), "197: Advanced encryption standard (AES)", Federal information processing standards publication 197.441, (2001): 0311.
Fischer, V. and Drutarovský, M. (2001), "Two methods of Rijndael implementation in reconfigurable hardware", International Workshop on Cryptographic Hardware and Embedded Systems, Springer, pp. 77-92.
Gaded, S.H. and Deshpande, A. (2019), "Composite field arithematic based S-Box for AES algorithm", 3rd International conference on Electronics, Communication and Aerospace Technology (ICECA), IEEE, pp. 1209-1213.
Good, T. and Benaissa, M. (2005), "AES on FPGA from the fastest to the smallest", International Workshop on Cryptographic Hardware and Embedded Systems, Springer, pp. 427-440.
Gunathilake, N.A., Buchanan, W.J. and Asif, R. (2019), "Next generation lightweight cryptography for smart IoT devices: implementation, challenges and applications", IEEE 5th World Forum on Internet of Things (WF-IoT), IEEE, pp. 707-710.
Hamalainen, P., Hannikainen, M. and Hamalainen, T. (2005), "Efficient hardware implementation of security processing for IEEE 802.15.4 wireless networks", 48th Midwest Symposium on Circuits and Systems, 2005, IEEE, pp. 484-487.
Hodjat, A. and Verbauwhede, I. (2004), "A 21.54 Gbits/s fully pipelined AES processor on FPGA", 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, IEEE, pp. 308-309.
Huang, C.-W., Chang, C.-J., Lin, M.-Y. and Tai, H.-Y. (2007), "Compact FPGA implementation of 32-bits AES
Area-optimized AES architecture Circuit World
Rajasekar P. and Mangalam H.

algorithm using block RAM”, TENCON 2007-2007 IEEE Srinivas, N.S. and Akramuddin, M.D. (2016), “FPGA based
Region 10 Conference, IEEE, pp. 1-4. hardware implementation of AES Rijndael algorithm for
McLoone, M. and McCanny, J.V. (2001), “High performance encryption and decryption”, International Conference on
single-chip FPGA Rijndael algorithm implementations”, Electrical, Electronics, and Optimization Techniques
International Workshop on Cryptographic Hardware and (ICEEOT), IEEE, pp. 1769-1776.
Embedded Systems, Springer, pp. 65-76. Stallings, W. (2017), Cryptography and Network Security:
Neelima, S. and Brindha, R. (2018), “FPGA-Based principles and Practice, Pearson, Upper Saddle River, NJ.
implementation of AES algorithm using MIX column”, Thulasimani, L. and Madheswaran, M. (2010), “A single
Microelectronics, Electromagnetics and Telecommunications, chip design and implementation of aes-128/192/256
Springer, pp. 233-245. encryption algorithms”, International Journal of
Oukili, S. and Bri, S. (2017), “Hardware implementation of Engineering Science and Technology, Vol. 2,
AES algorithm with logic S-box”, J. Circuits, Systems and pp. 1052-1059.
Computers, Vol. 26 No. 9. Tillich, S., Feldhofer, M. and Großschädl, J. (2006), “Area,
Rais, M.H. and Qasim, S.M. (2009), “FPGA implementation of delay, and power characteristics of standard-cell
Rijndael algorithm using reduced residue of prime numbers”, implementations of the AES S-box”, International
4th International Design and Test Workshop (IDT), IEEE, pp. 1-4. Workshop on Embedded Computer Systems, Springer,
Rajasekar, P. and Mangalam, H. (2015), “Design and pp. 457-466.
implementation of low power multistage AES S box”, Verbauwhede, I., Schaumont, P. and Kuo, H. (2003), “Design
International Journal of Applied Engineering Research, Vol. 10, and performance testing of a 2.29-GB/s Rijndael processor”,
pp. 40535-40540. IEEE Journal of Solid-State Circuits, Vol. 38 No. 3,
Rajasekar, P. and Mangalam, H. (2016a), “Design of low pp. 569-572.
power optimized MixColumn/inverse MixColumn Verser, R. (2020), “Brute Force: Cracking the Data Encryption
architecture for AES”, International Journal of Applied Standard DESCHALL Press Release”, available at: https://web.
Engineering Research, Vol. 11, pp. 922-926. archive.org/:https://web.archive.org/web/20071201071615/
Rajasekar, P. and Mangalam, H. (2016b), “Efficient FPGA http://home.earthlink.net/~rcv007/despr4.htm
implementation of AES 128 bit for IEEE 802.16 e mobile Wong, M.M., Wong, D.M., Zhang, C. and Hijazin, I. (2018),
WiMax standards”, Circuits and Systems, Vol. 7 No. 4,
“Circuit and system design for optimal lightweight AES
p. 371.
encryption on FPGA”, IAENG International Journal of
Rao, M.R. and Sharma, R.K. (2017), “FPGA implementation
Computer Science, Vol. 45 No. 1, pp. 1-10.
of combined AES-128”, 8th International Conference on
Wang, Y. and Ha, Y. (2013), “FPGA-based 40.9-gbits/s
Computing, Communication and Networking Technologies
masked AES with area optimization forstorage area
(ICCCNT), IEEE, pp. 1-6.
network”, IEEE Trans. Circ. Syst. II: Express Briefs, Vol. 60
Rashidi, B. and Rashidi, B. (2013), “Implementation of an
No. 1, pp. 36-40.
optimized and pipelined combinational logic Rijndael S-
Yu, N. (2005), “Compact hardware implementation of AES
Box on FPGA”, International Journal of Computer
with concurrent error detection”, M. Eng. Thesis, Memorial
Network and Information Security, Vol. 5 No. 1,
University of Newfoundland.
pp. 41-48.
Zhang, J., Zuo, Q. and Zhang, T. (2007), “Reducing the power
Rejeb, J., Kaginele, S. and Lee, T. (2006), “Compact and
consumption of the AES S-box by SSC”, International
power conscious private-key cryptosystem for wireless
Conference on Wireless Communications, Networking and Mobile
devices”, International Conference on Wireless and Mobile
Communications (ICWMC’06), IEEE, p. 88. Computing, IEEE, pp. 2226-2229.
Rijmen, V. (2000), Efficient Implementation of the Rijndael S-
Box, Katholieke Universiteit Leuven, Department of
Electrical Engineering ESAT, Belgium.
About the authors

Dr Rajasekar P. received the BE Degree in Electronics and
Communication Engineering from the Sri Krishna College of
Engineering and Technology, Bharathiar University, Coimbatore,
Tamil Nadu, India, in 2003, the ME Degree in Applied Electronics
from PSG College of Technology, Coimbatore, Anna University, in
2008 and the PhD Degree in Information and Communication
Engineering from Anna University, Chennai, in 2018. His current
research interests include cryptography, low power architectural
design in FPGA and network security. He has 14 years of academic
experience and is a member of the International Association of
Engineers (IAENG), The Society of Digital Information and Wireless
Communications and the International Academy for Science and
Technology Education and Research (IASTER). Rajasekar P. is the
corresponding author and can be contacted at:
rajasekarkpr@gmail.com

Dr Mangalam H. received the BE Degree in Electronics and
Communication Engineering in 1988 and the ME Degree in Applied
Electronics in 1993 from PSG College of Technology, Bharathiar
University, Coimbatore, Tamil Nadu, India. She obtained the PhD
Degree in Information and Communication Engineering from Anna
University, Chennai, in 2009. She is a recognized supervisor for
the PhD programme under Anna University, Chennai. Her current
areas of research include low power VLSI design, cryptography,
digital communication and wireless networks. She has 30 years of
academic experience and is a member of professional bodies such as
the Institute of Electrical and Electronics Engineers (IEEE), the
Institution of Engineers (India) (IE), the Indian Society for
Technical Education (ISTE) and the Systems Society of India (SSI).

