FPGA Implementations of
the ICEBERG Block Cipher
François-Xavier Standaert, Gilles Piret, Gael Rouvroy, Jean-Jacques Quisquater
UCL Crypto Group, Place du Levant, 3, B-1348 Louvain-La-Neuve, Belgium.
e-mail: standaert,piret,rouvroy,quisquater@dice.ucl.ac.be

Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC'05)
0-7695-2315-3/05 $20.00 IEEE

Abstract: This paper presents FPGA (Field Programmable Gate Array) implementations of ICEBERG, a block cipher designed for reconfigurable hardware implementations and presented at FSE 2004. All its components are involutional and allow very efficient combinations of encryption/decryption. The proposed implementations also allow changing the key and Encrypt/Decrypt (E/D) mode for every plaintext, without any performance loss. In comparison with other recent block ciphers, the implementation results of ICEBERG show a significant improvement of hardware efficiency. Moreover, the key and E/D agility allows considering new encryption modes to counteract certain side-channel attacks.

I. INTRODUCTION

In October 2000, NIST (National Institute of Standards and Technology) selected Rijndael as the new Advanced Encryption Standard. The selection process included performance evaluation on both software and hardware platforms. However, as implementation versatility was a criterion for the selection of the AES, it appeared that Rijndael was not optimal for reconfigurable hardware implementations. Its highly expensive substitution boxes are a typical bottleneck, but the combination of encryption and decryption in hardware is probably as critical.

ICEBERG is a block cipher designed for efficient reconfigurable hardware implementations. It is based on an involutional structure so that the forward and inverse operations of the cipher may be performed with exactly the same hardware. All its components easily fit into the 4-bit input lookup tables¹ of FPGAs, and its key scheduling allows the round keys to be derived "on the fly" in encryption and decryption mode. In addition to hardware efficiency, the key and E/D agility allows considering new encryption modes to counteract certain side-channel attacks. In practice, very low-cost hardware crypto-processors and high throughput data encryption are potential applications of ICEBERG.

¹ LUTs are 4-bit input function generators and constitute the basic building block of most recent reconfigurable devices.

This paper presents FPGA implementations of ICEBERG and compares their performances with those of recent block ciphers (e.g. AES and NESSIE candidates). Although ICEBERG implementations offer features that most block ciphers do not provide (e.g. key and E/D agility), its implementation results exhibit a significant improvement of hardware efficiency. For this purpose, we investigated various contexts (loop and unrolled implementations, with or without feedback) on the recent Xilinx Virtex-II technology.

The paper is structured as follows. Section 2 briefly presents the specifications of ICEBERG and Section 3 describes our FPGA design methodology. Section 4 lists the combinatorial cost of the block cipher components. The implementation results for various architectures are in Sect. 5 and comparisons with other block ciphers are in Sect. 6. Resistance against side-channel analysis is briefly discussed in Sect. 7. Finally, conclusions are in Sect. 8.

II. SPECIFICATIONS

A. Block and Key Size

ICEBERG operates on 64-bit blocks and uses a 128-bit key. It is an involutional iterative block cipher based on the repetition of 16 identical key-dependent round functions. In the next subsections, we briefly present the algorithm. A more detailed description can be found in the original paper [1].

[Fig. 1. The round function: a non-linear layer (S0, P8, S1, P8, S0) followed by a diffusion and key addition layer (P64, D, ⊕, P4, P64).]

B. The round function

The round function is pictured in Fig. 1, where we distinguish a non-linear layer and a linear diffusion layer.

The non-linear layer is built from the parallel application of 8 × 8 substitution boxes to the cipher state. For efficiency purposes, these boxes are constructed from smaller 4 × 4 S-boxes S0, S1 and bit permutations P8 (i.e. 8-bit wire crossings).

The linear diffusion layer is built from bit permutations P64 (i.e. 64-bit wire crossings), bit permutations P4 (i.e. 4-bit wire crossings), bitwise key additions (denoted as ⊕ in the figure) and small 4 × 4 diffusion boxes D. These boxes perform a simple multiplication:

$$
\begin{pmatrix} y_3 \\ y_2 \\ y_1 \\ y_0 \end{pmatrix}
=
\begin{pmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}
\times
\begin{pmatrix} x_3 \\ x_2 \\ x_1 \\ x_0 \end{pmatrix}
$$

where every output bit is a ⊕ operation between three input bits. It is therefore efficiently combined with the key addition inside a single 4-input LUT.

C. The key schedule

The key scheduling process consists of key expansion and key selection.

The key expansion expands the cipher key $K$ into a sequence of keys $K^0, K^1, \ldots, K^{16}$. We set the initial key $K^0 = K$. The following keys are obtained by a keyround function so that $K^{i+1} = \mathrm{keyround}(K^i)$. The keyround is pictured in Fig. 2, where we distinguish a conditional shift layer, bit permutations P128 (i.e. 128-bit wire crossings) and S-boxes S0. The conditional shift operation depends on a round constant C that is discussed below.

[Fig. 2. The key round: SHIFT Left/Right, P128, a layer of S0 boxes, P128, SHIFT Left/Right.]

Finally, the key selection first performs a simple compression function that selects the 64 bits of $K^i$ having odd indices. Thereafter, a 4 × 4 key selection box is applied in parallel to every 4-bit key-block. It performs the following boolean operation:

$$
\begin{aligned}
y(0) &= \big((x(0) \oplus x(1) \oplus x(2)) \cdot sel\big) \vee \big((x(0) \oplus x(1)) \cdot \overline{sel}\big) \\
y(1) &= \big((x(1) \oplus x(2)) \cdot sel\big) \vee \big(x(1) \cdot \overline{sel}\big) \\
y(2) &= \big((x(2) \oplus x(3) \oplus x(0)) \cdot sel\big) \vee \big((x(2) \oplus x(3)) \cdot \overline{sel}\big) \\
y(3) &= \big((x(3) \oplus x(0)) \cdot sel\big) \vee \big(x(3) \cdot \overline{sel}\big)
\end{aligned}
$$

Depending on the value of a selection bit sel, we obtain the round key $RK_0^i$ or $RK_1^i$ for the round $i$.

D. Encryption/decryption process

The complete cipher consists of an initial round key addition, 15 rounds and a final transform. Due to the involutional structure of every single component of ICEBERG, the E/D mode is fixed with the selection bit only: sel = 1 in encryption and sel = 0 in decryption. In pseudo C, we have:

    ICEBERG(state,cipherkey,sel)
    {
      KeyExpansion(cipherkey,expandedkey[0..16]);
      for (i=0;i<16;i++)
      {
        KeySelection(expandedkey[i],sel,roundkey[i]);
      }
      KeySelection(expandedkey[16],not(sel),roundkey[16]);
      AddRoundKey(state,roundkey[0]);
      for (i=1;i<16;i++)
      {
        Round(state,roundkey[i]);
      }
      NonLinearLayer(state);
      AddRoundKey(state,roundkey[16]);
    }

The round constants are: C = 0 until round 8, C = 1 thereafter. A particular structure of the expanded key is therefore obtained:

$$
K^0 = K^{16}, \quad K^1 = K^{15}, \quad \ldots \tag{1}
$$

As a consequence, ICEBERG allows encryption and decryption with exactly the same hardware (only the selection bit has to be changed) and the expanded key may be derived "on the fly" in encryption and decryption (the storage of round keys is not necessary). More details about this particular structure are available in the FSE 2004 paper [1].

III. DESIGN METHODOLOGY

Present reconfigurable components like FPGAs are usually made of reconfigurable logic blocks combined with fast access memories (RAM blocks) and high speed arithmetic circuits [2], [3]. Basic logic blocks of FPGAs include a 4-input function generator (called lookup table, LUT) and a storage element. In addition, most FPGA manufacturers provide users with fast carry logic and particular structures of the logic blocks to efficiently implement distributed memories, shift registers, etc. A brief description of these components is given in the Appendix.

As reconfigurable components are divided into logic elements and storage elements, an efficient implementation results from a good compromise between the combinatorial logic used, the sequential logic used and the resulting performance. These observations lead to different definitions of implementation efficiency:

1) In terms of performances, let the efficiency of a block cipher be the ratio Throughput (Mbits/s) / Area (LUTs, RAM blocks).
2) In terms of resources, the efficiency is easily tested by computing the ratio Nbr of registers / Nbr of LUTs: it should be close to one.

ICEBERG was designed in order to allow very efficient FPGA implementations and our architectures are defined in order to maximize these notions of hardware efficiency. In practice, this results in the pipelining of the round and keyround functions. Pipelining increases the encryption speed by processing multiple blocks of data simultaneously. It is achieved by inserting rows of registers among combinatorial logic. Parts of logic between two consecutive registers form


pipeline stages and we define the maximum pipeline as the pipeline whose number of stages brings the ratio Nbr of registers / Nbr of LUTs closest to one (while remaining lower than one).

Finally, depending on the optimization criteria, different architectures can be employed. Optimization for maximum speed can be achieved by a fully pipelined unrolled architecture. In the applications requiring minimum area, a loop architecture with only one round implemented seems to be the best choice. For both cases, we tried to maximize the previously defined efficiency. In addition, we provide results of non-pipelined implementations that are useful when a block cipher is used in feedback modes. In the next sections, we present the results of various FPGA implementations of ICEBERG.

IV. COMBINATORIAL COST OF ICEBERG COMPONENTS

As all components perfectly fit into 4-input LUTs, we can directly evaluate their combinatorial cost in the Xilinx Virtex-II family of devices:

Round                                       Keyround
Components               HW cost (LUTs)     Components        HW cost (LUTs)
S0, S1 layers            64                 Shift layer       128
Non-linear layer         64 × 3 = 192       S0 layer          128
Linear diffusion layer   64                 Keyround          384
Round                    256                Selection layer   64

Note that if the maximum pipeline is not inserted, the shift layers can be efficiently implemented inside the Virtex slice, using the additional multiplexers F5 and F6 available next to the LUT [2].

In the next section, we investigate the practical implementation of different architectures for ICEBERG.

V. IMPLEMENTATION RESULTS

All the architectures proposed in this section allow the choice of the key and E/D mode for every plaintext. The area and frequency estimations presented are provided after implementation with Xilinx ISE 6.1 on the Xilinx Virtex-II technology. The timing constraints were applied to the inner clock and we used the input-output (IO) registers embedded into the FPGA IOBs² in order to take the interface constraints into account. It is important to note that the limiting factor of our work frequencies was always the input-output management. As an illustration, the internal clock of the fully pipelined unrolled implementation without IO registers is near to the maximum (380 MHz), but if IO registers are considered, it decreases to 297 MHz, which we believe to be a fair frequency estimation.

² IOBs: Input-Output Blocks.

A. Unrolled architectures

For high throughput applications, we propose an unrolled implementation with the 16 rounds implemented and we applied two pipelining strategies. If a maximum throughput is required, a full pipe implementation is provided, with the maximum pipeline inserted. However, for large designs, the implementation (and especially the routing task) may become the bottleneck, with routing delays larger than logic delays. Therefore, for an optimized efficiency, we propose the half pipe architecture. In addition to a better tradeoff between logic and routing delays, it also allows an efficient implementation of the shift layer, using the additional multiplexers available inside the Virtex slice. Both architectures are pictured in Fig. 3. Finally, if the half pipe architecture is considered, we can also implement the round S-box inside the FPGA RAM blocks. The implementation results for these three proposals are in Table I.

Type        # of slices   # of RAMBs   Latency (cycles)   Out. every (cycles)   Freq. (MHz)   Throughput (Mbits/sec)
Full Pipe   6808          0            66                 1                     297           19008
Half Pipe   4946          0            33                 1                     271           17344
RAM         3132          64           33                 1                     210           13440

TABLE I
UNROLLED ARCHITECTURES RESULTS ON VIRTEX-II.

[Fig. 3. Unrolled architectures: full pipe and half pipe.]

B. Loop architectures

In the applications requiring minimum area, we propose a loop architecture with only one round implemented. In order to decrease the area requirements, we only considered the half pipe strategy. In addition to the efficiency advantages already mentioned, half pipe structures are especially convenient for loop architectures because they allow the combination of the loop multiplexer with the round and keyround logic. Our proposal is pictured in Fig. 4, where we share the initial and final key addition. As for unrolled architectures, it is possible to use the FPGA RAM blocks to implement the round S-box. The implementation results for these loop architectures are provided in Table II.

C. Feedback modes

As soon as a feedback mode is used, pipelining techniques are not relevant for block cipher implementations. This is due


to the fact that multiple blocks of data cannot be managed in parallel because encrypting one block of data requires the result of the previously encrypted block. Although we do not recommend the use of feedback modes in FPGA implementations of block ciphers (they do not allow us to take advantage of hardware efficiency), we propose the following designs for comparison purposes. An unrolled architecture without pipelining and a minimum latency loop architecture are represented in Fig. 5. The implementation results of these designs are in Table III.

[Fig. 4. Loop architecture.]

Type   # of slices   # of RAMBs   Latency (cycles)   Out. every (cycles)   Freq. (MHz)   Throughput (Mbits/sec)
Loop   631           0            34                 2/32                  254           1016
RAM    526           4            34                 2/32                  227           908

TABLE II
LOOP ARCHITECTURE RESULTS ON VIRTEX-II.

[Fig. 5. Feedback mode: unrolled and loop architectures.]

Type       # of slices   # of RAMBs   Latency (cycles)   Out. every (cycles)   Freq. (MHz)   Throughput (Mbits/sec)
Unrolled   3174          0            1                  1                     14            896
Loop       571           0            17                 1/16                  147           588
RAM        467           4            17                 1/16                  145           580

TABLE III
FEEDBACK MODE RESULTS ON VIRTEX-II.

VI. COMPARISONS WITH OTHER BLOCK CIPHERS

Comparing the performances of block cipher hardware implementations is generally a delicate task. This is due to the high dependency of these implementation results on the design methodology, but also to the various commercial FPGAs that may be chosen for evaluation. In the case of ICEBERG, it is even more critical as our implementations provide key and E/D agility: two properties that are never combined in other block cipher implementations³. The following considerations must therefore be taken with care and should be considered as general guidelines more than as a strict comparison.

We tried to find the best results for various block ciphers in non-feedback modes, if possible in the most recent technology (Virtex-II). Then, we provide the area and throughput results. If no RAMBs are used, the ratio Throughput/Area is given in order to estimate the hardware efficiency. We also specify the architecture used (loop or unrolled) and its basic features (encryption only, encryption/decryption, key agility).

In general, ICEBERG implementations exhibit a significant improvement of the hardware efficiency, even if we compare them with encryption-only designs. It is clear that the most relevant implementation schemes for ICEBERG do not use RAMBs because they considerably increase the S-box memory requirements⁴. LUT-only implementations are also the best estimators for ASIC performances and underline the excellent potential of ICEBERG for hardware implementations in general. More specifically, only Rijndael [12] and 3DES have an efficiency comparable to ICEBERG with an E/D structure. However, the specified Rijndael implementation does not provide key and E/D agility, uses RAM blocks and shares resources between the round and keyround. For 3DES, it is well known that it allows very efficient implementation opportunities, and having a comparable efficiency is probably an excellent result for ICEBERG.

³ Except in Triple-DES.
⁴ The ICEBERG S-box memory requirements are: $(2^4 \times 4) \times 6 = 384$ bits. If RAMBs are used, it becomes $2^8 \times 8 = 2048$ bits.

VII. RESISTANCE AGAINST SIDE-CHANNEL ATTACKS

Although cryptosystem designers frequently assume that secret parameters will be manipulated in closed reliable computing environments, Kocher et al. stressed in 1998 [17] that actual computers and microchips leak information correlated to the data handled. Side-channel attacks based on time, power and electromagnetic measurements were successfully applied to smart card implementations of block ciphers. Protecting implementations against side-channel attacks is usually difficult and expensive. Masking all the data with random boolean values is suggested in several papers [18], [19] and the use of small substitution tables allows this to be efficiently implemented, although it is still an expensive solution (the additional cost of masking a $2^n$-bit table is another $2^{2n}$-bit table).
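As a rough illustration of the masking cost mentioned in Section VII, the following Python sketch builds a boolean-masked lookup table for a 4-bit S-box. The S-box values are a placeholder involution chosen for the example and are not taken from this paper; the function names are hypothetical as well. Indexing the table by both the masked input and the mask gives $2^{2n}$ entries for an $n$-bit input, which is one possible reading of the cost quoted in the text.

```python
# Placeholder 4-bit involutive S-box (an assumption for illustration only).
SBOX = [0xD, 0x7, 0x3, 0x2, 0x9, 0xA, 0xC, 0x1,
        0xF, 0x4, 0x5, 0xE, 0x6, 0x0, 0xB, 0x8]
N = 4  # S-box input width in bits


def build_masked_table(sbox, n):
    """Precompute T[(x ^ m, m)] = sbox[x] ^ m for every input x and mask m."""
    size = 1 << n
    table = [0] * (size * size)  # 2^(2n) entries
    for mask in range(size):
        for x in range(size):
            table[((x ^ mask) << n) | mask] = sbox[x] ^ mask
    return table


def masked_lookup(table, masked_x, mask, n):
    """Look up the masked output without ever handling the unmasked input."""
    return table[(masked_x << n) | mask]


MASKED = build_masked_table(SBOX, N)
```

With this layout, a 16-entry S-box needs a 256-entry masked table, which shows why small substitution boxes keep the overhead manageable.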


Algorithm Device Enc. Dec. Key ag. Loop/Unr.
0.22 µm
Twofish [4] Virtex • • U
Serpent [4] Virtex • • U
0.18 µm
Rijndael [5] Virtex-E • • U
Camellia [6] Virtex-E • • U
Khazad [7] Virtex-E • • U
Misty1 [7] Virtex-E • • U
Rijndael [5] Virtex-E • • L
0.15 µm
RC6 [8] Virtex-II • • U
IDEA [9] Virtex-II • • U
SHACAL-1 [10] Virtex-II • • U
3DES [11] Virtex-II • • • U
ICEBERG Virtex-II • • • U
3DES [11] Virtex-II • • • L
ICEBERG Virtex-II • • • L
0.15 µm + RAMBs
Rijndael [12] Virtex-II • • L
ICEBERG Virtex-II • • • L
Rijndael [13] Virtex-II • • L
ICEBERG Virtex-II • • • U

TABLE IV
BASIC FEATURES OF COMPARED BLOCK CIPHERS.

Algorithm         # Slices   # RAMBs   Throughput (Mbits/sec)   Thr./Area (Mbits/sec / slices)
0.22 µm
Twofish [4]       21000      0         15200                    0.72
Serpent [4]       19700      0         16800                    0.85
0.18 µm
Rijndael [5]      2784       100       11776                    -
Camellia [6]      9692       0         6750                     0.7
Khazad [7]        7175       0         7872                     1.10
Misty1 [7]        6322       0         10176                    1.61
Rijndael [5]      2524       0         2085                     1.17
0.15 µm
RC6 [8]           7456       0         4800                     0.64
IDEA [9]          9793       0         6800                     0.69
SHACAL-1 [10]     13729      0         17021                    1.24
3DES [11]         604        0         917                      1.51
ICEBERG           4946       0         17344                    3.51
3DES [11]         227        0         326                      1.44
ICEBERG           631        0         1016                     1.61
0.15 µm + RAMBs
Rijndael [12]     146        3         358                      -
ICEBERG           526        4         908                      -
Rijndael [13]     ≈1125      18        1408                     -
ICEBERG           3132       64        13440                    -

TABLE V
PERFORMANCES OF COMPARED BLOCK CIPHERS.
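The ICEBERG entries of Table V can be re-derived from the clock frequencies and output rates reported in Tables I and II. The Python sketch below is only a consistency check: the helper names are hypothetical, and the relation throughput = block size × frequency × (blocks output per cycle) is our reading of the table columns rather than a formula stated in the paper.

```python
from fractions import Fraction

BLOCK_BITS = 64  # ICEBERG block size in bits


def throughput_mbits(freq_mhz, blocks_per_cycle=Fraction(1)):
    """Throughput in Mbits/sec for a design clocked at freq_mhz MHz."""
    return float(BLOCK_BITS * freq_mhz * blocks_per_cycle)


def efficiency(throughput, slices):
    """Thr./Area ratio in Mbits/sec per slice, rounded as in Table V."""
    return round(throughput / slices, 2)


# Half pipe unrolled design: one block per cycle at 271 MHz.
half_pipe = throughput_mbits(271)
# Loop design: 2 blocks every 32 cycles at 254 MHz.
loop = throughput_mbits(254, Fraction(2, 32))

print(half_pipe, efficiency(half_pipe, 4946))
print(loop, efficiency(loop, 631))
```

Both results agree with the ICEBERG rows of Table V (17344 and 1016 Mbits/sec, with ratios 3.51 and 1.61).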


The key agility provided by ICEBERG (changing the key at every plaintext block is for free) also offers interesting opportunities to prevent certain side-channel attacks by defining new encryption modes where the key is changed sufficiently often. As most side-channel attacks need to collect several leakage traces to remove the noise from useful information, changing the key frequently, even in a well chosen deterministic way (e.g. LFSR-based), could help to counteract (or at least make more difficult) these attacks. A thorough analysis of side-channel resistance based on re-keying techniques deserves further research.

VIII. CONCLUSION

We presented FPGA implementations of ICEBERG, a block cipher designed for hardware implementations. In terms of area requirements, throughput and hardware efficiency, ICEBERG exhibits excellent abilities compared to most recent block ciphers. The simplicity of the design is also considerably improved and allows the fast development of an efficient architecture. In practice, an unrolled (resp. loop) architecture has a throughput of 17.3 Gbits/sec (resp. 1.0 Gbits/sec), using 4946 FPGA slices (resp. 631 FPGA slices) in the Xilinx Virtex-II technology. In addition, ICEBERG allows key and E/D agility. These properties could be used to improve resistance against certain side-channel attacks, although this last point is left as a scope for further research. Due to the simplicity of its component functions, ICEBERG is also likely to exhibit excellent implementation results in hardware in general (not only FPGAs).

REFERENCES

[1] F.-X. Standaert, G. Piret, G. Rouvroy, J.-J. Quisquater, J.-D. Legat, ICEBERG: an Involutional Cipher Efficient for Block Encryption in Reconfigurable Hardware, in the proceedings of FSE 2004, the Fast Software Encryption workshop, New Delhi, February 5-7 2004, Springer-Verlag.
[2] Xilinx: Virtex 2 FPGAs Data Sheet, http://www.xilinx.com.
[3] Altera: Stratix 1.5V FPGAs Data Sheet, http://www.altera.com.
[4] K. Gaj, P. Chodowiec, Fast Implementation and fair Comparison of the Final Candidates for the Advanced Encryption Standard using Field Programmable Gate Arrays, in the proceedings of the RSA Security Conference - Cryptographer's Track, San Francisco, CA, April 8-12, 2001, pp. 84-99.
[5] F.-X. Standaert, G. Rouvroy, J.-J. Quisquater, J.-D. Legat, Efficient Implementation of Rijndael Encryption in Reconfigurable Hardware: Improvements and Design Tradeoffs, in the proceedings of CHES 2003, Lecture Notes in Computer Science, vol. 2779, pp. 334-350, Springer-Verlag, 2003.
[6] T. Ichikawa, T. Sorimachi, T. Kasuya, M. Matsui, On the Criteria of Hardware Evaluation of Block Ciphers, Tech. report of IEICE, ISEC 2001.
[7] F.-X. Standaert, G. Rouvroy, J.-J. Quisquater, J.-D. Legat, Efficient FPGA Implementation of Block Ciphers Khazad and Misty1, 3rd NESSIE Workshop, Munich, Germany, November 2002.
[8] J.-L. Beuchat, FPGA Implementation of the RC6 Block Cipher, in the proceedings of FPL 2003, Lecture Notes in Computer Science, vol. 2778, pp. 101-110, Springer-Verlag, 2003.
[9] J.-L. Beuchat, Modular Multiplication for FPGA Implementation of the IDEA Block Cipher, in the proceedings of ASAP 2003, Application Specific Systems Architectures and Processors, IEEE, 2003.
[10] M. McLoone, J.V. McCanny, Very High Speed 17 Gbps SHACAL Encryption Architecture, in the proceedings of FPL 2003, Lecture Notes in Computer Science, vol. 2778, pp. 111-120, Springer-Verlag, 2003.
[11] G. Rouvroy, F.-X. Standaert, J.-J. Quisquater, J.-D. Legat, Optimizing Cipher FPGA Implementations: DES and Triple DES, in the proceedings of FPL 2003, Lecture Notes in Computer Science, vol. 2778, pp. 181-193, Springer-Verlag, 2003.
[12] G. Rouvroy, F.-X. Standaert, J.-J. Quisquater, J.-D. Legat, Compact and Efficient Encryption/Decryption Module for FPGA Implementation of the AES Rijndael Very Well Suited for Small Embedded Applications, in the proceedings of ITCC 2004, Las Vegas, April 5-7 2004.
[13] Helion Encryption Cores: http://www.heliontech.com/
[14] P. Chodowiec, K. Gaj, Very Compact FPGA Implementation of the AES, in the proceedings of CHES 2003, Lecture Notes in Computer Science, vol. 2779, pp. 319-333, Springer-Verlag, 2003.
[15] A.J. Elbirt, W. Yip, B. Chetwynd, C. Paar, An FPGA Implementation and Performance Evaluation of the AES Block Cipher Candidate Algorithm Finalists, in the proceedings of the Third AES Candidate Conference, April 13-14 2000, New York, USA.
[16] B. Weeks, M. Bean, T. Rozylowicz, C. Ficke, Hardware Performance Simulations of Round 2 Advanced Encryption Standard Algorithms, NSA Final Report on AES candidates, available from http://csrc.nist.gov/CryptoToolkit/aes/round2/NSA-AESfinalreport.pdf
[17] P. Kocher, J. Jaffe, B. Jun, Differential Power Analysis, in the proceedings of CRYPTO 99, Lecture Notes in Computer Science 1666, pp. 388-397, Springer-Verlag.
[18] L. Goubin, J. Patarin, DES and Differential Power Analysis: The Duplication Method, in the proceedings of CHES 1999, Lecture Notes in Computer Science 1717, pp. 158-172, Springer-Verlag.
[19] S. Chari et al., Towards Sound Approaches to Counteract Power-Analysis Attacks, in the proceedings of CRYPTO 1999, Lecture Notes in Computer Science 1666, pp. 398-412, Springer-Verlag.
[20] S. Chari, J. Rao, P. Rohatgi, Template Attacks, in the proceedings of CHES 2002, Lecture Notes in Computer Science 2523, pp. 13-28, Springer-Verlag.

APPENDIX

All the implementation results provided in this paper were obtained using Xilinx Virtex-II devices [2]. In general, FPGAs may be viewed as a "sea" of programmable logic gates where the logic, but also the routing, are user programmable. This section briefly describes these components.

The main element of the Xilinx Virtex-II devices is the Configurable Logic Block (CLB) that is made up of two slices, each one divided into two Logic Cells (LC). An LC includes a 4-input function generator, carry logic and a storage element. The output from the function generator in each LC drives both the CLB output and the D input of the flip-flop. Figure 6 shows a simplified view of a single slice.

Virtex-II function generators are implemented as 4-input LUTs that can also provide a 16 × 1-bit synchronous RAM or a 16-bit shift register. In addition, the F5 multiplexer in each slice combines the LUT outputs. This combination provides a function generator that implements any 5-input function, a 4:1 multiplexer, a 32 × 1-bit synchronous RAM or selected functions of up to nine bits. Similarly, the F6 multiplexer combines the outputs of all four LUTs in the CLB by selecting one of the F5-multiplexer outputs. Finally, the arithmetic logic includes fast carry chains and additional logic gates (e.g. XORCY) to improve the efficiency of adder/multiplier implementations.

[Fig. 6. The Virtex-II slice.]

Virtex-II FPGAs also incorporate several large RAM Blocks (RAMB). These complement the distributed LUT implementations of RAMs. Every block is a fully synchronous dual-ported RAM with independent control signals for each port. The data widths of the two ports can be configured independently.
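To make the 4-input function generator concrete, the Python sketch below models a LUT as a 16-bit truth table and uses it for the diffusion box D of Section II-B, where each output bit is the XOR of three input bits and can therefore share one LUT with the key bit. This is a behavioural model under our own assumptions: the function names and the input ordering are hypothetical and do not describe the authors' netlist or any vendor primitive.

```python
def lut4_init(func):
    """Return the 16-bit truth table of a 4-input boolean function."""
    init = 0
    for addr in range(16):
        bits = [(addr >> i) & 1 for i in range(4)]
        init |= (func(*bits) & 1) << addr
    return init


def lut4_eval(init, a, b, c, d):
    """Evaluate a 4-input LUT: the inputs form the address into the table."""
    addr = a | (b << 1) | (c << 2) | (d << 3)
    return (init >> addr) & 1


# One truth table serves all four output bits: XOR of three data bits and a key bit.
XOR4_INIT = lut4_init(lambda a, b, c, d: a ^ b ^ c ^ d)


def d_box_with_key(x, k):
    """Compute D(x) XOR k on 4-bit values, using one LUT per output bit."""
    xb = [(x >> i) & 1 for i in range(4)]
    kb = [(k >> i) & 1 for i in range(4)]
    y = 0
    for i in range(4):
        # Output bit i is the XOR of the three *other* input bits.
        others = [xb[j] for j in range(4) if j != i]
        y |= lut4_eval(XOR4_INIT, others[0], others[1], others[2], kb[i]) << i
    return y
```

In this model the whole box, including key addition, costs four LUTs per 4-bit block, that is one LUT per state bit, which is consistent with the 64 LUTs listed for the linear diffusion layer in Section IV.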
