FPGA Lec01 Intro
Lecture 1
Introduction and FPGA Architecture
General Computing Flows
Applications
Algorithms
• Filters, compression, neural network, encryption
Hardware description language
• Verilog, VHDL
High-level synthesis
Software
• C, Python, TensorFlow
Computing Platforms
[Figure: platforms placed by computing efficiency against application flexibility]
Dedicated circuits (ASICs)
FPGA (Field Programmable Gate Array)
Software on processors
• Highly crafted hardware
• Large system overhead
• Instructions, OS
What You Will Learn
Computing platform: FPGA-based synthesis & design
Circuit design techniques
• Parallel processing, pipelining
• Folding, unfolding
• Fast convolution, fast FIR
Hardware-friendly application algorithms
• Arithmetic
• Filters
• JPEG decoder
• Viterbi decoder
• Neural network
Books
FPGA Applications
Embedded computing
• Baseband processing, driverless car
Cloud computing, e.g., Microsoft datacenter
Processor accelerator
Design prototyping
FPGA Architecture Overview
Wire switches provide flexibility, but are slow
FPGA Programming Technology
Flash
Anti-fuse
SRAM
How to Store 0 and 1 in ROM?
[Figure: two ROM cells, one storing 0 and one storing 1, each read with V_read = 1]
Flash (EEPROM) Transistor
V_read is applied on the control gate
• High threshold (Hi_Vt): cannot be turned on
• Low threshold (Lo_Vt): can be turned on
NAND Flash
• Word line > Hi_Vt if not read
• Erased: bit line == 0
• Programmed: bit line == 1
NOR Flash
• Word line < Lo_Vt if not read
• Erased: bit line == 0
• Programmed: bit line == 1
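The NAND and NOR read rules above can be sketched as a toy threshold model. The voltage values and the names `conducts`, `read_nand`, and `read_nor` are assumptions made for illustration only; the sketch just assumes Lo_Vt < V_read < Hi_Vt and that a conducting path pulls the bit line to 0.

```python
# Toy read model of floating-gate flash cells.
# All voltage values below are illustrative assumptions, not device data.
LO_VT = 1.0   # threshold of an erased cell
HI_VT = 4.0   # threshold of a programmed cell
V_READ = 2.5  # read voltage, chosen so that LO_VT < V_READ < HI_VT
V_PASS = 5.0  # NAND: voltage on word lines that are not read (> HI_VT)
V_OFF = 0.0   # NOR: voltage on word lines that are not read (< LO_VT)


def conducts(gate_voltage, programmed):
    """A cell turns on only if its control-gate voltage exceeds its threshold."""
    threshold = HI_VT if programmed else LO_VT
    return gate_voltage > threshold


def read_nand(string, selected):
    """Cells are in series: the bit line reads 0 only if every cell conducts."""
    gates = [V_READ if i == selected else V_PASS for i in range(len(string))]
    path_on = all(conducts(v, p) for v, p in zip(gates, string))
    return 0 if path_on else 1


def read_nor(column, selected):
    """Cells are in parallel: the bit line reads 0 if any cell conducts."""
    gates = [V_READ if i == selected else V_OFF for i in range(len(column))]
    path_on = any(conducts(v, p) for v, p in zip(gates, column))
    return 0 if path_on else 1


cells = [False, True, False, True]  # True = programmed, False = erased
nand_bits = [read_nand(cells, i) for i in range(len(cells))]
nor_bits = [read_nor(cells, i) for i in range(len(cells))]
```

In both arrangements the unselected cells are biased so that only the selected cell decides the bit-line value: forced on in the series (NAND) string, forced off in the parallel (NOR) column.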
Pros and Cons of Flash FPGA
Pros
• Non-volatile
• Smaller dimension than SRAM
• Live at power-up
• Reprogrammable
• Tolerance to soft errors
Cons
• Write requires high voltage
• Hard to follow the most advanced process technology
• Limited number of writes
• Large resistance and load capacitance
Logic function: mapping from address to memory content
Antifuse for Poly Connection
[Figure: antifuse cross-section. Labels: conductor, insulator, routing tracks, Metal 3, amorphous silicon/dielectric antifuse, Metal 2, Metal 1, tungsten plug, contact, silicon substrate]
Figure 5.27 Actel® ProASICPLUS FPGA (Actel Corp. 2007c). Reproduced by permission of Actel Corp.
Pros and Cons of Antifuse FPGA
Pros
• Non-volatile
• Small dimension
• High density
• Small resistance
• Small load capacitance
• Tolerance to soft errors
Cons
• Cannot be reconfigured
• 2 transistors to program each connection
• Long programming time
• Hard to guarantee 100% yield
Pros and Cons of SRAM FPGA
Pros
• Most advanced CMOS process
• Reconfigurable
• Unlimited writes
Cons
• Large dimension
• Volatile
• Sensitive to soft errors
• Large resistance and load capacitance
Lookup Table (LUT)
[Figure: 4-input LUT. The function inputs select one of the SRAM cells, and the content of that cell is the function output]
Y = f(A,B,C,D)
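A minimal sketch of a 4-input LUT as a small memory: the truth table of f is stored in 16 cells and the inputs form the read address. The helper names `make_lut` and `lut_read`, the bit ordering, and the example function are assumptions chosen for illustration.

```python
from itertools import product


def make_lut(func, k=4):
    """Fill 2**k 'SRAM cells' with the truth table of func."""
    return [int(bool(func(*bits))) for bits in product((0, 1), repeat=k)]


def lut_read(sram, *inputs):
    """Use the inputs as an address (the first input is the most significant bit)."""
    address = 0
    for bit in inputs:
        address = (address << 1) | bit
    return sram[address]


def f(a, b, c, d):
    """Example function, chosen only for illustration: Y = (A and B) xor (C or D)."""
    return (a & b) ^ (c | d)


sram = make_lut(f)
y = lut_read(sram, 1, 1, 0, 0)  # same value as f(1, 1, 0, 0)
```

Any function of four inputs fits in the same 16 cells; only the stored bits change, which is what makes the LUT reconfigurable.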
Basic Logic Element (BLE)
[Figure: a 4-LUT followed by a D flip-flop (D, Q)]
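A behavioral sketch of a BLE, assuming the common arrangement in which a configuration bit selects either the registered output Q or the LUT output directly. That output select, the class name `BLE`, and the method names are assumptions; the slide only shows the 4-LUT and the flip-flop.

```python
class BLE:
    """Toy basic logic element: a 4-LUT, a D flip-flop, and an output select."""

    def __init__(self, sram, registered):
        assert len(sram) == 16
        self.sram = list(sram)
        self.registered = registered  # assumed config bit: output Q or the LUT directly
        self.q = 0                    # flip-flop state

    def lut(self, a, b, c, d):
        """Look up the stored bit addressed by the four inputs."""
        return self.sram[(a << 3) | (b << 2) | (c << 1) | d]

    def output(self, a, b, c, d):
        """Current BLE output for the given inputs."""
        return self.q if self.registered else self.lut(a, b, c, d)

    def clock(self, a, b, c, d):
        """Rising clock edge: the flip-flop captures the LUT output."""
        self.q = self.lut(a, b, c, d)


and4 = [0] * 15 + [1]  # 4-input AND: only address 0b1111 stores a 1
comb = BLE(and4, registered=False)
reg = BLE(and4, registered=True)
before = reg.output(1, 1, 1, 1)  # flip-flop not clocked yet
reg.clock(1, 1, 1, 1)
after = reg.output(0, 0, 0, 0)   # Q holds the captured value
```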
Carry Logic
[Figure: carry logic next to a LUT. Labels: inputs a and b, LUT, carry logic (multiplexer with inputs 0 and 1), sum, Cout]
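One common way to wire such a carry chain, sketched under assumptions: the LUT computes the propagate signal a xor b, a multiplexer passes the incoming carry when propagate is 1 and otherwise forwards a, and the sum is propagate xor carry-in. The slide's exact wiring is not recoverable from the figure labels, so the function names and the wiring below are illustrative.

```python
def adder_bit(a, b, cin):
    """One bit of a carry chain: LUT for propagate, multiplexer for the carry."""
    p = a ^ b               # propagate, computed by the LUT
    cout = cin if p else a  # carry multiplexer: select cin when p == 1, else a
    s = p ^ cin             # sum bit
    return s, cout


def ripple_add(x, y, width):
    """Chain `width` adder bits; returns (sum modulo 2**width, carry out)."""
    carry = 0
    result = 0
    for i in range(width):
        s, carry = adder_bit((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry


total, carry_out = ripple_add(11, 6, 4)  # 11 + 6 = 17 = 0b1_0001
```

The carry travels through dedicated multiplexers rather than through the general routing switches, which is why dedicated carry logic speeds up arithmetic.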
Logic Cluster
[Figure: logic cluster with 10 inputs, a 14×16 switch matrix, and four BLEs (BLE0 to BLE3) with 4 inputs each]
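The 14×16 size of the switch matrix can be reproduced with a short sketch, assuming a full crossbar whose sources are the 10 cluster inputs plus the 4 fed-back BLE outputs and whose sinks are the 4 inputs of each of the 4 BLEs. The feedback reading and the names `switch_matrix_shape` and `route` are assumptions.

```python
def switch_matrix_shape(cluster_inputs, num_bles, lut_inputs):
    """Return (sources, sinks) of the assumed full crossbar inside a cluster."""
    sources = cluster_inputs + num_bles  # external inputs plus fed-back BLE outputs
    sinks = num_bles * lut_inputs        # every LUT input pin of every BLE
    return sources, sinks


def route(selection, cluster_in, ble_out):
    """selection[j] is the index of the source driving BLE input pin j."""
    sources = list(cluster_in) + list(ble_out)
    return [sources[s] for s in selection]


shape = switch_matrix_shape(cluster_inputs=10, num_bles=4, lut_inputs=4)

# Hypothetical configuration: pin j listens to source j modulo 14.
cluster_in = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
ble_out = [1, 1, 0, 0]
pins = route([j % 14 for j in range(16)], cluster_in, ble_out)
```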
Logic Hierarchy
Xilinx (now part of AMD)
• Super Logic Region (SLR): a die on interposer
• CLB (Configurable Logic Block)
Altera (now part of Intel)
• Adaptive Logic Module (ALM)
• LUT
[Figure: Fig. 2. CLB and BLE in Xilinx UltraScale architecture. Source: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 38, no. 11, November 2019]
From the text accompanying the figure: the 2 FFs in a BLE must share the same clock (CK) and set/reset (SR) signals; however, their clock enable (CEA and CEB) signals can be different.
Routing Architecture
Segment length: tradeoff between routing flexibility and timing performance
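A toy model of the segment-length tradeoff, under the simplifying assumption that a connection is built from equal-length segments joined by wire switches. Fewer switches in series suggests better timing, while unused overshoot suggests coarser, less flexible routing. The function name `route_cost` and the unit of distance (logic blocks) are assumptions, and the model ignores real wire and switch delays.

```python
import math


def route_cost(distance, segment_length):
    """Return (switches in series, unused overshoot) for one connection."""
    segments = math.ceil(distance / segment_length)
    switches = segments - 1                          # one switch between consecutive segments
    overshoot = segments * segment_length - distance  # wire driven beyond the target
    return switches, overshoot


short_hop = {length: route_cost(1, length) for length in (1, 2, 4)}
long_hop = {length: route_cost(6, length) for length in (1, 2, 4)}
```

For the 6-block connection, length-4 segments need one switch instead of five, but the 1-block connection then drives three blocks of unneeded wire.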
Routing Switches
IO Blocks (Xilinx)
Xilinx 4000 IOB
DSP Blocks (Altera Cyclone V)
On-Chip Memory
Block RAM, dedicated
Distributed RAM, configured from logic blocks
[Figure: Figure 5.21 Block RAM logic diagram (Xilinx Inc. 2007b). Reproduced by permission of Xilinx Inc. Source: FPGA-based Implementation of Signal Processing Systems]
Table 5.9 Memory sizes for Xilinx Virtex™-5 block …
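Distributed RAM configured from logic blocks can be sketched by treating each 4-input LUT as a writable 16×1 memory and letting the upper address bits choose the LUT. The class names, the 16×1 granularity, and the 64-word depth are assumptions for illustration; actual device primitives differ by family.

```python
class LutRam16x1:
    """A 4-input LUT whose 16 SRAM cells are writable at run time."""

    def __init__(self):
        self.cells = [0] * 16

    def write(self, addr, bit):
        self.cells[addr & 0xF] = bit & 1

    def read(self, addr):
        return self.cells[addr & 0xF]


class DistributedRam:
    """depth x 1 RAM built from 16 x 1 LUT RAMs; the high address bits pick the LUT."""

    def __init__(self, depth):
        assert depth % 16 == 0
        self.luts = [LutRam16x1() for _ in range(depth // 16)]

    def write(self, addr, bit):
        self.luts[addr >> 4].write(addr, bit)

    def read(self, addr):
        return self.luts[addr >> 4].read(addr)


ram = DistributedRam(64)  # uses four LUTs
ram.write(37, 1)          # 37 = 0b10_0101: LUT 2, cell 5
```

Each LUT used this way is no longer available for logic, which is the cost that dedicated block RAM avoids for larger memories.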
Modern Architecture
[Figure: Fig. 1. Typical island-style FPGA. Source: "… and Congestion Aware Packing," …i Li, Shounak Dhar, and David Z. Pan (the paper proposing UTPlaceF)]