Chapter 5:
Register-Transfer Level
(RTL) Design
Slides to accompany the textbook Digital Design, with RTL Design, VHDL,
and Verilog, 2nd Edition,
by Frank Vahid, John Wiley and Sons Publishers, 2010.
http://www.ddvahid.com
Introduction
• Chpt 2
– Capture comb. behavior: Equations, truth tables
– Convert to circuit: AND + OR + NOT → Comb. logic
• Chpt 3
– Capture sequential behavior: FSMs
– Convert to circuit: Register + Comb. logic → Controller
• Chpt 4
– Datapath components, simple datapaths
• Chpt 5
– Capture behavior: High-level state machine
– Convert to circuit: Controller + Datapath → Processor
– Known as “RTL” (register-transfer level) design
• Processors:
– Programmable (microprocessor)
– Custom
[Figure: Levels of digital design abstraction, from higher levels down: register-transfer level (RTL), logic level, transistor level]
Digital Design 2e
Copyright © 2010 2
Frank Vahid
Note: Slides with animation are denoted with a small red "a" near the animated items
5.2
HLSMs
[Figure (a)-(c): HLSM examples with transition conditions m and m'; state S_Inc performs Preg := Preg + 1 // Increment Preg]
Note: Could have designed directly using an up-counter. But that methodology is ad hoc, and won't work for more complex examples, like the next one.
Example: Laser-Based Distance Measurer
[Figure: a laser pulse travels distance D to the object of interest and reflects back to the sensor; T (in seconds) is the round-trip time]
2D = T sec × 3×10^8 m/sec
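The relation between round-trip time and distance can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and the 300 MHz default clock is taken from the clock value that appears later in this example.

```python
SPEED_OF_LIGHT_M_PER_S = 3e8  # value used on the slide


def distance_m(round_trip_s: float) -> float:
    """Distance D in meters from round-trip time T, using 2D = T * 3*10^8."""
    return round_trip_s * SPEED_OF_LIGHT_M_PER_S / 2


def distance_from_cycles(cycles: int, clock_hz: float = 300e6) -> float:
    """Distance when T is measured as a count of clock cycles (assumed 300 MHz)."""
    return distance_m(cycles / clock_hz)
```

At 300 MHz, light travels 1 m per clock cycle, so the distance in meters is simply the cycle count divided by 2.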
Example: Laser-Based Distance Measurer
[Figure: Laser-based distance measurer block with input B (from button), output L (to laser), input S (from sensor), and 16-bit output D (to display); T (in seconds) is the laser's round-trip time]
• Inputs/outputs
– B: bit input, from button, to begin measurement
– L: bit output, activates laser
– S: bit input, senses laser reflection
– D: 16-bit output, to display computed distance
Example: Laser-Based Distance Measurer
[Figure: Laser-based distance measurer block: B from button, L to laser, S from sensor, D (16 bits) to display]
DistanceMeasurer
Inputs: B (bit), S (bit)
Outputs: L (bit), D (16 bits)
Local storage: Dreg(16) (required)
S0 (first state usually initializes the system):
L := '0' // laser off
Dreg := 0 // distance is 0
Example: Laser-Based Distance Measurer
[Figure: Laser-based distance measurer block: B from button, L to laser, S from sensor, D (16 bits) to display]
DistanceMeasurer
...
S0: L := '0'; Dreg := 0
S1: B': stay in S1 // button not pressed
    B: go to S2 // button pressed
S2: L := '1' // laser on
S3: L := '0' // laser off
Example: Laser-Based Distance Measurer
[Figure: Laser-based distance measurer block: B from button, L to laser, S from sensor, D (16 bits) to display]
DistanceMeasurer
Inputs: B (bit), S (bit)
Outputs: L (bit), D (16 bits)
Local storage: Dreg, Dctr (16 bits)
S0: L := '0'; Dreg := 0
S1: Dctr := 0 // reset cycle count
    B': stay in S1; B: go to S2
S2: L := '1'
S3: L := '0'; Dctr := Dctr + 1 // count cycles
    S': stay in S3 // no reflection
    S: go to S4 // reflection
S4: Dreg := Dctr/2 // calculate D
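A cycle-by-cycle simulation can help confirm that the DistanceMeasurer HLSM behaves as intended. The sketch below makes several assumptions that are not stated on the slide: S0 goes to S1 and S2 goes to S3 unconditionally, S4 returns to S1, output L is 0 in any state that does not set it to '1', and the function name is hypothetical.

```python
MASK16 = 0xFFFF  # Dreg and Dctr are 16 bits wide


def simulate_distance_measurer(inputs):
    """Run the HLSM for one clock cycle per (B, S) pair.

    Returns (trace, final_dreg); each trace entry is the
    (state, L, D) visible during that cycle.
    """
    state, dreg, dctr = "S0", 0, 0
    trace = []
    for b, s in inputs:
        laser = 0  # assumed default when a state does not assign L
        next_state, next_dreg, next_dctr = state, dreg, dctr
        if state == "S0":
            next_dreg = 0
            next_state = "S1"
        elif state == "S1":
            next_dctr = 0                      # reset cycle count
            next_state = "S2" if b else "S1"
        elif state == "S2":
            laser = 1                          # laser on
            next_state = "S3"
        elif state == "S3":
            next_dctr = (dctr + 1) & MASK16    # count cycles
            next_state = "S4" if s else "S3"
        elif state == "S4":
            next_dreg = dctr >> 1              # Dreg := Dctr/2
            next_state = "S1"                  # assumed return transition
        trace.append((state, laser, dreg))
        # Clock edge: all storage items update together.
        state, dreg, dctr = next_state, next_dreg, next_dctr
    return trace, dreg
```

Computing every next value from the current values, and committing them together at the end of the loop body, mirrors how registers load on the clock edge.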
[Figure (a): HLSM with states S0 (P := '0'; Jreg := 1) and S1 (P := '1'; Jreg := Jreg + 1) and transition conditions B', B, Jreg<2, and !(Jreg<2). Figure (b): timing diagram of Jreg (values 1, 2, 3) and P]
5.3
Ctrl/DP Example for Earlier Cycles- CountHigh a
RTL Design Process
Example: Soda Dispenser from Earlier
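• Quick overview example. More details of each step to come.
Step 1: HLSM
SodaDispenser
Inputs: c (bit), a (8 bits), s (8 bits)
Outputs: d (bit) // '1' dispenses soda
Local storage: tot (8 bits)
Init: d := '0'; tot := 0
Wait: c: go to Add; c'*(tot<s): stay in Wait; c'*(tot<s)': go to Disp
Add: tot := tot + a
Disp: d := '1'
Step 2A: Datapath
[Figure: 8-bit register tot (controls tot_ld, tot_clr), 8-bit adder computing tot + a, and 8-bit comparator (<) producing tot_lt_s from tot and s]
Step 2B: Controller connected to Datapath
[Figure: Controller with input c and output d; control signals tot_ld and tot_clr go to the Datapath; status signal tot_lt_s comes from the Datapath; data inputs s and a (8 bits each) go to the Datapath]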
Example: Soda Dispenser
• Quick overview example. More details of each step to come.
SodaDispenser (HLSM)
Inputs: c (bit), a (8 bits), s (8 bits)
Outputs: d (bit) // '1' dispenses soda
Local storage: tot (8 bits)
Init: d := '0'; tot := 0
Wait: c: go to Add; c'*(tot<s): stay in Wait; c'*(tot<s)': go to Disp
Add: tot := tot + a
Disp: d := '1'
Step 2B
[Figure: Controller (input c, output d) connected to the Datapath (inputs s and a, 8 bits each) through tot_ld, tot_clr, and tot_lt_s]
Step 2C: Controller FSM
Inputs: c, tot_lt_s (bit)
Outputs: d, tot_ld, tot_clr (bit)
Init: d=0; tot_clr=1
Wait: c: go to Add; c'*tot_lt_s: stay in Wait; c'*tot_lt_s': go to Disp
Add: tot_ld=1
Disp: d=1
Example: Soda Dispenser
• Quick overview example. More details of each step to come.
Step 2C: Controller FSM
Inputs: c, tot_lt_s (bit)
Outputs: d, tot_ld, tot_clr (bit)
Init: d=0; tot_clr=1
Wait: c: go to Add; c'*tot_lt_s: stay in Wait; c'*tot_lt_s': go to Disp
Add: tot_ld=1
Disp: d=1
Use controller design process (Ch3) to complete the design. State table rows shown on the slide:
State  s1 s0  c  tot_lt_s | n1 n0  d  tot_ld  tot_clr
Init    0  0  0     0     |  0  1  0    0       1
Init    0  0  0     1     |  0  1  0    0       1
Init    0  0  1     0     |  0  1  0    0       1
Init    0  0  1     1     |  0  1  0    0       1
Wait    0  1  0     0     |  1  1  0    0       0
Wait    0  1  0     1     |  0  1  0    0       0
Wait    0  1  1     0     |  1  0  0    0       0
Wait    0  1  1     1     |  1  0  0    0       0
Add     1  0  0     0     |  0  1  0    1       0
Disp    1  1  0     0     |  0  0  1    0       0
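The controller FSM and datapath of the soda dispenser can be exercised together in software. The following is a minimal sketch, assuming one (c, a) input pair per clock cycle, that tot_clr takes priority over tot_ld, and that the 8-bit adder drops any carry; the function name is hypothetical.

```python
def soda_dispenser(s, events):
    """Simulate controller + datapath; returns the d output for each cycle.

    s      -- soda cost (8 bits)
    events -- list of (c, a) pairs, one per clock cycle
    """
    state, tot = "Init", 0
    d_trace = []
    for c, a in events:
        tot_lt_s = tot < s                 # datapath comparator output
        d = tot_ld = tot_clr = 0           # controller outputs default to 0
        if state == "Init":
            tot_clr = 1
            next_state = "Wait"
        elif state == "Wait":
            if c:
                next_state = "Add"
            elif tot_lt_s:
                next_state = "Wait"
            else:
                next_state = "Disp"
        elif state == "Add":
            tot_ld = 1
            next_state = "Wait"
        else:                              # Disp
            d = 1
            next_state = "Init"
        d_trace.append(d)
        # Clock edge: register tot clears or loads the adder output.
        if tot_clr:
            tot = 0
        elif tot_ld:
            tot = (tot + a) & 0xFF
        state = next_state
    return d_trace
```

For example, with s = 50 and two 25-cent coins, d goes high once the total is no longer less than s:

```python
events = [(0, 0), (1, 25), (0, 25), (1, 25), (0, 25), (0, 0), (0, 0)]
print(soda_dispenser(50, events))  # [0, 0, 0, 0, 0, 0, 1]
```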
RTL Design Process—Step 2A: Create a datapath
• Sub-steps
– HLSM data inputs/outputs → Datapath inputs/outputs.
– HLSM local storage item → Instantiated register
• "Instantiate": Add new component ("instance") to design
– Each HLSM state action and transition condition data computation → Datapath components and connections
• Also instantiate multiplexors as needed
• Need component library from which to choose
Component library:
reg (inputs clr, ld, I; output Q): clk^ and clr=1: Q=0; clk^ and ld=1: Q=I; else Q stays same
add (inputs A, B; output S): S = A+B (unsigned)
cmp (inputs A, B; outputs lt, eq, gt): A<B: lt=1; A=B: eq=1; A>B: gt=1
shift<L/R> (input I; output Q): shiftL1: <<1; shiftL2: <<2; shiftR1: >>1; ...
mux2x1 (inputs I1, I0, s0; output Q): s0=0: Q=I0; s0=1: Q=I1
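The component behaviors listed above are simple enough to model directly. The sketch below uses hypothetical Python names and assumes fixed bit widths with overflow bits dropped, and clr taking priority over ld when both are 1 (the slide does not say which wins).

```python
class Reg:
    """Register: on a clock edge, clr=1 clears Q, else ld=1 loads I."""

    def __init__(self, width):
        self.mask = (1 << width) - 1
        self.q = 0

    def clock(self, i=0, ld=0, clr=0):
        if clr:                      # assumed priority: clr over ld
            self.q = 0
        elif ld:
            self.q = i & self.mask
        return self.q                # otherwise Q stays the same


def add(a, b, width):
    """Unsigned adder; the carry out is dropped in this sketch."""
    return (a + b) & ((1 << width) - 1)


def cmp(a, b):
    """Comparator: returns (lt, eq, gt), exactly one of which is 1."""
    return int(a < b), int(a == b), int(a > b)


def shift_r1(i):
    """shiftR1: shift right by one position."""
    return i >> 1


def shift_l1(i, width):
    """shiftL1: shift left by one position, keeping `width` bits."""
    return (i << 1) & ((1 << width) - 1)


def mux2x1(i1, i0, s0):
    """2x1 mux: s0=0 selects I0, s0=1 selects I1."""
    return i1 if s0 else i0
```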
Step 2A: Create a Datapath—Simple Examples
[Figure (a)-(d): four simple datapaths (DP) with data inputs X, Y, Z, built from library components: adders (add1, add2), registers with clr = 0 and ld = 1 (Preg, regQ), and a mux2x1 with select input k. In (a), add1 computes X+Y, add2 computes X+Y+Z, and Preg drives output P. Outputs: (a) P, (b) P, (c) P and Q, (d) P]
Laser-Based Distance Measurer—Step 2A: Create a Datapath
DistanceMeasurer
Inputs: B (bit), S (bit)
Outputs: L (bit), D (16 bits)
Local storage: Dreg, Dctr (16 bits)
S0: L := '0'; Dreg := 0
S1: Dctr := 0; B': stay in S1; B: go to S2
S2: L := '1'
S3: L := '0'; Dctr := Dctr+1; S': stay in S3; S: go to S4
S4: Dreg := Dctr/2 // calculate D
• HLSM data I/O → DP I/O
• HLSM local storage → reg
• HLSM state action and transition condition data computation → datapath components and connections
[Figure: Datapath with Add1: add(16), which adds the constant 1, Shr1: shiftR1(16), and 16-bit registers with control inputs such as Dreg_clr and Dreg_ld]
[Figure: Controller (B from button, L to laser, S from sensor) connected to the Datapath through Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_ld; Datapath output D (16 bits) goes to display; 300 MHz Clock]
Laser-Based Distance Measurer—Step 2C: Derive the Controller FSM
HLSM
DistanceMeasurer
Inputs: B (bit), S (bit)
Outputs: L (bit), D (16 bits)
Local storage: Dreg, Dctr (16 bits)
S0: L := '0'; Dreg := 0
S1: Dctr := 0; B': stay in S1; B: go to S2
S2: L := '1'
S3: L := '0'; Dctr := Dctr+1; S': stay in S3; S: go to S4
S4: Dreg := Dctr/2 // calculate D
[Figure: Datapath: Add1: add(16) adds 1 to the output of Dctr: reg(16) (controls Dctr_clr, Dctr_ld); Shr1: shiftR1(16) feeds Dreg: reg(16) (controls Dreg_clr, Dreg_ld), whose 16-bit output is D]
[Figure: controller FSM with the same states S0 to S4 and transition inputs B and S as the HLSM]
More components:
sub (inputs A, B; output S): S = A-B (signed)
mul (inputs A, B; output P): P = A*B (unsigned)
abs (input A; output Q): Q = |A|
upcnt (inputs clr, inc; output Q): clk^ and clr=1: Q=0; clk^ and inc=1: Q=Q+1; else Q stays same
RF (inputs W_d, W_a, W_e, R_a, R_e; output R_d): clk^ and W_e=1: RF[W_a] = W_d; R_e=1: R_d = RF[R_a]
RTL Design Involving Register File or Memory
• HLSM array: Ordered list of items
– Ex: Local storage: A[4](8-bit) – 4 8-bit items
– Accessed using notation "A[i]", i is index
– A[0] := 9; A[1] := 8; A[2] := 7; A[3] := 22
• Array contents now: <9, 8, 7, 22>
• X := A[1] will set X to 8
• Note: First element's index is 0
• Array can be mapped to instantiated register file or memory
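To illustrate how an HLSM array maps onto a register file, here is a small sketch. The class and method names are hypothetical; the write and read operations stand in for the register file's W_a/W_d/W_e and R_a/R_e/R_d ports.

```python
class RegisterFile:
    """Register file holding `words` items of `width` bits each."""

    def __init__(self, words, width):
        self.mask = (1 << width) - 1
        self.words = [0] * words

    def write(self, w_a, w_d):
        """Models a clock edge with W_e=1: RF[W_a] = W_d."""
        self.words[w_a] = w_d & self.mask

    def read(self, r_a):
        """Models R_e=1: R_d = RF[R_a]."""
        return self.words[r_a]


# Local storage: A[4](8-bit)
A = RegisterFile(words=4, width=8)
for index, value in enumerate([9, 8, 7, 22]):
    A.write(index, value)   # A[0] := 9; A[1] := 8; A[2] := 7; A[3] := 22

X = A.read(1)               # X := A[1]
```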
(a) HLSM
ArrayEx Inputs: (none)
Init2: A[1] := 12; transition conditions A[0] == 8 and (A[0] == 8)'
Out1: Preg := A[1]
(b) Datapath (DP)
[Figure: Amux (mux2x1, 11 bits wide, select A_s) chooses between constants 12 (I1) and 9 (I0) and drives W_d of register file A: RF[4](11) (A_Wa1, A_Wa0, A_We, A_Ra1, A_Ra0, A_Re); comparator Acmp compares R_d with 8 and its eq output is A_eq_8; register Preg (Preg_clr, Preg_ld) drives 11-bit output P]
(c) Controller
ArrayEx Inputs: A_eq_8
Outputs: A_s, A_Wa0, ...
Init1: Preg_clr = 1; A_s = 0; A_Wa1=0, A_Wa0=0; A_We = 1
Init2: A_s = 1; A_Wa1=0, A_Wa0=1; A_We = 1; A_Ra1=0, A_Ra0=0; A_Re = 1; transition conditions A_eq_8 and (A_eq_8)'
Out1: Preg_ld = 1
RTL Example: Video Compression – Sum of Absolute Differences
[Figure: Frame 1 and Frame 2 of a video; only difference: ball moving]
• S0: wait for go
• S1: initialize sum and index
• S2: check if done ( (i<256)’ )
• S3: add difference to sum, increment index
• S4: done, write to output sad_reg
[Figure (b): HLSM fragment: S2 goes to S3 on i<256 and to S4 on (i<256)’; S3: sum:=sum+abs(A[i]-B[i]); i := i + 1; S4: sadreg := sum]
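Array Example: Video Compression: Sum-of-Absolute Differences
Inputs: A, B [256](8 bits); go (bit)
Outputs: sad (32 bits)
Local storage: sum, sadreg (32 bits); i (9 bits)
S0: !go: stay in S0; go: go to S1
S1: sum := 0; i := 0
S2: i<256: go to S3; !(i<256): go to S4
S3: sum:=sum+abs(A[i]-B[i]); i := i + 1
S4: sadreg := sum
[Figure: Datapath: register i (controls i_inc, i_clr) addresses A and B; a 9-bit comparator against 256 produces i_lt_256; a subtractor (–) takes the 8-bit values from A and B; register sum has controls sum_ld and sum_clr]
[Figure: Controller FSM derived from the HLSM: S0 tests go; S1: sum_clr=1, i_clr=1 (replacing sum=0, i=0); S2 tests (i_lt_256)’ (replacing (i<256)’)]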
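The sum-of-absolute-differences HLSM can be traced in software to check both the result and the number of clock cycles it takes. This is a sketch under stated assumptions: go is held at 1 from the first cycle, the function name and the `n` parameter are hypothetical, and the cycle count includes the S0 and S4 cycles.

```python
def sad_hlsm(A, B, n=256):
    """Step through states S0..S4; returns (sadreg, cycles_used)."""
    state, total, i, cycles = "S0", 0, 0, 0
    go = 1  # assumed: go is already asserted
    while True:
        cycles += 1
        if state == "S0":
            state = "S1" if go else "S0"
        elif state == "S1":
            total, i = 0, 0                       # sum := 0; i := 0
            state = "S2"
        elif state == "S2":
            state = "S3" if i < n else "S4"       # check if done
        elif state == "S3":
            # Both updates use the current i, as in a single clock edge.
            total, i = (total + abs(A[i] - B[i])) & 0xFFFFFFFF, i + 1
            state = "S2"
        else:                                     # S4
            return total, cycles                  # sadreg := sum
```

Each array element costs two cycles (S2 and S3), so the loop dominates the run time.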
Common RTL Design Pitfall Involving Storage Updates
Local storage: R, Q (8 bits)
State A: R:=99; Q:=R
State B: R:=R+1; R<100: go to C; (R<100)': go to D
• Questions
– Value of Q after state A?
– Final state is C or D?
• Answers
– Q is NOT 99 after state A
– R is 99 in state B, so final state is C
– Storage update actions in state occur simultaneously on next clock edge
• Thus, order actions are written is irrelevant
• A's actions same if:
– Q:=R R:=99 or
– R:=99 Q:=R
[Timing diagram: clk; states A, B, C; R: ?, 99, 100; Q: ?, ?, ?]
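The simultaneous-update rule can be demonstrated by computing every next value from the current values before committing any of them. In this sketch the function names are hypothetical and the unknown initial value of R (shown as "?" on the slide) is represented by an arbitrary number.

```python
def state_a(r, q, order=("R:=99", "Q:=R")):
    """Apply state A's actions; returns (next_r, next_q)."""
    next_r, next_q = r, q
    for action in order:
        if action == "R:=99":
            next_r = 99
        elif action == "Q:=R":
            next_q = r          # reads the CURRENT R, not next_r
    return next_r, next_q


def state_b(r):
    """State B: R := R + 1; the transition tests the current R."""
    final_state = "C" if r < 100 else "D"
    return r + 1, final_state


old_r = 7                                   # stands in for the unknown "?"
r1, q1 = state_a(old_r, 0, ("R:=99", "Q:=R"))
r2, q2 = state_a(old_r, 0, ("Q:=R", "R:=99"))
r_after_b, final_state = state_b(r1)
```

Both orderings give the same result: Q receives the old R, and the test R<100 in state B sees 99.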
Common RTL Design Pitfall Involving Storage Updates
[Timing diagram: clk; states A, B, B2, D; transition conditions R<100 and (R<100)'; R: ?, 99, 100, 100; Q: ?, ?, 99, 99]
RTL Design Involving a Timer
• Commonly need explicit time intervals
– Ex: Repeatedly blink LED on 1 second, off 1 second
• Pre-instantiate timer that HLSM can then use
Data Dominated RTL Design Example: FIR Filter
• FIR filter
– Simply a configurable weighted sum of past input values
– y(t) = c0*x(t) + c1*x(t-1) + c2*x(t-2)
• Above known as “3 tap”
• Tens of taps more common
• Very general filter – User sets the constants (c0, c1, c2) to define specific filter
Inputs: X (12 bits) Outputs: Y (12 bits)
Local storage: xt0, xt1, xt2, c0, c1, c2 (12 bits);
HLSM states: Init, FC
[Figure: Datapath for 3-tap FIR filter: input X, clk, load controls such as xt0_ld, two adders (+), and output register Yreg (Yreg_clr, Yreg_ld) driving Y]
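The 3-tap equation y(t) = c0*x(t) + c1*x(t-1) + c2*x(t-2) translates directly into a short loop over input samples. This sketch assumes the stored past samples start at 0 and ignores the 12-bit output width; the function name is hypothetical, while xt0, xt1, and xt2 follow the local storage names on the slide.

```python
def fir3(samples, c0, c1, c2):
    """Return the filter output for each input sample."""
    xt0 = xt1 = xt2 = 0           # assumed initial contents of the registers
    outputs = []
    for x in samples:
        # Shift the sample registers, as on one clock edge.
        xt0, xt1, xt2 = x, xt0, xt1
        outputs.append(c0 * xt0 + c1 * xt1 + c2 * xt2)
    return outputs
```

Feeding a single 1 followed by zeros returns the constants themselves, which is a quick way to check the tap order:

```python
print(fir3([1, 0, 0, 0], 2, 3, 4))  # [2, 3, 4, 0]
```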
Circuit vs. Microprocessor y(t) = c0*x(t) + c1*x(t-1) + c2*x(t-2)
5.5
Critical Path
• Example shows four paths
– a to c through +: 2 ns
– a to d through + and *: 7 ns
– b to d through + and *: 7 ns
– b to d through *: 5 ns
• Longest path is thus 7 ns
– Max (2,7,7,5) = 7 ns
• Fastest frequency
– 1 / 7 ns = 142 MHz
[Figure: registers a and b feed an adder (+, 2 ns delay) and a multiplier (*, 5 ns delay); results go to registers c and d]
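The critical-path calculation for this example can be written out as a short script. The component delays (2 ns for +, 5 ns for *) come from the slide; the dictionary layout and names are illustrative assumptions.

```python
DELAY_NS = {"+": 2, "*": 5}

# Each path is the list of components a signal passes through.
PATHS = {
    "a to c through +": ["+"],
    "a to d through + and *": ["+", "*"],
    "b to d through + and *": ["+", "*"],
    "b to d through *": ["*"],
}


def path_delay_ns(components):
    """Total delay of one register-to-register path."""
    return sum(DELAY_NS[c] for c in components)


critical_path_ns = max(path_delay_ns(p) for p in PATHS.values())
max_frequency_mhz = 1000 / critical_path_ns   # 1 / (ns) expressed in MHz
```

The frequency is rounded down to 142 MHz on the slide because the clock period must not be shorter than the critical path.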
Critical Path Considering Wire Delays
• Real wires have delay too
– Must include in critical path
• Example shows two paths
– Each is 0.5 + 2 + 0.5 = 3 ns
• Trend
– 1980s/1990s: Wire delays were tiny compared to logic delays
– But wire delays not shrinking as fast as logic delays
• Wire delays may even be greater than logic delays!
• Must also consider register setup and hold times, also add to path
• Then add some time to the computed path, just to be safe
– e.g., if path is 3 ns, say 4 ns instead
[Figure: registers a and b connect through 0.5 ns wires to a 2 ns component, then through a 0.5 ns wire to register c; each path is 3 ns]
A Circuit May Have Numerous Paths
• Paths can exist ...
• ... hundreds or thousands of paths
• ... automatically very helpful
[Figure: circuit with inputs s and a, an 8-bit comparator (<) producing tot_lt_s, and an 8-bit adder]
5.7
Memory Components
• RTL design instantiates datapath components to create datapath, controlled by a controller
– Some components are used outside the controller and DP
• MxN memory
– M words, N bits wide each
• Several varieties of memory, which we now introduce
[Figure: M×N memory: M words, N bits wide each]
Random Access Memory (RAM)
• RAM – Readable and writable memory
– “Random access memory”
• Strange name: created several decades ago to contrast with sequentially-accessed storage like tape drives
– Logically same as register file: memory with address inputs, data inputs/outputs, and control
• RAM usually one port; RF usually two or more
– RAM vs. RF
• RAM typically larger than about 512 or 1024 words
• RAM typically stores bits using a bit storage approach that is more efficient than a flip-flop
• RAM typically implemented on a chip in a square rather than rectangular shape, which keeps longest wires (hence delay) short
[Figure: 16×32 register file from Chpt. 4 with ports W_data (32), W_addr (4), W_en, R_data (32), R_addr (4), R_en]
[Figure: RAM block symbol: 1024 × 32 RAM with data (32), addr (10), rw, en]
RAM Internal Structure
[Figure: RAM internal structure for the 1024x32 RAM (data 32, addr 10, rw, en). Let A = log2M. An AxM decoder with inputs addr0 ... addr(A-1) and enable e (from en and clk) drives word enable lines d0 ... d(M-1); each word is a row of bit storage blocks (aka “cells”) with inputs wdata(N-1) ... wdata0 and outputs rdata(N-1) ... rdata0; rw goes to all cells]
[Figure: Combining rd and wr data lines: the wdata and rdata lines of each bit position merge into a shared data line]
Static RAM (SRAM)
[Figure: RAM internal structure (Let A = log2 M; A×M decoder; word enable lines; rw to all cells) with an SRAM cell connected to lines data and data’ and to word enable]
• “Static” RAM cell
– Reading this cell
• Somewhat trickier
• When rw set to read, the RAM logic sets both data and data’ to 1
Dynamic RAM (DRAM)
[Figure: RAM internal structure with a DRAM cell: one transistor controlled by word enable, connecting the data line to a capacitor that is slowly discharging]
• “Dynamic” RAM cell
– 1 transistor (rather than 6)
– Relies on large capacitor to store bit
• Write: Transistor conducts, data voltage level gets stored on top plate of capacitor
RAM Example: Digital Sound Recorder
[Figure: microphone → analog-to-digital converter (ad_ld, ad_buf) → 16-bit data bus to RAM (addr, data, en, rw); processor drives Ra (12 bits), Rrw, Ren, ad_ld, ad_buf, and da_ld; data bus → digital-to-analog converter (da_ld) → speaker]
• Behavior
– Record: Digitize sound, store as series of 4096 12-bit digital values in RAM
• We’ll use a 4096x16 RAM (12-bit wide RAM not common)
– Play back later
– Common behavior in telephone answering machine, toys, voice recorders
• To record, processor should read a-to-d, store read values into successive RAM words
– To play, processor should read successive RAM words and enable d-to-a
RAM Example: Digital Sound Recorder
• RTL design of processor
– Create HLSM
– Begin with the record behavior
– Create local storage a
• Stores current address, ranges from 0 to 4095 (thus need 12 bits)
– Create state machine that counts from 0 to 4095 using a
• For each a
– Read analog-to-digital conv.
» ad_ld:=‘1’, ad_buf:=‘1’
– Write to RAM at address a
» Rareg:=a, Rrw:=‘1’, Ren:=‘1’
Record behavior
Local register: a, Rareg (12 bits)
S: a:=0
T: ad_ld:=‘1’; ad_buf:=‘1’; Rareg:=a; Rrw:=‘1’; Ren:=‘1’; transition conditions a<4095 and (a<4095)’
U: a:=a+1
[Figure: 4096x16 RAM on a 16-bit data bus with the analog-to-digital converter (ad_ld, ad_buf) and digital-to-analog converter (da_ld); processor drives Ra (12 bits), Rrw, Ren]
RAM Example: Digital Sound Recorder
– Now create play behavior
– Use local register a again, create state machine that counts from 0 to 4095 again
• For each a
– Read RAM
– Write to digital-to-analog conv.
• Note: Must write d-to-a one cycle after reading RAM, when the read data is available on the data bus
– The record and play state machines would be parts of a larger state machine controlled by signals that determine when to record or play
Play behavior
Local register: a, Rareg (12 bits)
V: a:=0
W: ad_buf:=‘0’; Rareg:=a; Rrw:=‘0’; Ren:=‘1’; transition conditions a<4095 and (a<4095)’
X: da_ld:=‘1’; a:=a+1
[Figure: 4096x16 RAM on a 16-bit data bus with the analog-to-digital converter (ad_ld, ad_buf) and digital-to-analog converter (da_ld); processor drives Ra (12 bits), Rrw, Ren]
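The record and play behaviors both step an address register a from 0 to 4095 while writing or reading RAM. A rough software model is sketched below. It assumes the analog-to-digital converter can be represented by a list of samples, does not model the one-cycle delay between reading RAM and loading the d-to-a converter, and uses hypothetical function names.

```python
RAM_WORDS = 4096


def record(adc_samples):
    """Store 4096 12-bit samples into successive RAM words."""
    ram = [0] * RAM_WORDS
    a = 0                                  # state S: a := 0
    while True:
        ram[a] = adc_samples[a] & 0xFFF    # state T: read a-to-d, write RAM[a]
        if a < 4095:
            a += 1                         # state U: a := a + 1
        else:
            break                          # (a<4095)': done
    return ram


def play(ram):
    """Read successive RAM words and pass each to the d-to-a converter."""
    dac_outputs = []
    a = 0                                  # state V: a := 0
    while True:
        word = ram[a]                      # state W: read RAM[a]
        dac_outputs.append(word)           # state X: da_ld := '1'
        if a < 4095:
            a += 1
        else:
            break
    return dac_outputs
```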
Read-Only Memory – ROM
• Memory that can only be read from, not written to
– Data lines are output only
– No need for rw input
[Figure: 1024×32 RAM block symbol with data (32), addr (10), rw, and en]
Read-Only Memory – ROM
[Figure: ROM block symbol: 1024x32 ROM with data (32), addr (10), and en]
[Figure: ROM internal structure. Let A = log2M. An AxM decoder with inputs addr0 ... addr(A-1) and enable e (from en and clk) drives word enable lines d0 ... d(M-1); each word is a row of bit storage blocks (aka “cells”) that drive the data lines]
ROM Types
• If a ROM can only be read, how ... first place?
– ... programming
– Several methods
• Mask-programmed ROM
– ... as 1s ...
[Figure: ROM internal structure (Let A = log2 M; A × M decoder; word enable lines; data(N-1) ... data0) with mask-programmed cells on a 1 data line and a 0 data line]
ROM Types
• Erasable Programmable ROM (EPROM)
– Uses “floating-gate transistor” in each cell
• Electrons become trapped in the gate
• Only done for cells that should store 0
• Other cells (without electrons trapped in ...
[Figure: ROM internal structure (Let A = log2 M; A × M decoder; word enable lines) with floating-gate transistor cells connected to the data lines]
ROM Types
• Electronically-Erasable Programmable ROM
(EEPROM)
– Similar to EPROM
• Uses floating-gate transistor, electronic programming to
trap electrons in certain cells
– But erasing done electronically, not using UV light
– Erasing done one word at a time
• Flash memory
– Like EEPROM, but all words (or large blocks of words) can be erased simultaneously
– Became very common starting in late 1990s
• Both types are in-system programmable
– Can be programmed with new stored bits while in the system in which the ROM operates
• Requires bi-directional data lines, and write control input
• Also need busy output to indicate that erasing is in progress – erasing takes some time
[Figure: 1024x32 EEPROM block symbol with data (32), addr (10), en, write, and busy]
ROM Example: Talking Doll
[Figure: “Hello there!” audio divided into 4096 samples, stored in 4096x16 ROM; processor drives Ra, Ren, and da_ld; the ROM's 16-bit data goes to the digital-to-analog converter and speaker; vibration sensor provides v]
• Doll plays prerecorded message, triggered by vibration
– Message must be stored without power supply → Use a ROM, not a RAM, because ROM is nonvolatile
• And because message will never change, may use a mask-programmed ROM or OTP ROM
– Processor should wait for vibration (v=1), then read words 0 to 4095 from the ROM, writing each to the d-to-a
ROM Example: Talking Doll
Local register: a, Rareg (12 bits)
S: a:=0; transition conditions v’ and v
T: Rareg:=a; Ren:=‘1’; transition conditions a<4095 and (a<4095)’
U: da_ld:=‘1’; a:=a+1
[Figure: 4096x16 ROM with 16-bit data to the digital-to-analog converter; processor with input v and outputs Ra, Ren, da_ld]
• HLSM
– Create state machine that waits for v=1, and then counts from 0 to 4095 using a local storage a
– For each a, read ROM, write to digital-to-analog converter
ROM Example: Digital Telephone Answering Machine Using a Flash Memory
• Want to record the outgoing announcement
– When rec=1, record digitized sound in locations 0 to 4095
– When play=1, play those stored sounds to digital-to-analog converter
• What type of memory?
– Should store without power supply – ROM, not RAM
– Should be in-system programmable – EEPROM or Flash, not EPROM, OTP ROM, or mask-programmed ROM
– Will always erase entire memory when reprogramming – Flash better than EEPROM
[Figure: 4096x16 Flash storing “We’re not home.” with erase, busy, addr, data, rw, and en; processor with inputs rec (record) and play and outputs ad_ld, ad_buf, Ra (12 bits), Rrw, Ren, er, and da_ld, plus input bu; microphone → analog-to-digital converter → 16-bit data bus → digital-to-analog converter → speaker]
ROM Example: Digital Telephone Answering Machine Using a Flash Memory
• HLSM
[Figure: HLSM for the processor with the 4096x16 Flash; includes a transition condition bu=0 and the inputs rec (record) and play]
Blurring of Distinction Between ROM and RAM
• We said that
– RAM is readable and writable
– ROM is read-only
[Figure: memory types arranged between ROM and RAM: EEPROM, Flash, NVRAM]
• But some ROMs act almost like RAMs
– EEPROM and Flash are in-system programmable
• Essentially means that writes are slow
– Also, number of writes may be limited (perhaps a few million times)
• And, some RAMs act almost like ROMs
– Non-volatile RAMs: Can save their data without the power supply
• One type: Built-in battery, may work for up to 10 years
• Another type: Includes ROM backup for RAM – controller writes RAM contents to
ROM before turning off
• New memory technologies evolving that merge RAM and ROM benefits
– e.g., MRAM
• Bottom line
– Lot of choices available to designer, must find best fit with design goals
5.8
Queues (FIFOs)
• A queue is another component sometimes used during RTL design
• Queue: A list written to at the back, and read from the front
– Like a list of waiting restaurant customers
• Writing called a push, reading called a pop
[Figure: queue with back and front; write items to the back of the queue; read (and remove) items from front of the queue]
Queues
• Queue has addresses, and two pointers: rear and front
– Initially both point to 0
• Push (write)
– Item written to address pointed to by rear
– rear incremented
• Pop (read)
– Item read from address pointed to by front
– front incremented
[Figure: 8-word queue, addresses 7 down to 0: initially rear (r) and front (f) both point to 0; after pushing A, A is stored and r has advanced; after pushing B, both B and A are stored and r has advanced again]
Queues
• Treat memory as a circle
– If front or rear reaches 7, next (incremented) value should be 0 rather than 8 (for a queue with addresses 0 to 7)
• Two conditions of interest
– Full queue
• Causes front=rear
– Empty queue – no items
• No pops allowed until a push occurs
• Causes front=rear
[Figure: 8-word queue drawn as a circle, with rear (r) and front (f) pointers]
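Because both the full and the empty condition give front=rear, an implementation needs some extra information to tell them apart. The sketch below assumes one common approach, remembering whether the most recent operation was a push; that choice, and the class and attribute names, are assumptions rather than something stated on the slide.

```python
class CircularQueue:
    """Queue with wrap-around rear and front pointers."""

    def __init__(self, size=8):
        self.size = size
        self.mem = [None] * size
        self.front = 0
        self.rear = 0
        self.last_was_push = False   # assumed way to separate full from empty

    def is_empty(self):
        return self.front == self.rear and not self.last_was_push

    def is_full(self):
        return self.front == self.rear and self.last_was_push

    def push(self, item):
        if self.is_full():
            raise OverflowError("pushing a full queue")
        self.mem[self.rear] = item
        self.rear = (self.rear + 1) % self.size    # 7 wraps to 0
        self.last_was_push = True

    def pop(self):
        if self.is_empty():
            raise IndexError("popping an empty queue")
        item = self.mem[self.front]
        self.front = (self.front + 1) % self.size  # 7 wraps to 0
        self.last_was_push = False
        return item
```

Note that pop only advances front; the stored item stays in memory until a later push overwrites it.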
Queue Implementation
• Can use register file for item storage
• Controller: set control lines for pushes and pops, and also detect full and empty situations
– FSM for controller not shown
[Figure: 8-word 16-bit queue: 8x16 register file (wdata 16, rdata 16); up counters for rear and front (with reset); comparator (=) producing eq; Controller producing full and empty]
Common Uses of a Queue
• Computer keyboard
– Pushes pressed keys onto queue, meanwhile pops and sends to
computer
• Digital video recorder
– Pushes captured frames, meanwhile pops frames, compresses
them, and stores them
• Computer network routers
– Pushes incoming packets onto queue, meanwhile pops packets,
processes destination information, and forwards each packet out
over appropriate port
Queue Usage Example
– Note how rear and front pointers move
– Note that popping doesn’t ...
• Note: pushing a full queue is an error
[Figure: 8-word queue, addresses 7 down to 0, with rear (r) and front (f) pointers, shown after each step]
Initially: r and f point to the same address
1. After pushing 9, 5, 8, 5, 7, 2, 3: contents (address 6 down to 0): 3 2 7 5 8 5 9
3. After pushing 6: contents (address 7 down to 0): 6 3 2 7 5 8 5 9
5. After pushing 4: ERROR! Pushing a full queue results in unknown state.
5.9
Multiple Processors
• Using multiple processors can ease design
– Keeps distinct behaviors separate
– Ex: Laser-based distance measurer with button debounce
[Figure: ButtonDebouncer (Bin from button, Bout) feeds input B of the laser-based distance measurer (L to laser, S from sensor, D (16 bits) to display)]
Interfacing Multiple Processors
• Use signal, register, or other component outside processors
– Known as global
• Common methods use global...
– control signal, data signal, register, register file, queue
• Typically all multiple processors and clocked globals use
same clock
– Synchronized
Ex: Temperature Statistics with Multiple Processors
• 16-bit unsigned input T from temperature sensor, 16-bit output A. Sample T
every 1 second. Compute output A every minute, should equal average of most
recent 64 samples.
• Single HLSM: Complicated
• Instead, two HLSMs (and hence two processors) and shared register file
– Tsample HLSM: Store T into successive RF address, once per sec.
– Avg HLSM: Compute and output average of all 64 RF words, once per min.
– Note that each uses distinct timer
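[Figure: TempStats: input T drives W_d of the shared register file; the Tsample processor drives W_a and W_e; the Avg processor drives R_a and R_e and produces output A]
Keeping the sampling and averaging ...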
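The interaction of the two HLSMs through the shared register file can be sketched as follows. Several details here are assumptions not given on the slide: the write address wraps from 63 back to 0, the register file starts out holding zeros, the average is computed as the sum shifted right by 6 (division by 64), and both timers are driven from a single per-second call. All names are hypothetical.

```python
class TempStats:
    """Two behaviors sharing a 64-word register file."""

    def __init__(self):
        self.rf = [0] * 64      # shared register file
        self.w_a = 0            # Tsample's write address
        self.seconds = 0
        self.A = 0              # output: average of the 64 RF words

    def tick_second(self, t):
        """Advance one second with temperature input t (16-bit unsigned)."""
        # Tsample HLSM: store T into successive RF address, once per second.
        self.rf[self.w_a] = t & 0xFFFF
        self.w_a = (self.w_a + 1) % 64
        self.seconds += 1
        # Avg HLSM: once per minute, average all 64 RF words.
        if self.seconds % 60 == 0:
            self.A = sum(self.rf) >> 6
        return self.A
```

Keeping the two behaviors in separate blocks, with only the register file in common, is what makes each one simple to describe.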
Ex: Digital Camera with Mult. Processors and Queue
• Read and Compress processors (Ch 1)
– Compress may take longer, depends on picture
– Use queue, read can push additional pics (up to 8)
– Likewise, use queue between Compress and Store
[Figure: Image sensor → Read circuit → Queue [8](8) (wdata, wr, full on the write side; rdata, rd, empty on the read side) → Compress circuit → Queue [8](8) → Store circuit → Memory]
5.10
• Hierarchy
– Organization with few items at the top, with each item decomposed into other items
– Common example: Country
• 1 item at top (the country)
• Country item decomposed into state/province items
• Each state/province item decomposed into city items
• Hierarchy helps us manage complexity
– To go from transistors to gates, muxes, decoders, registers, ALUs, controllers, datapaths, memories, queues, etc.
– Imagine trying to comprehend a controller and datapath at the level of gates
[Figure: map of Country A with Province 1, Province 2, Province 3 and cities (CityB, CityC, CityE, CityF, CityG)]
[Figure: map of Country A showing just top two levels of hierarchy: Province 1, Province 2, Province 3]
Hierarchy and Abstraction
• Abstraction
– Hierarchy often involves not just grouping items into a new item, but also associating higher-level behavior with the new item, known as abstraction
• Ex: 8-bit adder has understandable high-level behavior: adds two 8-bit binary numbers
– Frees designer from having to remember, or even understand, the lower-level details
[Figure: 8-bit adder block with inputs a7..a0, b7..b0, ci and outputs co, s7..s0]
Hierarchy and Composing Larger Components from Smaller Versions
Hierarchy and Composing Larger Components from Smaller Versions
[Figure: 1024x32 ROM with 10-bit address input and 32-bit output data(31..0)]
Hierarchy and Composing Larger Components from Smaller Versions
• Creating memory with more words
– Put memories on top of one another until the number of desired words is achieved
– Use decoder to select among the memories
• Can use highest order address input(s) as decoder input
• Although actually, any address line could be used
– Example: Compose 1024x8 memories into 2048x8 memory
[Figure: 2048x8 ROM (11-bit addr, en, 8-bit data) built from two 1024x8 ROMs: a9..a0 go to both addr inputs; a10 drives input i0 of a 1x2 decoder (dcd) enabled by en, whose outputs d0 and d1 drive the ROMs' en inputs; the 8-bit data outputs are shared]
a10 just chooses which memory to access:
a10  a9 ... a0     memory accessed
 0   0000000000    first 1024x8 ROM
 0   0000000001
 0   0000000010
 ...
 0   1111111110
 0   1111111111
 1   0000000000    second 1024x8 ROM
 1   0000000001
 1   0000000010
 ...
 1   1111111110
 1   1111111111
To create memory with more words and wider words, can first compose to enough words, then widen.
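The composition of two 1024x8 ROMs into a 2048x8 ROM can be modeled to show how a10 selects a memory through the decoder. This is a sketch: the class names are hypothetical, and a disabled ROM is modeled as returning None to stand for an undriven data output.

```python
class Rom1024x8:
    """1024-word, 8-bit ROM with an enable input."""

    def __init__(self, contents):
        if len(contents) != 1024:
            raise ValueError("expected 1024 words")
        self.words = [w & 0xFF for w in contents]

    def read(self, addr, en):
        return self.words[addr & 0x3FF] if en else None


class Rom2048x8:
    """Two 1024x8 ROMs stacked; a10 picks one through a 1x2 decoder."""

    def __init__(self, low, high):
        self.roms = (low, high)

    def read(self, addr, en=1):
        a10 = (addr >> 10) & 1
        # 1x2 decoder: d0 enables the first ROM, d1 the second.
        d = [bool(en) and a10 == 0, bool(en) and a10 == 1]
        outputs = [rom.read(addr & 0x3FF, d[k]) for k, rom in enumerate(self.roms)]
        # At most one ROM is enabled, so at most one output is driven.
        return next((out for out in outputs if out is not None), None)
```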
Chapter Summary
– Modern digital design involves creating processor-level components
– High-level state machines
– RTL design process
• 1. Capture behavior: Use HLSM
• 2. Convert to circuit
– A. Create datapath B. Connect DP to controller C. Derive controller FSM
– More RTL design
• More components, arrays, timers, control vs. data dominated
– Determining fastest clock frequency
• By finding critical path
– Behavioral-level design – C to gates
• By using method to convert C (subset) to high-level state machine
– Memory components (RAM, ROM)
– Queues
– Multiple processors
– Hierarchy: A key concept used throughout Chapters 2-5