DDCArv Ch7

Chapter 7 discusses the microarchitecture of multicycle RISC-V processors, comparing single-cycle and multicycle designs. It highlights the advantages of multicycle processors, including higher clock speeds and hardware reuse, while detailing the steps involved in instruction fetching, operand reading, memory addressing, and data writing. The chapter also covers the control mechanisms and finite state machine (FSM) involved in executing various instructions.


Digital Design & Computer Architecture
Sarah Harris & David Harris

Chapter 7: Microarchitecture

Multicycle RISC-V Processor
Single- vs. Multicycle Processor
• Single-cycle:
+ simple
- cycle time limited by longest instruction (lw)
- separate memories for instruction and data
- 3 adders/ALUs
• Multicycle processor addresses these issues by
breaking instruction into shorter steps
o shorter instructions take fewer steps
o can re-use hardware
o cycle time is faster

Digital Design & Computer Architecture Microarchitecture


Single- vs. Multicycle Processor
• Single-cycle:
+ simple
- cycle time limited by longest instruction (lw)
- separate memories for instruction and data
- 3 adders/ALUs
• Multicycle:
+ higher clock speed
+ simpler instructions run faster
+ reuse expensive hardware on multiple cycles
- sequencing overhead paid many times
• Same design steps as single-cycle:
o first datapath
o then control




Multicycle State Elements
Replace separate Instruction and Data
memories with a single unified memory –
more realistic
[Figure: multicycle state elements: PC register (PCNext in, PC out, enable EN), unified Instr / Data Memory (ports A, RD, WD, WE), and Register File (ports A1, A2, A3, WD3, RD1, RD2, WE3)]


Multicycle Datapath: Instruction Fetch
STEP 1: Fetch instruction

[Figure: instruction fetch datapath. PC drives memory address A; the memory output RD is captured in the Instr register, enabled by IRWrite]


Multicycle Datapath: lw Get Sources
STEP 2: Read source operand from RF and
extend immediate

[Figure: datapath for reading sources. Instr 19:15 (Rs1) drives register file port A1 and RD1 is captured in register A; Instr 31:7 feeds the Extend unit, controlled by ImmSrc1:0, which produces ImmExt]
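The Extend unit in STEP 2 can be sketched in software. The following is a minimal sketch, not the slides' hardware description: it assumes the ImmSrc encoding used later in this chapter (00 for lw, 01 for sw, 10 for beq, 11 for jal) and the standard RISC-V I-, S-, B-, and J-type immediate layouts. The function names `sign_extend` and `extend` are illustrative.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def extend(instr: int, imm_src: int) -> int:
    """Return ImmExt for a 32-bit instruction word (assumed ImmSrc encoding)."""
    if imm_src == 0b00:  # I-type: imm[11:0] = Instr[31:20]
        return sign_extend(instr >> 20, 12)
    if imm_src == 0b01:  # S-type: imm[11:5] = Instr[31:25], imm[4:0] = Instr[11:7]
        imm = (((instr >> 25) & 0x7F) << 5) | ((instr >> 7) & 0x1F)
        return sign_extend(imm, 12)
    if imm_src == 0b10:  # B-type: imm[12|10:5] = Instr[31:25], imm[4:1|11] = Instr[11:7]
        imm = ((((instr >> 31) & 1) << 12) | (((instr >> 7) & 1) << 11)
               | (((instr >> 25) & 0x3F) << 5) | (((instr >> 8) & 0xF) << 1))
        return sign_extend(imm, 13)
    if imm_src == 0b11:  # J-type: imm[20|10:1|11|19:12] = Instr[31:12]
        imm = ((((instr >> 31) & 1) << 20) | (((instr >> 12) & 0xFF) << 12)
               | (((instr >> 20) & 1) << 11) | (((instr >> 21) & 0x3FF) << 1))
        return sign_extend(imm, 21)
    raise ValueError("ImmSrc must be a 2-bit value")


# lw x6, -4(x9) encodes as 0xFFC4A303, so the I-type immediate is -4.
lw_imm = extend(0xFFC4A303, 0b00)
```

Only bits 31:7 of the instruction are examined, which matches the 31:7 input of the Extend unit in the datapath.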


Multicycle Datapath: lw Address
STEP 3: Compute the memory address

[Figure: datapath for address computation. The ALU, controlled by ALUControl2:0, adds SrcA (register A) and SrcB (ImmExt); ALUResult is captured in the ALUOut register]


Multicycle Datapath: lw Memory Read
STEP 4: Read data from memory

[Figure: datapath for memory read. The AdrSrc multiplexer selects the memory address Adr (0 = PC, 1 = the computed address from ALUOut); ReadData is captured in the Data register]


Multicycle Datapath: lw Write Register
STEP 5: Write data back to register file

[Figure: datapath for register write-back. The ResultSrc1:0 multiplexer selects Result (00 = ALUOut, 01 = Data); Result drives WD3, Instr 11:7 (Rd) drives A3, and RegWrite enables the write]


Multicycle Datapath: Increment PC
STEP 6: Increment PC: PC = PC+4

[Figure: datapath for incrementing the PC. ALUSrcA1:0 selects SrcA (00 = PC, 10 = A) and ALUSrcB1:0 selects SrcB (01 = ImmExt, 10 = 4). ResultSrc = 10 routes ALUResult to Result, which drives PCNext; PCWrite enables the PC register]
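The six lw steps above can be walked through in software. This is a rough register-transfer sketch under simplifying assumptions: memory is a Python dict keyed by byte address, registers are a 32-entry list, and each step runs as one statement rather than one clock cycle. The names `run_lw` and `sign_extend` are illustrative.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def run_lw(mem: dict, regs: list, pc: int) -> int:
    """Execute one lw located at mem[pc]; return the next PC."""
    instr = mem[pc]                          # STEP 1: Instr <- Mem[PC]
    a = regs[(instr >> 19 - 4) & 0x1F]       # STEP 2: A <- RF[Instr 19:15]
    imm_ext = sign_extend(instr >> 20, 12)   #         ImmExt <- extend(Instr 31:20)
    alu_out = (a + imm_ext) & 0xFFFFFFFF     # STEP 3: ALUOut <- A + ImmExt
    data = mem[alu_out]                      # STEP 4: Data <- Mem[ALUOut]
    regs[(instr >> 7) & 0x1F] = data         # STEP 5: RF[Instr 11:7] <- Data
    return (pc + 4) & 0xFFFFFFFF             # STEP 6: PC <- PC + 4


# lw x6, -4(x9) at address 0x1000, with x9 = 0x2004 and Mem[0x2000] = 10.
mem = {0x1000: 0xFFC4A303, 0x2000: 10}
regs = [0] * 32
regs[9] = 0x2004
next_pc = run_lw(mem, regs, 0x1000)
```

After the call, x6 holds the loaded word and `next_pc` points at the following instruction.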


Chapter 7: Microarchitecture

Multicycle Datapath: Other Instructions
Multicycle Datapath: sw
Write data in rs2 to memory

[Figure: datapath for sw. Instr 24:20 (Rs2) drives register file port A2; RD2 is captured as WriteData, which drives the memory WD input. MemWrite enables the memory write at address Adr]


Multicycle Datapath: beq
Calculate branch target address:
BTA = PC + imm
[Figure: datapath for beq. The OldPC register holds the PC of the current instruction and feeds the SrcA multiplexer (ALUSrcA = 01); the ALU Zero output goes to the Control Unit]

The PC is updated in the Fetch step, so the old (current) PC must be saved in OldPC to compute the branch target.
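A short numeric sketch shows why OldPC is needed. It assumes 32-bit wrap-around addresses; the function name `branch_target` and the example addresses are illustrative.

```python
def branch_target(old_pc: int, imm: int) -> int:
    """BTA = (PC of the branch itself) + imm, wrapped to 32 bits."""
    return (old_pc + imm) & 0xFFFFFFFF


# A beq at address 0x1000 with a branch offset of 16 bytes.
old_pc = 0x1000                           # saved when the instruction is fetched
pc = old_pc + 4                           # PC has already advanced in Fetch
correct = branch_target(old_pc, 16)       # 0x1010
off_by_four = branch_target(pc, 16)       # 0x1014: what using the updated PC would give
```

Using the already-incremented PC would land four bytes past the intended target, which is why the datapath adds the OldPC register.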


Multicycle RISC-V Processor
[Figure: complete multicycle RISC-V processor (datapath plus Control Unit). Control Unit inputs: op (Instr 6:0), funct3 (Instr 14:12), funct7 bit 5 (Instr 30), and Zero. Control Unit outputs: PCWrite, AdrSrc, MemWrite, IRWrite, ResultSrc1:0, ALUControl2:0, ALUSrcB1:0, ALUSrcA1:0, ImmSrc1:0, RegWrite]


Chapter 7: Microarchitecture

Multicycle Control
Multicycle Control
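High-level view: the Control Unit takes op (Instr 6:0), funct3 (Instr 14:12), funct7 bit 5 (Instr 30), and Zero, and produces PCWrite, AdrSrc, MemWrite, IRWrite, ResultSrc1:0, ALUControl2:0, ALUSrcB1:0, ALUSrcA1:0, ImmSrc1:0, and RegWrite.

Low-level view: the Control Unit consists of three blocks.
• Main FSM: input op6:0; outputs Branch, PCUpdate, RegWrite, MemWrite, IRWrite, ResultSrc1:0, ALUSrcB1:0, ALUSrcA1:0, AdrSrc, ALUOp1:0. PCWrite is generated from PCUpdate, Branch, and Zero.
• ALU Decoder (same as single-cycle): inputs ALUOp1:0, op bit 5, funct3 (2:0), funct7 bit 5; output ALUControl2:0.
• Instr Decoder: input op6:0; output ImmSrc1:0.

[Figure: high-level and low-level views of the multicycle Control Unit]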


Multicycle Control: Instruction Decoder
op    Instruction    ImmSrc
3     lw             00
35    sw             01
51    R-type         XX
99    beq            10

[Figure: Instr Decoder block, input op6:0, output ImmSrc1:0]
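The Instruction Decoder table amounts to a lookup on the opcode. The sketch below is one way to write that lookup; it covers only the four rows in the table, and it represents the don't-care entry (XX) as `None`, which is a choice made here for illustration.

```python
# op (decimal) -> ImmSrc, taken from the table above; None stands for XX (don't care).
IMM_SRC = {
    3: 0b00,    # lw
    35: 0b01,   # sw
    51: None,   # R-type: no immediate is used
    99: 0b10,   # beq
}


def instr_decoder(op: int):
    """Return ImmSrc1:0 for a 7-bit opcode, or None when it is a don't care."""
    if op not in IMM_SRC:
        raise ValueError(f"opcode {op:#09b} is not in the table")
    return IMM_SRC[op]


imm_src_for_sw = instr_decoder(0b0100011)   # 35 decimal
```

Because ImmSrc depends only on op, this decoder is purely combinational and sits outside the Main FSM.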


Multicycle Control: Main FSM
To declutter the FSM:
• Write enable signals (RegWrite, MemWrite, IRWrite, PCUpdate, and Branch) are 0 if not listed in a state.
• Other signals are don't care if not listed in a state.

[Figure: Main FSM block. Input: op6:0. Outputs: Branch, PCUpdate, RegWrite, MemWrite, IRWrite, ResultSrc1:0, ALUSrcB1:0, ALUSrcA1:0, AdrSrc, ALUOp1:0. PCWrite is generated from PCUpdate, Branch, and Zero]
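The two decluttering conventions can be captured in a small helper. This sketch is illustrative only: `expand_state` and `pc_write` are names introduced here, `None` stands for don't care, and the PCWrite equation (PCUpdate OR (Branch AND Zero)) is an assumption that matches the "1 if taken" behaviour the slides give for beq.

```python
WRITE_ENABLES = ("RegWrite", "MemWrite", "IRWrite", "PCUpdate", "Branch")
OTHER_SIGNALS = ("ResultSrc", "ALUSrcB", "ALUSrcA", "AdrSrc", "ALUOp")


def expand_state(listed: dict) -> dict:
    """Fill in every signal that a state bubble leaves unlisted."""
    outputs = {name: 0 for name in WRITE_ENABLES}           # unlisted enables are 0
    outputs.update({name: None for name in OTHER_SIGNALS})  # unlisted others: don't care
    outputs.update(listed)                                  # listed values win
    return outputs


def pc_write(pc_update: int, branch: int, zero: int) -> int:
    """Assumed gate logic: PCWrite = PCUpdate OR (Branch AND Zero)."""
    return pc_update | (branch & zero)


# S3: MemRead lists only ResultSrc = 00 and AdrSrc = 1.
mem_read = expand_state({"ResultSrc": 0b00, "AdrSrc": 1})
```

Defaulting the enables to 0 is what keeps a state from accidentally writing the register file, memory, instruction register, or PC.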


Main FSM: Fetch
Reset: go to S0
S0: Fetch: AdrSrc = 0, IRWrite

Control signals in S0: Fetch:
PCWrite = 0, AdrSrc = 0, MemWrite = 0, IRWrite = 1, RegWrite = 0, ImmSrc = xx, ALUSrcA = xx, ALUSrcB = xx, ALUControl = xxx, ResultSrc = xx

[Figure: multicycle datapath and Control Unit for state S0: Fetch]


Main FSM: Decode
Reset: go to S0
S0: Fetch: AdrSrc = 0, IRWrite; next S1
S1: Decode: no control signals listed

Recall that ImmSrc is determined by the Instruction Decoder.

Control signals in S1: Decode:
PCWrite = 0, AdrSrc = x, MemWrite = 0, IRWrite = 0, RegWrite = 0, ImmSrc = 00, ALUSrcA = xx, ALUSrcB = xx, ALUControl = xxx, ResultSrc = xx

[Figure: multicycle datapath and Control Unit for state S1: Decode]


Main FSM: Address
Reset: go to S0
S0: Fetch: AdrSrc = 0, IRWrite; next S1
S1: Decode: if op = 0000011 (lw), next S2
S2: MemAdr: ALUSrcA = 10, ALUSrcB = 01, ALUOp = 00

Control signals in S2: MemAdr:
PCWrite = 0, AdrSrc = x, MemWrite = 0, IRWrite = 0, RegWrite = 0, ImmSrc = 00, ALUSrcA = 10, ALUSrcB = 01, ALUControl = 000, ResultSrc = xx

[Figure: multicycle datapath and Control Unit for state S2: MemAdr]


Main FSM: Read Memory
Reset: go to S0
S0: Fetch: AdrSrc = 0, IRWrite; next S1
S1: Decode: if op = 0000011 (lw), next S2
S2: MemAdr: ALUSrcA = 10, ALUSrcB = 01, ALUOp = 00; if op = 0000011 (lw), next S3
S3: MemRead: ResultSrc = 00, AdrSrc = 1


Main FSM: Read Memory Datapath
S3: MemRead: ResultSrc = 00, AdrSrc = 1

Control signals in S3: MemRead:
PCWrite = 0, AdrSrc = 1, MemWrite = 0, IRWrite = 0, RegWrite = 0, ImmSrc = 00, ALUSrcA = xx, ALUSrcB = xx, ALUControl = xxx, ResultSrc = 00

[Figure: multicycle datapath and Control Unit for state S3: MemRead]


Main FSM: Write RF
Reset: go to S0
S0: Fetch: AdrSrc = 0, IRWrite; next S1
S1: Decode: if op = 0000011 (lw), next S2
S2: MemAdr: ALUSrcA = 10, ALUSrcB = 01, ALUOp = 00; if op = 0000011 (lw), next S3
S3: MemRead: ResultSrc = 00, AdrSrc = 1; next S4
S4: MemWB: ResultSrc = 01, RegWrite


Main FSM: Write RF Datapath
S4: MemWB: ResultSrc = 01, RegWrite

Control signals in S4: MemWB:
PCWrite = 0, AdrSrc = x, MemWrite = 0, IRWrite = 0, RegWrite = 1, ImmSrc = 00, ALUSrcA = xx, ALUSrcB = xx, ALUControl = xxx, ResultSrc = 01

[Figure: multicycle datapath and Control Unit for state S4: MemWB]


Main FSM: Fetch Revisited
Calculate PC+4 during the Fetch stage (the ALU isn't otherwise being used).

Reset: go to S0
S0: Fetch: AdrSrc = 0, IRWrite, ALUSrcA = 00, ALUSrcB = 10, ALUOp = 00, ResultSrc = 10, PCUpdate; next S1
S1: Decode: if op = 0000011 (lw), next S2
S2: MemAdr: ALUSrcA = 10, ALUSrcB = 01, ALUOp = 00; if op = 0000011 (lw), next S3
S3: MemRead: ResultSrc = 00, AdrSrc = 1; next S4
S4: MemWB: ResultSrc = 01, RegWrite; next S0


Main FSM: Fetch (PC+4) Datapath
Fetch Instruction and Increment PC

Reset: go to S0
S0: Fetch: AdrSrc = 0, IRWrite, ALUSrcA = 00, ALUSrcB = 10, ALUOp = 00, ResultSrc = 10, PCUpdate

Control signals in S0: Fetch:
PCWrite = 1, AdrSrc = 0, MemWrite = 0, IRWrite = 1, RegWrite = 0, ImmSrc = xx, ALUSrcA = 00, ALUSrcB = 10, ALUControl = 000, ResultSrc = 10

[Figure: multicycle datapath and Control Unit for state S0: Fetch, including the PC+4 calculation]


Chapter 7: Microarchitecture

Multicycle Control: Other Instructions
Main FSM: sw
Reset: go to S0
S0: Fetch: AdrSrc = 0, IRWrite, ALUSrcA = 00, ALUSrcB = 10, ALUOp = 00, ResultSrc = 10, PCUpdate; next S1
S1: Decode: if op = 0000011 (lw) OR op = 0100011 (sw), next S2
S2: MemAdr: ALUSrcA = 10, ALUSrcB = 01, ALUOp = 00; if op = 0000011 (lw), next S3; if op = 0100011 (sw), next S5
S3: MemRead: ResultSrc = 00, AdrSrc = 1; next S4
S4: MemWB: ResultSrc = 01, RegWrite; next S0
S5: MemWrite: ResultSrc = 00, AdrSrc = 1, MemWrite; next S0


Main FSM: sw Datapath
S5: MemWrite: ResultSrc = 00, AdrSrc = 1, MemWrite

Control signals in S5: MemWrite:
PCWrite = 0, AdrSrc = 1, MemWrite = 1, IRWrite = 0, RegWrite = 0, ImmSrc = 01, ALUSrcA = xx, ALUSrcB = xx, ALUControl = xxx, ResultSrc = 00

[Figure: multicycle datapath and Control Unit for state S5: MemWrite]


Main FSM: R-Type Execute
Reset: go to S0
S0: Fetch: AdrSrc = 0, IRWrite, ALUSrcA = 00, ALUSrcB = 10, ALUOp = 00, ResultSrc = 10, PCUpdate; next S1
S1: Decode: if op = 0000011 (lw) OR op = 0100011 (sw), next S2; if op = 0110011 (R-type), next S6
S2: MemAdr: ALUSrcA = 10, ALUSrcB = 01, ALUOp = 00; if op = 0000011 (lw), next S3; if op = 0100011 (sw), next S5
S3: MemRead: ResultSrc = 00, AdrSrc = 1; next S4
S4: MemWB: ResultSrc = 01, RegWrite; next S0
S5: MemWrite: ResultSrc = 00, AdrSrc = 1, MemWrite; next S0
S6: ExecuteR: ALUSrcA = 10, ALUSrcB = 00, ALUOp = 10


Main FSM: R-Type Execute Datapath
S6: ExecuteR: ALUSrcA = 10, ALUSrcB = 00, ALUOp = 10

Control signals in S6: ExecuteR:
PCWrite = 0, AdrSrc = x, MemWrite = 0, IRWrite = 0, RegWrite = 0, ImmSrc = xx, ALUSrcA = 10, ALUSrcB = 00, ALUControl = varies, ResultSrc = xx

[Figure: multicycle datapath and Control Unit for state S6: ExecuteR]


Main FSM: R-Type ALU Write Back
Reset: go to S0
S0: Fetch: AdrSrc = 0, IRWrite, ALUSrcA = 00, ALUSrcB = 10, ALUOp = 00, ResultSrc = 10, PCUpdate; next S1
S1: Decode: if op = 0000011 (lw) OR op = 0100011 (sw), next S2; if op = 0110011 (R-type), next S6
S2: MemAdr: ALUSrcA = 10, ALUSrcB = 01, ALUOp = 00; if op = 0000011 (lw), next S3; if op = 0100011 (sw), next S5
S3: MemRead: ResultSrc = 00, AdrSrc = 1; next S4
S4: MemWB: ResultSrc = 01, RegWrite; next S0
S5: MemWrite: ResultSrc = 00, AdrSrc = 1, MemWrite; next S0
S6: ExecuteR: ALUSrcA = 10, ALUSrcB = 00, ALUOp = 10; next S7
S7: ALUWB: ResultSrc = 00, RegWrite; next S0


Main FSM: R-Type ALU Write Back
S7: ALUWB: ResultSrc = 00, RegWrite

Control signals in S7: ALUWB:
PCWrite = 0, AdrSrc = x, MemWrite = 0, IRWrite = 0, RegWrite = 1, ImmSrc = xx, ALUSrcA = xx, ALUSrcB = xx, ALUControl = xxx, ResultSrc = 00

[Figure: multicycle datapath and Control Unit for state S7: ALUWB]


Main FSM: beq
• Need to calculate:
• Branch Target Address
• rs1 - rs2 (to see if equal)
• ALU isn’t being used in Decode stage
• Use it to calculate Target Address (PC + imm)



Main FSM: Decode Revisited
S1: Decode now reads the registers and calculates the target address (PC+imm).

Reset: go to S0
S0: Fetch: AdrSrc = 0, IRWrite, ALUSrcA = 00, ALUSrcB = 10, ALUOp = 00, ResultSrc = 10, PCUpdate; next S1
S1: Decode: ALUSrcA = 01, ALUSrcB = 01, ALUOp = 00; if op = 0000011 (lw) OR op = 0100011 (sw), next S2; if op = 0110011 (R-type), next S6
S2: MemAdr: ALUSrcA = 10, ALUSrcB = 01, ALUOp = 00; if op = 0000011 (lw), next S3; if op = 0100011 (sw), next S5
S3: MemRead: ResultSrc = 00, AdrSrc = 1; next S4
S4: MemWB: ResultSrc = 01, RegWrite; next S0
S5: MemWrite: ResultSrc = 00, AdrSrc = 1, MemWrite; next S0
S6: ExecuteR: ALUSrcA = 10, ALUSrcB = 00, ALUOp = 10; next S7
S7: ALUWB: ResultSrc = 00, RegWrite; next S0


Main FSM: Decode (Target Address)
Read Registers and Calculate Target Address (PC+imm)

S1: Decode: ALUSrcA = 01, ALUSrcB = 01, ALUOp = 00

Control signals in S1: Decode:
PCWrite = 0, AdrSrc = X, MemWrite = 0, IRWrite = 0, RegWrite = 0, ImmSrc = varies, ALUSrcA = 01, ALUSrcB = 01, ALUControl = 000, ResultSrc = XX

[Figure: multicycle datapath and Control Unit for state S1: Decode, including the target address calculation]


Main FSM: beq
Reset: go to S0
S0: Fetch: AdrSrc = 0, IRWrite, ALUSrcA = 00, ALUSrcB = 10, ALUOp = 00, ResultSrc = 10, PCUpdate; next S1
S1: Decode: ALUSrcA = 01, ALUSrcB = 01, ALUOp = 00; if op = 0000011 (lw) OR op = 0100011 (sw), next S2; if op = 0110011 (R-type), next S6; if op = 1100011 (beq), next S10
S2: MemAdr: ALUSrcA = 10, ALUSrcB = 01, ALUOp = 00; if op = 0000011 (lw), next S3; if op = 0100011 (sw), next S5
S3: MemRead: ResultSrc = 00, AdrSrc = 1; next S4
S4: MemWB: ResultSrc = 01, RegWrite; next S0
S5: MemWrite: ResultSrc = 00, AdrSrc = 1, MemWrite; next S0
S6: ExecuteR: ALUSrcA = 10, ALUSrcB = 00, ALUOp = 10; next S7
S7: ALUWB: ResultSrc = 00, RegWrite; next S0
S10: BEQ: ALUSrcA = 10, ALUSrcB = 00, ALUOp = 01, ResultSrc = 00, Branch; next S0


Main FSM: beq Datapath
Compare registers and send Target PC (ALUOut) to PCNext

S10: BEQ: ALUSrcA = 10, ALUSrcB = 00, ALUOp = 01, ResultSrc = 00, Branch

Control signals in S10: BEQ:
PCWrite = 1 if taken, AdrSrc = X, MemWrite = 0, IRWrite = 0, RegWrite = 0, ImmSrc = 10, ALUSrcA = 10, ALUSrcB = 00, ALUControl = 001, ResultSrc = 00

[Figure: multicycle datapath and Control Unit for state S10: BEQ]


Chapter 7: Microarchitecture

Extending the RISC-V Multicycle Processor
Main FSM: I-Type ALU Execute
Reset: go to S0
S0: Fetch: AdrSrc = 0, IRWrite, ALUSrcA = 00, ALUSrcB = 10, ALUOp = 00, ResultSrc = 10, PCUpdate; next S1
S1: Decode: ALUSrcA = 01, ALUSrcB = 01, ALUOp = 00; if op = 0000011 (lw) OR op = 0100011 (sw), next S2; if op = 0110011 (R-type), next S6; if op = 0010011 (I-type ALU), next S8; if op = 1100011 (beq), next S10
S2: MemAdr: ALUSrcA = 10, ALUSrcB = 01, ALUOp = 00; if op = 0000011 (lw), next S3; if op = 0100011 (sw), next S5
S3: MemRead: ResultSrc = 00, AdrSrc = 1; next S4
S4: MemWB: ResultSrc = 01, RegWrite; next S0
S5: MemWrite: ResultSrc = 00, AdrSrc = 1, MemWrite; next S0
S6: ExecuteR: ALUSrcA = 10, ALUSrcB = 00, ALUOp = 10; next S7
S7: ALUWB: ResultSrc = 00, RegWrite; next S0
S8: ExecuteI: ALUSrcA = 10, ALUSrcB = 01, ALUOp = 10; next S7
S10: BEQ: ALUSrcA = 10, ALUSrcB = 00, ALUOp = 01, ResultSrc = 00, Branch; next S0


Main FSM: I-Type ALU Exec. Datapath
S8: ExecuteI: ALUSrcA = 10, ALUSrcB = 01, ALUOp = 10

Control signals in S8: ExecuteI:
PCWrite = 0, AdrSrc = x, MemWrite = 0, IRWrite = 0, RegWrite = 0, ImmSrc = 00 (I-type immediate, since ALUSrcB = 01 selects ImmExt), ALUSrcA = 10, ALUSrcB = 01, ALUControl = varies, ResultSrc = xx

[Figure: multicycle datapath and Control Unit for state S8: ExecuteI]


Main FSM: jal
Reset: go to S0
S0: Fetch: AdrSrc = 0, IRWrite, ALUSrcA = 00, ALUSrcB = 10, ALUOp = 00, ResultSrc = 10, PCUpdate; next S1
S1: Decode: ALUSrcA = 01, ALUSrcB = 01, ALUOp = 00; if op = 0000011 (lw) OR op = 0100011 (sw), next S2; if op = 0110011 (R-type), next S6; if op = 0010011 (I-type ALU), next S8; if op = 1101111 (jal), next S9; if op = 1100011 (beq), next S10
S2: MemAdr: ALUSrcA = 10, ALUSrcB = 01, ALUOp = 00; if op = 0000011 (lw), next S3; if op = 0100011 (sw), next S5
S3: MemRead: ResultSrc = 00, AdrSrc = 1; next S4
S4: MemWB: ResultSrc = 01, RegWrite; next S0
S5: MemWrite: ResultSrc = 00, AdrSrc = 1, MemWrite; next S0
S6: ExecuteR: ALUSrcA = 10, ALUSrcB = 00, ALUOp = 10; next S7
S7: ALUWB: ResultSrc = 00, RegWrite; next S0
S8: ExecuteI: ALUSrcA = 10, ALUSrcB = 01, ALUOp = 10; next S7
S9: JAL: ALUSrcA = 01, ALUSrcB = 10, ALUOp = 00, ResultSrc = 00, PCUpdate; next S7
S10: BEQ: ALUSrcA = 10, ALUSrcB = 00, ALUOp = 01, ResultSrc = 00, Branch; next S0


Main FSM: jal Datapath
Calculate PC + 4 and send Target Address (ALUOut) to PCNext

S9: JAL: ALUSrcA = 01, ALUSrcB = 10, ALUOp = 00, ResultSrc = 00, PCUpdate

Control signals in S9: JAL:
PCWrite = 1, AdrSrc = x, MemWrite = 0, IRWrite = 0, RegWrite = 0, ImmSrc = 11, ALUSrcA = 01, ALUSrcB = 10, ALUControl = 000, ResultSrc = 00

[Figure: multicycle datapath and Control Unit for state S9: JAL]


Main FSM: jal
PC + 4 is written to rd in S7: ALUWB.

Reset: go to S0
S0: Fetch: AdrSrc = 0, IRWrite, ALUSrcA = 00, ALUSrcB = 10, ALUOp = 00, ResultSrc = 10, PCUpdate; next S1
S1: Decode: ALUSrcA = 01, ALUSrcB = 01, ALUOp = 00; if op = 0000011 (lw) OR op = 0100011 (sw), next S2; if op = 0110011 (R-type), next S6; if op = 0010011 (I-type ALU), next S8; if op = 1101111 (jal), next S9; if op = 1100011 (beq), next S10
S2: MemAdr: ALUSrcA = 10, ALUSrcB = 01, ALUOp = 00; if op = 0000011 (lw), next S3; if op = 0100011 (sw), next S5
S3: MemRead: ResultSrc = 00, AdrSrc = 1; next S4
S4: MemWB: ResultSrc = 01, RegWrite; next S0
S5: MemWrite: ResultSrc = 00, AdrSrc = 1, MemWrite; next S0
S6: ExecuteR: ALUSrcA = 10, ALUSrcB = 00, ALUOp = 10; next S7
S7: ALUWB: ResultSrc = 00, RegWrite; next S0
S8: ExecuteI: ALUSrcA = 10, ALUSrcB = 01, ALUOp = 10; next S7
S9: JAL: ALUSrcA = 01, ALUSrcB = 10, ALUOp = 00, ResultSrc = 00, PCUpdate; next S7
S10: BEQ: ALUSrcA = 10, ALUSrcB = 00, ALUOp = 01, ResultSrc = 00, Branch; next S0


Multicycle Processor Main FSM
State      Datapath µOp
Fetch      Instr ← Mem[PC]; PC ← PC+4
Decode     ALUOut ← PCTarget
MemAdr     ALUOut ← rs1 + imm
MemRead    Data ← Mem[ALUOut]
MemWB      rd ← Data
MemWrite   Mem[ALUOut] ← rs2
ExecuteR   ALUOut ← rs1 op rs2
ExecuteI   ALUOut ← rs1 op imm
ALUWB      rd ← ALUOut
BEQ        ALUResult = rs1-rs2; if Zero, PC ← ALUOut
JAL        PC ← ALUOut; ALUOut ← PC+4

Reset: go to S0
S0: Fetch: AdrSrc = 0, IRWrite, ALUSrcA = 00, ALUSrcB = 10, ALUOp = 00, ResultSrc = 10, PCUpdate; next S1
S1: Decode: ALUSrcA = 01, ALUSrcB = 01, ALUOp = 00; if op = 0000011 (lw) OR op = 0100011 (sw), next S2; if op = 0110011 (R-type), next S6; if op = 0010011 (I-type ALU), next S8; if op = 1101111 (jal), next S9; if op = 1100011 (beq), next S10
S2: MemAdr: ALUSrcA = 10, ALUSrcB = 01, ALUOp = 00; if op = 0000011 (lw), next S3; if op = 0100011 (sw), next S5
S3: MemRead: ResultSrc = 00, AdrSrc = 1; next S4
S4: MemWB: ResultSrc = 01, RegWrite; next S0
S5: MemWrite: ResultSrc = 00, AdrSrc = 1, MemWrite; next S0
S6: ExecuteR: ALUSrcA = 10, ALUSrcB = 00, ALUOp = 10; next S7
S7: ALUWB: ResultSrc = 00, RegWrite; next S0
S8: ExecuteI: ALUSrcA = 10, ALUSrcB = 01, ALUOp = 10; next S7
S9: JAL: ALUSrcA = 01, ALUSrcB = 10, ALUOp = 00, ResultSrc = 00, PCUpdate; next S7
S10: BEQ: ALUSrcA = 10, ALUSrcB = 00, ALUOp = 01, ResultSrc = 00, Branch; next S0
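The state transitions of the complete Main FSM can be sketched as a next-state function. This is a minimal model of the transitions only (no control outputs); the state names follow the slides, while `next_state` and `cycles` are helper names introduced here. Counting states from Fetch back to Fetch gives the number of cycles each instruction takes.

```python
LW, SW = 0b0000011, 0b0100011
RTYPE, ITYPE = 0b0110011, 0b0010011
JAL, BEQ = 0b1101111, 0b1100011

# Where S1: Decode goes for each opcode.
AFTER_DECODE = {LW: "MemAdr", SW: "MemAdr", RTYPE: "ExecuteR",
                ITYPE: "ExecuteI", JAL: "JAL", BEQ: "BEQ"}


def next_state(state: str, op: int) -> str:
    """Next Main FSM state given the current state and the 7-bit opcode."""
    if state == "Fetch":
        return "Decode"
    if state == "Decode":
        return AFTER_DECODE[op]
    if state == "MemAdr":
        return "MemRead" if op == LW else "MemWrite"
    if state == "MemRead":
        return "MemWB"
    if state in ("ExecuteR", "ExecuteI", "JAL"):
        return "ALUWB"
    # MemWB, MemWrite, ALUWB, and BEQ all finish the instruction.
    return "Fetch"


def cycles(op: int) -> int:
    """Number of states visited from Fetch until the FSM returns to Fetch."""
    state, count = "Fetch", 0
    while True:
        count += 1
        state = next_state(state, op)
        if state == "Fetch":
            return count


lw_cycles = cycles(LW)   # Fetch, Decode, MemAdr, MemRead, MemWB
```

The sketch treats every non-lw opcode reaching MemAdr as sw, which holds because only lw and sw enter that state from Decode.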


Chapter 7: Microarchitecture

Multicycle Performance
Multicycle Processor Performance
• Instructions take different numbers of cycles:
– 3 cycles: beq
– 4 cycles: R-type, addi, sw, jal
– 5 cycles: lw
• CPI is the weighted average
• SPECINT2000 benchmark:
– 25% loads
– 10% stores
– 13% branches
– 52% R-type

Average CPI = (0.13)(3) + (0.52 + 0.10)(4) + (0.25)(5) = 4.12
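The weighted-average CPI can be checked with a few lines of code. The instruction mix and cycle counts below are the ones on this slide; the dictionary keys and the function name `average_cpi` are illustrative.

```python
# Cycles per instruction class on the multicycle processor.
CYCLES = {"load": 5, "store": 4, "branch": 3, "r_type": 4}

# SPECINT2000 instruction mix from the slide (fractions sum to 1).
MIX = {"load": 0.25, "store": 0.10, "branch": 0.13, "r_type": 0.52}


def average_cpi(mix: dict, cycles: dict) -> float:
    """Weighted average of cycle counts, weighted by instruction frequency."""
    return sum(mix[kind] * cycles[kind] for kind in mix)


cpi = average_cpi(MIX, CYCLES)
print(f"Average CPI = {cpi:.2f}")
```

Changing the mix (for example, a load-heavy workload) shifts the CPI toward 5, which is why the benchmark choice matters for this comparison.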


Multicycle Critical Path
Potential critical paths:
• Calculate PC + 4, or
• Read memory

[Figure: complete multicycle datapath and Control Unit with the potential critical paths marked]


Multicycle Processor Performance
Multicycle critical path:
• Assumptions:
• RF is faster than memory
• Writing memory is faster than reading memory

Tc_multi = tpcq + tdec + 2tmux + max(tALU , tmem) + tsetup



Multicycle Performance Example
Element                  Parameter   Delay (ps)
Register clock-to-Q      tpcq_PC     40
Register setup           tsetup      50
Multiplexer              tmux        30
AND-OR gate              tAND-OR     20
ALU                      tALU        120
Decoder (Control Unit)   tdec        25
Extend unit              text        35
Memory read              tmem        200
Register file read       tRFread     100
Register file setup      tRFsetup    60

Tc_multi = tpcq + tdec + 2tmux + max(tALU, tmem) + tsetup
         = (40 + 25 + 2*30 + 200 + 50) ps = 375 ps

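The multicycle cycle-time formula can be evaluated directly from the delay table. This is a small sketch using the values above in picoseconds; the dictionary keys mirror the parameter names and `tc_multi` is a name chosen here.

```python
# Element delays in picoseconds, from the table above.
DELAYS_PS = {
    "tpcq": 40, "tsetup": 50, "tmux": 30, "tALU": 120,
    "tdec": 25, "tmem": 200,
}


def tc_multi(d: dict) -> int:
    """Tc_multi = tpcq + tdec + 2*tmux + max(tALU, tmem) + tsetup."""
    return d["tpcq"] + d["tdec"] + 2 * d["tmux"] + max(d["tALU"], d["tmem"]) + d["tsetup"]


cycle_time_ps = tc_multi(DELAYS_PS)
print(f"Tc_multi = {cycle_time_ps} ps")
```

With these numbers the memory read (200 ps) dominates the ALU (120 ps) inside the max term, so a faster ALU alone would not shorten the cycle.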
Multicycle Performance Example
For a program with 100 billion instructions executing on a multicycle RISC-V processor:
– CPI = 4.12 cycles/instruction
– Clock cycle time: Tc_multi = 375 ps

Execution Time = (# instructions) × CPI × Tc
               = (100 × 10^9)(4.12)(375 × 10^-12 s)
               = 154.5 seconds ≈ 155 seconds

This is slower than the single-cycle processor (75 seconds).
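The execution-time product is easy to reproduce. The sketch below uses the instruction count, CPI, and cycle time from this slide; `execution_time` is an illustrative helper name.

```python
def execution_time(instructions: float, cpi: float, tc_seconds: float) -> float:
    """Execution time = (# instructions) x CPI x Tc."""
    return instructions * cpi * tc_seconds


# 100 billion instructions, CPI = 4.12, Tc_multi = 375 ps.
t_multi = execution_time(100e9, 4.12, 375e-12)
print(f"Multicycle execution time = {t_multi:.1f} s")
```

The unrounded product is 154.5 seconds, which the slide rounds to 155 seconds.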


Single-Cycle Processor Performance
• Single-cycle critical path:
Tc_single = tpcq_PC + tmem + max[tRFread, tdec + text + tmux] + tALU + tmem + tmux + tRFsetup

• Typically, limiting paths are:


– memory, ALU, register file
– So, Tc_single = tpcq_PC + tmem + tRFread + tALU + tmem + tmux + tRFsetup
= tpcq_PC + 2tmem+ tRFread + tALU + tmux + tRFsetup



Single-Cycle Performance Example
Element                  Parameter   Delay (ps)
Register clock-to-Q      tpcq_PC     40
Register setup           tsetup      50
Multiplexer              tmux        30
AND-OR gate              tAND-OR     20
ALU                      tALU        120
Decoder (Control Unit)   tdec        25
Extend unit              text        35
Memory read              tmem        200
Register file read       tRFread     100
Register file setup      tRFsetup    60

Tc_single = tpcq_PC + 2tmem + tRFread + tALU + tmux + tRFsetup
          = (40 + 2*200 + 100 + 120 + 30 + 60) ps = 750 ps

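The single-cycle critical path can be evaluated with the full formula, including the max term, to confirm that the register file read is the limiting branch for these delays. This sketch uses the table values in picoseconds; `tc_single` is a name chosen here.

```python
# Element delays in picoseconds, from the table above.
DELAYS_PS = {
    "tpcq_PC": 40, "tmux": 30, "tALU": 120, "tdec": 25, "text": 35,
    "tmem": 200, "tRFread": 100, "tRFsetup": 60,
}


def tc_single(d: dict) -> int:
    """Tc_single = tpcq_PC + tmem + max(tRFread, tdec + text + tmux)
                   + tALU + tmem + tmux + tRFsetup."""
    operand_path = max(d["tRFread"], d["tdec"] + d["text"] + d["tmux"])
    return (d["tpcq_PC"] + d["tmem"] + operand_path
            + d["tALU"] + d["tmem"] + d["tmux"] + d["tRFsetup"])


cycle_time_ps = tc_single(DELAYS_PS)
print(f"Tc_single = {cycle_time_ps} ps")
```

Here tdec + text + tmux is 90 ps, less than the 100 ps register file read, so the simplified formula on the slide gives the same 750 ps.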
Single-Cycle Performance Example
Program with 100 billion instructions:

Execution Time = (# instructions) × CPI × Tc
               = (100 × 10^9)(1)(750 × 10^-12 s)
               = 75 seconds
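Putting the two results side by side makes the comparison concrete. This sketch reuses the figures from the performance slides (CPI 1 at 750 ps versus CPI 4.12 at 375 ps); the helper name `execution_time` is illustrative.

```python
def execution_time(instructions: float, cpi: float, tc_seconds: float) -> float:
    """Execution time = (# instructions) x CPI x Tc."""
    return instructions * cpi * tc_seconds


N = 100e9                                        # 100 billion instructions
t_single = execution_time(N, 1.0, 750e-12)       # single-cycle: CPI = 1
t_multi = execution_time(N, 4.12, 375e-12)       # multicycle: CPI = 4.12
slowdown = t_multi / t_single

print(f"single-cycle: {t_single:.1f} s, multicycle: {t_multi:.1f} s, "
      f"ratio: {slowdown:.2f}x")
```

For this benchmark and these delays the multicycle clock is twice as fast, but the 4.12 CPI more than cancels that gain, so the multicycle design takes about twice as long overall.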
