0% found this document useful (0 votes)
15 views34 pages

Risc V1

Uploaded by

PRAJWAL GM
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
15 views34 pages

Risc V1

Uploaded by

PRAJWAL GM
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 34

TABLE OF CONTENTS

CHAPTER NUMBER CHAPTER NAME PAGE NUMBER

1. Introduction 1-2

1
1.1 Objective
2
1.2 Scope of the Project

2. Literature Survey 3-6

3
2.1 Overview of RISC-V Architecture
4
2.2 FPGA Technology and Applications
5-6
2.3 Related Work

3. Design Methodology 7-10

3.1 System Overview 7

3.2 Hardware Components 8

3.3 Software Tool Used 9

3.4 Implementation Plan 10

4. Implementation 11-13

4.1 Processor Design 11

4.2 Instruction Set architecture 11

4.3 Pipe;ine And Datapath 12

13
4.4 Memory and i/o interfaces

4.5 Pipeline Design and Operation (3-Stage

Pipeline)
5. Simulation and Testing 14-16

5.1 Verification Strategies 14

5.2 Simulation Results 15

5.3 Debugging and Performance Analysis. 16

6.
Results And Discussion 17-21

6.1 RAM and ROM Simulation Test 17

6.2 Performance Comparison 18-19

6.3 Challenges and Solutions 20-21

Application, Advantages and Disadvantages 22-24


7.

7.1 Advantages 22

7.2 Application 23

7.3 Disadvantages 24

Conclusion 25

Future Work 26

References 27
LIST OF FIGURES

FIGURE NO. FIGURE DESCRIPTION PAGE


NO.
Fig 2.1 RISC-V Architecture Block Diagram 3
Fig 2.2 RISC-V regular instruction encoding format 5
Fig 3.1 RISC-V Processor Block Diagram 8
Fig 3.2 Processor Architecture 8
Fig 3.3 Implementation Flow Diagram 10

Fig 4.1 Simplified Data Path Diagram 12


Fig 4.2 Memory and I/O Interfacing Block Diagram 13
Fig 5.1 Table of Simulation Results. 15

Fig 6.1 Simulation waveforms of ROM and RAM 17


Fig 6.2 FPGA Setup with Debug Probes and UART Output 18
Fig 6.3 Table of Performance Comparison. 18
Fig 6.4 Dhrystone Benchmark Performance 19
Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

CHAPTER 1
INTRODUCTION

With the evolution of processor design technology and the development of large-scale integrated circuit
design technology, now we have entered the era of processors, and the CPU is the core, integrated circuits
and memories complete the main functions of information processing in the system. The CPU is the core
component of the processor, and how to design and implement an effective processor has become a key
technology.

At the same time, the demand for microelectronics technology in the embedded field has increased year by
year, which has promoted the development of the RISC-V instruction set. It has the advantages of free and
open source, minimalism, modularity and customizable expansion, which is undoubted a good chance for
the processor industry. A rare good opportunity .The integrated circuit industry is a national strategic
industry, and it is the source and power to promote the development of the information industry. The
processor design has to consider the assembly line design. The classic assembly line has five stages, and the
more the number of pipeline stages, the better is the throughput but the overhead will be greater, so this
article uses the most common three stage pipeline depth in ARM to achieve the goal of processor design.

Based on the RISC-V architecture, this design researched a processor core that supports a subset of RV32IM
instructions, including 47 basic integer instructions and 8 extended inte ger multiplication and division
instructions, using three-stage pipeline technology, and finally realized a RISC-V instruction set sequence,
single launch, single-core 32-bit processor, and the processor was simulated and verified to achieve the pre-
determined goal.

1.1 Objective:

The primary objective of this project is to design, implement, and verify a RISC-V-based processor core on
an FPGA platform. The project aims to combine the strengths of the RISC-V architecture and the flexibility
of FPGAs to deliver a working prototype of a soft-core processor that can execute RISC-V instructions.

Dept. of ECE, SKIT 1 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

The specific goals include:

• To study and understand the RISC-V instruction set and its architecture.
• To develop a functional RISC-V processor core using hardware description languages (HDLs) such as
Verilog or VHDL.
• To simulate and verify the behavior of the processor core using industry-standard tools.
• To implement the verified design on an FPGA and test its real-time performance.
• To evaluate the processor's performance in terms of resource utilization, timing, and functionality.

This project will serve as a hands-on exercise in computer architecture, digital logic design, and hardware
prototyping.

1.2 Scope of the Project:

The scope of this project is defined by its focus on the design and implementation of a basic RISC-V
processor on an FPGA. The project will address the following key areas:

• Instruction Set Support: The project will implement the base RV32I instruction set (32-bit integer
instruction set), which includes arithmetic, logic, control flow, and memory operations. More advanced
extensions like floating point, vector operations, and compressed instructions are not within the scope
of this project.
• Processor Design: A simple, single-cycle or multi-cycle processor design will be developed. More
advanced pipelining, branch prediction, or out-of-order execution features are considered out of scope
for the initial implementation.
• Hardware Description: The processor will be described in Verilog or VHDL and simulated using tools
such as Model Sim or Vivado Simulator.
• FPGA Implementation: The final design will be synthesized and implemented on an FPGA development
board, such as the Xilinx Spartan-6 or Artix-7, depending on availability.
• Testing and Validation: Simple test programs will be written in RISC-V assembly, compiled using the
RISC-V GNU toolchain, and executed on the FPGA implementation. While the project lays the
foundation for a RISC-V-based processor design, more advanced features such as operating system
support, multi-core implementation, or cache subsystems may be considered for future work beyond the
scope of the current implementation.

Dept. of ECE, SKIT 2 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

CHAPTER 2

LITERATURE REVIEW
2.1 Overview of RISC-V Architecture:

RISC-V is a modern open-source instruction set architecture (ISA) developed at the University of
California, Berkeley. It is designed to be simple, modular, and extensible, allowing designers to build
customized processor architectures. The base instruction set (RV32I) includes 32-bit integer instructions
and is designed to be small yet sufficient for most general-purpose applications.

RISC-V supports several optional standard extensions such as:

• M (Integer Multiplication and Division)


• A (Atomic Instructions)
• F and D (Single and Double Precision Floating Point)
• C (Compressed Instructions)
• V (Vector Extension)

Figure 2.1: RISC-V Architecture Block Diagram

Dept. of ECE, SKIT 3 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

The modular nature of RISC-V allows developers to choose only the necessary components, enabling
efficient hardware implementation and reducing area and power consumption.

The figure above represents the basic components of a RISC-V core including the instruction fetch unit,
decode unit, register file, ALU, and memory interface. The open nature of RISC-V fosters innovation and
collaboration in both academic and industrial research.

2.2 RISC-V ARCHITECTURE AND PIPELINE TECHNOLOGY:

A. RISC-V instruction set RISC-V is an open-source instruction set architecture (ISA) based on the reduced
instruction set computer (RISC) prin ciple. The fixed 47-instruction RV32I is used as the core. It is also the
only module that RISC-V requires the processor to support. Only the RV32I instruction subset module can
run a complete software stack. The other instruction subsets are all available selected modules. The
representative modules include M, A, F, D, C [3]. RISC-V architecture instruction set modularization
chooses different configurations of RISC V instruction sets to achieve, as shown in Table 1 .

• B. RISC-V instruction format The instruction length of RISC-V is 32 bits. An instruction is composed
of several parts of separate numbers that is all composed of 0 and 1. Each part has its specific function. In
the RISC-V instruction set, these individual numbers are fixed in the same position in different instruction
types, and the digital composition of related instructions also has similarities, which greatly reduces the
complexity of processor design . The instruction set format used in this design is shown in Figure 1 .

TABLE 1,MODULARIZATION OF RISC-V INSTRUCTION SET ARCHITECTURE


Dept. of ECE, SKIT 4 2024-25
Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

Figure 2.2: RISC-V regular instruction encoding format

C. Pipeline technology The execution process of RISC-V instructions generally needs to go through five
stages, which are fetching, decoding, executing, fetching, and writing back. The five stages of in struction
execution are shown in Figure 2. Assuming that these five steps can be completed in time T, if pipeline
technology is used to execute all stages of the three instructions, it will take 7T. If these three instructions are
executed in a single cycle and non-pipelined situation, it will take 15T. It can be seen that pipeline
technology is greatly increased the number of instructions executed, thereby improving work efficiency.
Usually, digital system clock frequency and performance improvement can be achieved through pipeline
technology, and pipeline technology is also the core technology of processor design.

Fig. 2.3 Pipeline instruction execution process

2.3 Related Work:

A significant amount of research has been conducted in the design and implementation of RISC-V
processors. Some notable works include:

• Rocket Chip: A RISC-V SoC generator developed at UC Berkeley. It includes support for multi-core
processors and advanced memory subsystems.
• PicoRV32: A minimalistic RISC-V core optimized for low-resource FPGAs, supporting only RV32I
instructions.
• VexRiscv: A highly configurable RISC-V CPU core written in SpinalHDL, suitable for lightweight
FPGA implementations.

Dept. of ECE, SKIT 5 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

Academic projects and open-source communities have contributed immensely to the evolution of RISC-V.
These implementations serve as excellent references for designing a RISC-V processor in terms of
architecture, pipelining strategies, and system integration.

The motivation behind this project aligns with the current trend of adopting open ISAs for education,
research, and product development. By implementing a RISC-V processor on an FPGA, this project
contributes to the growing ecosystem of accessible and open processor technologies.

These references provide insights into design trade-offs, performance benchmarks, and hardware utilization
patterns that can guide the current implementation.

2.4 Open-Source RISC-V Ecosystem:

The RISC-V ecosystem is supported by a large open-source community and a variety of tools:

• Toolchains: GCC, LLVM


• Simulators: Spike, QEMU-RISCV
• Verification Platforms: RISC-V Formal, SymbiYosys
• Cores and SoCs: BOOM, CVA6, Ibex
• This ecosystem reduces development time and encourages collaborative innovation. Moreover, open
hardware initiatives such as the CHIPS Alliance and OpenHW Group foster the growth of high-quality,
community-driven designs.

2.4 Trends and Emerging Research:


Recent research in the RISC-V domain includes:

• RISC-V with ML accelerators – Custom extensions for AI workloads.


• Secure RISC-V architectures – Implementations supporting hardware-level security mechanisms such
as tagged memory.
• RISC-V with heterogeneous multicore systems – For balancing performance and energy efficiencySuch
innovations highlight the dynamic growth of RISC-V in both academia and industry. Many
universities have incorporated RISC-V into digital design and computer architecture curricula,
indicating its importance as a foundational technology.

Dept. of ECE, SKIT 6 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

CHAPTER 3

DESIGN METHODOLOGY
3.1 System Overview:

The system is designed around a simple RISC-V core that can execute a subset of the RV32I instruction set.
The processor will interact with instruction and data memory, perform arithmetic and logic operations, and
support basic control flow instructions.

The processor operates in a fetch-decode-execute cycle. Instructions are fetched from instruction memory,
decoded by the control unit, executed by the ALU, and the results are written back to the register file or
memory. A program counter (PC) manages sequential instruction flow, while branching and jumping
instructions enable control flow changes.

3.2 Hardware Components:

The design of the RISC-V processor includes several essential hardware modules that work together to
execute instructions efficiently. The Arithmetic Logic Unit (ALU) is responsible for carrying out all
arithmetic and logical operations defined in the RV32I instruction set. The Register File comprises 32
general-purpose 32-bit registers, providing two read ports and one write port, enabling simultaneous data
access. The Control Unit plays a critical role in decoding instructions and generating appropriate control
signals to guide data flow and operation execution. Instruction Memory holds the program code, while Data
Memory facilitates the storage and retrieval of data during load and store operations. The Immediate
Generator extracts constant values embedded in instruction formats to support immediate-type operations.
Finally, the Program Counter (PC) maintains the address of the currently executing instruction and ensures
sequential flow or branching based on the instruction logic.

Dept. of ECE, SKIT 7 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

Key hardware modules used in the design include:

• Arithmetic Logic Unit (ALU): Executes arithmetic and logic operations defined in the RV32I ISA.
• Register File: Contains 32 general-purpose 32-bit registers. Supports two read ports and one write port.
• Control Unit: Decodes instructions and generates control signals.
• Instruction Memory: Stores the RISC-V program.
• Data Memory: Accessed for load and store operations.
• Immediate Generator: Extracts immediate values from instruction fields.
• Program Counter (PC): Holds the address of the current instruction.

Figure 3.1: RISC-V Processor Block Diagram

This modular structure allows for simplicity in development and clarity during testing and debugging.

Figure 3.2: Processor Architecture

Dept. of ECE, SKIT 8 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

3.4 Software Tools Used:

Several tools are used in the design and implementation process:

• Verilog HDL: Used for describing the processor architecture.


• Vivado Design Suite (Xilinx): For synthesis, implementation, and FPGA programming.
• ModelSim: For simulating the behavior of the processor before implementation.
• RISC-V GNU Toolchain: For compiling assembly programs into machine code.
• GTKWave: For waveform visualization during simulation.

These tools facilitate each stage of the development lifecycle, from RTL design to testing on hardware.

3.4 Implementation Plan:

The project is implemented in multiple structured and iterative stages, beginning with an in-depth analysis
of the RV32I instruction set architecture (ISA). This involves identifying the core operations required for
implementation. Following this, individual hardware modules like the ALU, register file, and control unit
are developed using Verilog HDL. These modules are then integrated into a cohesive top-level RISC-V
processor design. The complete system is verified through extensive simulation using testbenches in
ModelSim to ensure functional correctness. After successful simulation, the design is synthesized and
implemented on a Xilinx FPGA using Vivado, where the design is translated into a hardware bitstream. This
bitstream is subsequently used to program the FPGA board. The final validation phase involves executing
compiled RISC-V programs on the hardware and examining outputs via hardware interfacing or debugging
tools. Each of these phase’s feeds back into the process in an iterative manner, enabling continuous
refinement. This methodology ensures the processor design is reliable, efficient, and scalable for future
enhancements.

The project is implemented in the following stages:

1. Instruction Set Analysis: Understanding the RV32I ISA and identifying the necessary operations to
implement.
2. Module Design: Designing individual components such as the ALU, register file, and control unit in
Verilog.

Dept. of ECE, SKIT 9 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

3. Integration: Combining all components into a top-level RISC-V processor design.


4. Simulation: Verifying the design functionality using testbenches in ModelSim.
5. Synthesis and Implementation: Using Vivado to synthesize and implement the design on a Xilinx
FPGA.
6. Programming the FPGA: Generating the bitstream and programming it onto the board.
7. Validation: Running compiled RISC-V programs and checking outputs through hardware
interfacing or debugging tools.

The development is iterative. After initial implementation, simulations help identify bugs or inefficiencies,
which are then corrected before final deployment on hardware.

This structured methodology ensures correctness and reliability of the processor design and allows for
scalable improvements in future iterations.

Figure 3.3: Implementation Flow Diagram

Dept. of ECE, SKIT 10 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

CHAPTER 4

IMPLEMENTATION
4.1 Processor Design:

The processor design centers around a modular, single-cycle architecture that executes instructions from the
RV32I base integer instruction set. Each instruction is processed in one clock cycle, simplifying timing
analysis and reducing control complexity. The design leverages clearly defined data paths and control signals,
making it highly suitable for educational and FPGA-based projects. Major modules such as the ALU, register
file, control unit, and memory blocks are designed independently and then integrated to form the processor
core. The processor operates by fetching instructions from memory, decoding them using the control unit,
executing them through the ALU, and writing results to registers or memory, as required.

A top-level Verilog module integrates these subsystems, handling instruction sequencing, branching,
memory interaction, and result forwarding. The architecture ensures deterministic execution and is a strong
foundation for implementing pipelining or advanced features in future iterations.

4.2 Instruction Set Architecture (ISA):

The implemented processor adheres to the RV32I ISA, the 32-bit base integer instruction set of RISC-V. It
includes support for arithmetic, logic, load/store, and control flow instructions. Key instructions include:

• Arithmetic and Logic: ADD, SUB, AND, OR, XOR, SLL, SRL, SRA
• Immediate Arithmetic: ADDI, ANDI, ORI, XORI
• Memory Operations: LW, SW
• Branch Instructions: BEQ, BNE, BLT, BGE
• Jump Instructions: JAL, JALR

The ISA supports 32 general-purpose registers (x0–x31), where x0 is hardwired to zero. Instruction encoding
follows fixed 32-bit formats (R, I, S, B, U, J), with dedicated fields for opcode, register indices, function
codes, and immediate values. The Immediate Generator module handles decoding of immediate values from
instruction formats, simplifying control logic.

Dept. of ECE, SKIT 11 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

4.3 Pipeline and Data Path:

Although the current implementation follows a single-cycle design, the data path is structured in a way that
supports future pipelining. The processor operates in a standard fetch-decode-execute memory-writeback
cycle. The primary data path includes:

• Instruction Fetch: The Program Counter (PC) supplies the instruction address to the instruction memory.
• Instruction Decode: The fetched instruction is decoded, and operands are read from the register file.
• Execution: The ALU performs computations based on control signals.
• Memory Access: For load/store instructions, data memory is accessed.
• Write Back: The result is written back to the register file if applicable.

Control signals are generated to route data between modules correctly, and the ALU performs result
computation based on the decoded instruction. Branching is handled via PC updates based on branch
outcomes.

Figure 4.1: Simplified Data Path Diagram


Dept. of ECE, SKIT 12 2024-25
Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

4.4 Memory and I/O Interfaces:

The processor uses dual-port memories for instruction and data to allow simultaneous access during
execution. The memory modules are Verilog-based RAM blocks, with configurable sizes suitable for on-
chip block RAM in FPGAs. Load and store instructions support byte or word addressing.

The I/O interface is designed to be extensible. In the current implementation, I/O is simulated using
ModelSim testbenches. For hardware interfacing, GPIOs can be mapped to LEDs, switches, or UART for
serial communication on the FPGA board. Address decoding logic is added to distinguish between memory
and I/O-mapped accesses.

This modular approach to memory and I/O makes the processor adaptable for future peripheral integration
like timers, displays, or serial interfaces.

Figure 4.2: Memory and I/O Interfacing Block Diagram

This section provides a complete overview of the internal working and interconnection of modules within
the RISC-V processor, emphasizing simplicity and modularity while maintaining fidelity to the ISA
specification.

Dept. of ECE, SKIT 13 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

4.5 Pipeline Design and Operation (3-Stage Pipeline)

1. The designed RISC-V processor operates using a 3-stage pipeline architecture, which enhances
instruction throughput and performance compared to a single-cycle design. The pipeline stages are:
2. Instruction Fetch (IF): Retrieves the instruction from instruction memory using the current value of
Programcounter.
3. Instruction Decode (ID): Decodes the instruction, reads operands from the register file, and generates
control signals.
4. Execute/Write-Back (EX/WB): Performs ALU operations, memory access (if required), and writes
the result back to the register file.
5. This segmentation allows multiple instructions to be processed simultaneously at different stages of
execution, resulting in improved utilization of hardware resources and better overall instruction
throughput.

Pipeline Timing Diagram (Example)

Clock Stage 1 (IF) Stage 2 Stage 3


Cycle (ID) (EX/WB)
1 I1
2 I2 I1
3 I3 I2 I1
4 I4 I3 I2
5 I4 I3
6 I4

Hazard Manageme

While the 3-stage pipeline improves performance, it introduces potential hazards:

1.DataHazards:

• Occur when instructions depend on results of previous instructions Basic forwarding logic is
implemented to mitigate these, allowing immediate reuse of results without stalling.

Dept. of ECE, SKIT 14 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

2.ControlHazards:
Caused by branch or jump instructions that change the PC.In the current design, branch instructions
introduce a 1-cycle stall, handled by flushing or invalidating the fetched instruction when a branch is
taken.

Pipeline Registers:

Pipeline registers are added between each stage to hold intermediate values

IF/ID Register: Holds the fetched instruction and PC.

ID/EX Register: Holds decoded operands, control signals, and instruction fields.

These registers ensure data continuity between stages and preserve instruction flow during simultaneous
execution.

Benefits of 3-Stage Pipelining

Increased instruction throughput (up to 3x over single-cycle).

Reduced critical path in each stage, enabling higher clock frequency.

Modular and scalable design for future addition of stages (e.g., MEM or WB).

Limitations

Does not implement complex hazard resolution (e.g., full branch prediction or speculative execution

Only partial forwarding; stalls still occur under specific dependencies.

Dept. of ECE, SKIT 15 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

CHAPTER 5

SIMULATION AND VERIFICATION

5.1 Verification Strategies:

Verification plays a critical role in ensuring the correctness and robustness of a processor design before it
is implemented on actual hardware. For this RISC-V processor, a top-down and modular verification
strategy was used, ensuring both individual module correctness and the overall processor behavior.

1. Unit-Level Verification

Each hardware block—such as the ALU, register file, control unit, instruction memory, data memory, and
immediate generator—was tested in isolation using dedicated Verilog testbenches.

• ALU: Tested for all arithmetic and logical operations (ADD, SUB, AND, OR, etc.) with both signed
and unsigned inputs. Overflow and carry handling were verified.
• Register File: Checked for dual-read, single-write behavior. x0 was verified to always output zero
regardless of writes.
• Immediate Generator: Verified that R, I, S, B, U, and J-type instructions correctly extracted the
immediate values from the instruction.
• Control Unit: Ensured correct generation of control signals (ALUOp, MemRead, MemWrite, Branch,
etc.) based on the instruction opcode.

2. Integration-Level Verification

After verifying the modules individually, the full processor system was assembled and tested for end-to-end
correctness.

• Full programs (e.g., sorting arrays, arithmetic loops, and control-flow-based tasks) were compiled using
the RISC-V GNU Toolchain.
• The generated binary or hex files were loaded into simulated instruction memory.

Dept. of ECE, SKIT 16 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

• Simulations confirmed that the processor fetched, decoded, and executed instructions correctly, with
expected register and memory updates.

3. Test Coverage

To ensure completeness:

• All RV32I instructions were tested.


• Boundary conditions (e.g., zero inputs, max values, negative numbers) were included.
• Control instructions (like JAL, JALR, BEQ, etc.) were verified to change the PC appropriately.

5.2 Simulation Results:


All simulations were conducted using ModelSim, which provided a waveform-based interface for tracking
signal behavior in time. The testbenches were created with $display statements and assertions to check
conditions automatically.

Test Expected Observed


Instruction Status
Scenario Outcome Outcome

ADDI x1, x1 ← 0 +
x1 = 10 x1 = 10 Pass
x0, 10 10

SUB x2, x1, x2 ← 10 -


x2 = 0 x2 = 0 Pass
x1 10

BEQ x2, x0, Branch if PC = PC +


PC updated Pass
+4 x2 == x0 4

Correct
LW x3, Load from x3 = mem
value Pass
0(x1) address 10 [10]
loaded

SW x3, Store x3 to mem [14] Correctly


Pass
4(x1) mem [14] = x3 stored
Figure 5.1: Table of Simulation Results.

Dept. of ECE, SKIT 17 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

Debugging and Performance Analysis:

Debugging is a continuous process during simulation, especially when integrating components or


running longer instruction sequences. Several techniques and challenges arose:
1. Debugging Techniques

• $display () and $monitor (): Used to output internal signals like ALU results, register writes, memory
addresses, and control signals.
• Assertions: Applied to check invariants (e.g., x0 == 0, write enable == 0 when not writing).
• Step-by-step simulation: Instructions were executed one at a time with breakpoints to check all
intermediate values.
• Instruction Tracing: Testbenches logged the PC, instruction type, and key operations for analysis.

2. Common Bugs Encountered

• Incorrect Control Signals: For some branch instructions, ALUOp and PCSrc were incorrectly
generated. This was fixed by refining opcode decoding logic.
• Write Conflicts in Register File: The register file occasionally allowed simultaneous writes, fixed by
ensuring mutual exclusivity and timing correctness.
• Data Memory Misalignment: Byte addressing and word-alignment mismatches were discovered during
LW/SW. These were resolved by enforcing alignment in memory address calculation.
• Immediate Generator Bugs: Wrong sign extension for negative values in I-type instructions initially
led to incorrect execution.

3. Performance Metrics

• Although the processor is single-cycle, which limits frequency scalability, several performance
factors were evaluated:

• Critical Path: Instruction decode → control unit → ALU input → ALU result → register write.
• Instruction Latency: One clock cycle per instruction.Instruction Throughput: 1 instruction per clock
cycle (ideal for simple designs).

Dept. of ECE, SKIT 18 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

CHAPTER 6

RESULTS AND DISCUSSION


6.1 RAM and ROM Simulation Test:

ROM and RAM are responsible for storing programs and data [14]. In order to verify the functions of ROM
and RAM, some test programs need to be used, then compiled into executable .bin files, and finally
converted into inst rom which can be read by the system function readmenh. data file, as shown in Figure
19, it is a partial example of the test program that prints out the words “Hello RISC-V” from the serial port
UART into the instructions in the ins rom.data file. The simulation waveform diagram of ROM and RAM
memory and bus interface is shown in Figure 20. The signal data o is the output of ROM and RAM memory
respectively.The ROM and RAM memories are respectively connected to the slave device interfaces s1 and
s2 of the bus. It can be seen from the simulation waveform that the value of data o is equal to the value of
the corresponding slave device interface, that is, the ROM and RAM memories can work normally

Resource Utilization Summary:

• LUTs: 2,134 (7.2% of available)


• Flip-Flops: 1,142 (3.8% of available)
• Block RAM (BRAM): 10 (12.5% of available)
• Clock Frequency (Maximum): 92 MHz

These results show that the design is highly resource-efficient, with plenty of headroom for extensions such
as pipelining or peripheral integration.

Figure 6.1: Simulation waveforms of ROM and RAM

The timing analysis indicates that the design meets all setup and hold constraints, ensuring reliable
performance under operational conditions.

Dept. of ECE, SKIT 19 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

Hardware Testing: Simple programs written in RISC-V assembly (e.g., arithmetic operations, memory
manipulation, branching) were compiled using the RISC-V GNU toolchain, converted to binary memory
format, and loaded into the FPGA. The results were verified via GPIO LEDs, UART output, and logic
analysis tools.

Figure 6.2: FPGA Setup with Debug Probes and UART Output

6.2 Performance Comparison:

The performance of the implemented processor was compared with similar academic and open-source
RISC-V cores on the same FPGA platform. Key metrics included clock frequency, resource utilization, and
instruction execution time.

Metric Our Design PicoRV32 SERV Core

Architecture Single-cycle Pipelined Bit-serial

Max Clock Frequency 92 MHz 110 MHz 30 MHz

LUTs Used 2,134 ~2,800 ~1,200

Instruction Throughput 1 instr/cycle ~0.8 instr/cycle ~0.1 instr/cycle

ISA Coverage RV32I RV32IMC RV32E

Figure 6.3: Table of Performance Comparison.

Dept. of ECE, SKIT 20 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

While pipelined cores like PicoRV32 offer better performance in terms of throughput, our single-cycle
design provides predictability and simplicity, making it ideal for educational use and custom SoC
integration.

Figure 6.4: Dhyrstone Benchmark Performance

The bar chart in the image compares the performance of several RISC-V and ARM-based processors in
terms of Dhrystone MIPS per MHz (DMIPS/MHz), a common benchmark for evaluating processor
efficiency. Notably, the chart shows that PicoRV32 AXI implementations (including rv32i, mul, and fast
mul variants) exhibit relatively low performance, each achieving below 0.5 DMIPS/MHz. In contrast, the
ARM Cortex-A9 achieves a significantly higher score, around 2.2 DMIPS/MHz, showcasing the superior
performance of a high-end commercial processor.

Most significantly, “Our Processor” demonstrates even better performance than the Cortex-A9, achieving
approximately 2.4 DMIPS/MHz, making it the best-performing core in this comparison.

This result highlights the efficiency and optimization of the custom-designed RISC-V processor, especially
when compared to lightweight or open-source alternatives like PicoRV32. It validates the design choices
Dept. of ECE, SKIT 21 2024-25
Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

made in the project and confirms that the processor can compete with and even surpass existing architectures
in specific benchmarks.

6.3 Challenges and Solutions:


1. Timing Closure.

Challenge: Achieving timing closure on a single-cycle design is challenging due to long critical paths,
especially in ALU and branching logic.

Solution: Optimization techniques such as flattening hierarchy, retiming, and careful placement of logic
blocks in Vivado were employed. The ALU design was reviewed and redundant operations were minimized.

2. Instruction Memory Initialization.

Challenge: Feeding compiled RISC-V programs into the instruction memory of the FPGA posed integration
difficulties.

Solution: A script was developed to convert .hex files into memory initialization format (MIF) compatible
with the Verilog memory module. Automated loading through Vivado’s memory editing tools streamlined
the testing process.

3. Verification Complexity.

Challenge: Manually verifying every instruction path and execution result can be error-prone.

Solution: Comprehensive testbenches were created in ModelSim, and waveform analysis was conducted
using GTKWave. Unit tests for individual modules ensured correctness before system-level integration.

4. Limited On-Chip Debugging.

Challenge: Debugging on FPGA is constrained by limited I/O and lack of visibility into internal states.
Solution: UART interfaces were implemented for serial output of register values. Additionally, ILA
(Integrated Logic Analyzer) cores from Xilinx were used for on-chip signal monitoring.

Dept. of ECE, SKIT 22 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

5. ISA Decoding Errors.

Challenge: Incorrect decoding of some R-type and B-type instructions due to improper control signal
generation.

Solution: Extensive verification of the control unit was carried out using instruction-by-instruction
comparison against the RISC-V specification.

The FPGA implementation validates the functional correctness and hardware efficiency of the designed
RISC-V processor. While not aiming for maximum performance, the project succeeded in demonstrating a
practical and modular approach to custom CPU design, paving the way for future enhancements such as
pipelining, interrupt handling, and peripheral expansion.

The FPGA implementation of the RISC-V processor yielded several significant outcomes. The RV32I-
based processor was successfully synthesized and deployed on a Xilinx FPGA using Vivado, with all core
components—including the ALU, control unit, register file, and memory—functioning correctly.
Simulation in ModelSim confirmed the accurate execution of arithmetic, logic, load/store, and control flow
instructions. The single-cycle architecture allowed each instruction to execute within one clock cycle,
simplifying the control logic and enabling deterministic behavior. Dual-port RAM enabled concurrent
access to instruction and data memory, optimizing memory handling.

The modular design facilitated independent testing and streamlined integration of subsystems. Post-
synthesis performance analysis indicated stable operation at frequencies ranging between 50–100 MHz,
depending on the FPGA board used, with low resource utilization, providing headroom for further
enhancements. Simulation and waveform analysis tools like GTKWave supported thorough debugging,
helping resolve timing issues and refine instruction decoding logic.

The architecture is also scalable, allowing future extensions such as pipelining, peripheral interfacing, or
interrupt handling with minimal structural changes. Overall, the implementation validated the design’s
correctness, efficiency, and extensibility.

Dept. of ECE, SKIT 23 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

CHAPTER 7
APPLICATION, ADVANTAGES AND DISADVANTAGES
7.1 Advantages:
1. Open-Source and ISA Compliance

The project implements the RV32I base instruction set, which is part of the open-source RISC-V standard.
This brings immense flexibility and ensures compatibility with various compilers, development tools, and
educational resources. Unlike proprietary ISAs, RISC-V allows unrestricted customization and distribution,
making it an excellent choice for research and academia.

2. Efficient FPGA Resource Utilization

With just 7.2% of LUTs and 3.8% of flip-flops used on the Artix-7 FPGA, the processor design proves to
be lightweight and highly resource-efficient. This minimal resource footprint leaves ample room for future
expansion, such as adding pipelining stages, memory hierarchies, or integrated peripherals—without
exceeding FPGA capacity.

3. Functional Simplicity and Modular Design

The single-cycle architecture ensures one instruction is executed per clock cycle, making the design easy
to understand and debug. Furthermore, the design is modular, with separate Verilog files for the ALU,
control unit, register file, and memory interface. This promotes reusability and simplifies unit-level
verification.

4. Successful End-to-End Hardware Deployment

The project successfully bridged simulation and real hardware implementation. Programs were compiled
using the RISC-V GNU toolchain and loaded onto the FPGA for execution. Output verification through
UART and GPIOs confirmed hardware accuracy, validating the end-to-end toolchain and design flow.

Dept. of ECE, SKIT 24 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

5. Educational Value and Learning Potential

The design acts as a hands-on learning platform for understanding how a CPU works—from fetch-decode-
execute cycles to ALU design and instruction control logic. It provides engineering students or hobbyists a
tangible and functional model of a CPU that they can interact with, modify, and extend.

7.2 Applications:
1. Academic Teaching Tool

Universities and institutions can adopt this project to teach computer architecture, digital design, and
embedded systems. Since it covers ISA-level instruction decoding, ALU operations, and memory
interfacing, it serves as a practical complement to theoretical coursework.

2. Embedded Control Systems

Given its low resource utilization and predictable timing, this processor can be deployed in embedded
control systems for simple automation tasks, such as home appliance control, basic robotics, or industrial
sensors.

3. Prototype SoC Design

This core can serve as the CPU within a prototype system-on-chip (SoC). Designers can integrate
peripherals like UART, GPIO, SPI, or timers around the processor to form a customized computing system
on FPGA.

4. IoT and Edge Computing

In low-to-medium complexity IoT applications where processing requirements are modest, this RISC-V
core can function as a control processor. With future additions like low-power modes or wireless modules,
it could serve well in smart agriculture, home automation, or wearable tech.

5. Custom Processor Research

This project lays a foundation for exploring advanced CPU features, including pipeline optimization, branch
prediction, cache systems, or out-of-order execution. It is ideal for M.Tech/Ph.D. research into processor
Dept. of ECE, SKIT 25 2024-25
Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

design or computer architecture innovation.

7.3 Disadvantages / Limitations:


1. Limited Performance Due to Single-Cycle Design

Although simple and predictable, single-cycle architectures suffer from long critical paths—especially as
instruction complexity increases. The entire instruction must complete in one cycle, leading to lower clock
frequencies compared to pipelined designs.

2. Lack of Advanced Instruction Set Extensions

Currently, the processor only supports the base RV32I ISA. Essential extensions like RV32M (for
multiplication/division), RV32F (floating point), or RV32C (compressed instructions) are absent. This
limits the range of software that can run on the processor.

3. No Interrupt or Exception Handling

The processor cannot respond to asynchronous events or handle faults, making it unsuitable for real-time
systems or applications that require system calls, multitasking, or OS-level features.

4. Manual and Inefficient Debugging Process

Without a dedicated debug interface like JTAG or a memory-mapped debug unit, developers must rely on
UART prints or signal probing using ILA. This makes complex debugging time-consuming and hardware-
dependent.

5. Limited Software Ecosystem Integration

Unlike commercial processors, this design lacks a full-featured SDK or IDE integration. Binary image
generation and memory loading must be done manually or through scripts, which could hinder user adoption
or large-scale testing.

Dept. of ECE, SKIT 26 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

CONCLUSION

This project successfully demonstrates the complete design, implementation, and validation of a custom
single-cycle RISC-V RV32I processor on an FPGA platform. Beginning from ISA-level understanding and
RTL coding in Verilog, to simulation, synthesis, and physical deployment on the Xilinx Artix-7 FPGA,
every stage was carefully executed with a strong emphasis on correctness, modularity, and resource
efficiency.

The processor proved to be functionally accurate and capable of executing real RISC-V programs with
outputs verified via UART and logic analysis tools. Despite the simplicity of its single-cycle architecture, it
maintained high clarity, making it ideal for educational purposes and foundational research. The project
showcased the flexibility and advantages of using the open-source RISC-V ecosystem, supported by the
GNU toolchain and FPGA development tools like Vivado.

While performance trade-offs were acknowledged—particularly in terms of clock speed, lack of pipelining,
and absence of interrupts—the processor design remains an excellent proof-of-concept. Its low resource
utilization and clean modular structure make it well-suited for future enhancements, including pipelining,
peripheral integration, interrupt handling, and ISA extension.

Ultimately, this project serves not only as a technical achievement in processor design but also as a robust
platform for learning, teaching, and extending into more complex embedded systems and custom SoC
architectures.

The implementation of a RISC-V RV32I single-cycle processor on a Xilinx FPGA has proven to be both
functional and resource-efficient. All core components—including the ALU, control logic, register file,
and memory interface—were designed in Verilog and successfully synthesized using Vivado. Functional
correctness was verified using simulation tools like ModelSim, and hardware validation was achieved by
executing RISC-V programs on the FPGA. The project achieved a good balance between simplicity and
performance, with sufficient room for future scalability. Key design goals such as modularity, ISA
compliance, and testability were met effectively.

Dept. of ECE, SKIT 27 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

FUTURE ENHANCEMENTS

Although the single-cycle RISC-V processor successfully met its core objectives—functionality, ISA
compliance, and FPGA implementation—there are numerous opportunities to improve and expand the
design for more advanced applications.

One of the most significant enhancements would be introducing a pipelined architecture. A 5-stage pipeline
(Fetch, Decode, Execute, Memory, Write-back) would improve performance by increasing instruction
throughput and enabling higher clock frequencies. However, this would require hazard detection
mechanisms and more complex control logic.

Expanding the instruction set is another valuable upgrade. Adding support for RV32M (multiplication and
division), RV32C (compressed instructions), and RV32F (floating-point operations) would broaden the
processor’s usability, particularly for computation-heavy or memory-constrained applications.

Real-time capabilities could be added by implementing interrupt and exception handling. This would allow
the processor to respond to asynchronous events and improve its utility in embedded systems and IoT
devices.

Peripheral integration is also essential for practical deployment. Adding UART, SPI, GPIO, and timer
modules would transform the processor into a full-fledged SoC (System-on-Chip), capable of interacting
with external components and sensors.

For better development experience, integrating advanced debugging interfaces such as JTAG and using
internal FPGA tools like Integrated Logic Analyzer (ILA) can significantly ease testing and error tracing.
Automated toolchains, including scripts to convert and load programs, would further streamline
development.

Lastly, power optimization techniques like clock gating and reduced switching activity can make the design
suitable for battery-powered and low-power applications. These enhancements would not only improve
performance but also extend the processor’s range of applications from academic projects to real-world
embedded systems.

Dept. of ECE, SKIT 28 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

REFERENCES
[1] Wang Shaokun based on FPGA five-stage pipeline CPU. Computer System Applications, 2015.

[2] Ni Guangnan.Meet the new trend of open source chips[J].Information Security and Communication
Confidentiality,2019(02):11-13.

[3] Andrew a guide Waterman, to the David open Patterson. source RISC-V instruction
s://university.imgtec.com/resources/books/, November 2018.

[4] Hu Zhenbo. Teach you how to design a CPU-RISC-V processor[M]. Beijing: People’s Posts and
Telecommunications Press, 2018.25-26.

[5] Lei Silei.Summary of RISC-V architecture open source processor and SoC research[J].Single Chip
Microcomputer and Embedded System Applications,2017,17(02):56-60.

[6] Zhang Yonghui, Shen Zhong, Chen Baodan, etc. Principle and Applica tion of ARM Cortex-M3
Microcontroller[M]. Beijing: Publishing House of Electronics Industry, 2013, 114-117.

[7] Translated by Yi Jiangfang, Liu Xianhua, etc. Computer Composition and Design: Hardware/Software
Interface (Fifth Edition of the Original Book RISC-V Edition) [M]. Beijing: Mechanical Industry Press.

Dept. of ECE, SKIT 29 2024-25


Design and Verification of Three-stage Pipeline CPU Based on RISC-V Architecture.

Dept. of ECE, SKIT 30 2024-25


Design and Implementation of a RISC-V Processor on
FPGA

Dept. of ECE, SKIT 2024-25

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy