CSC21000 Sort
Mohamed Hiba
April 27, 2025
Objective:
The purpose of this project was to:
Implement a Bubble Sort algorithm based on the textbook Computer Organization and
Design, 5th Edition, Section 2.13.
Measure and compare the runtime performance of:
o A MIPS assembly implementation (handwritten, running in MARS).
o A C++ implementation compiled with:
     No optimization (-O0)
     Full optimization (-O3)
Generate an optimized assembly listing (.s file) from the C++ program.
Analyze the performance impact of compiler optimization and discuss the potential for
manual optimization in MIPS.
Files Submitted:
File Name        Description
sort_driver.cpp  C++17 program implementing bubble sort and swap, with timing via std::chrono.
sort_O0          Executable compiled from sort_driver.cpp with no optimization (-O0).
sort_O3          Executable compiled from sort_driver.cpp with full optimization (-O3).
sort_O3.s        Assembly listing of the optimized C++ build, generated with clang++ -S -O3.
bubble_sort.asm  Handwritten MIPS program for bubble sort, runnable in MARS.
Methodology:
C++ Implementation:
o sort_driver.cpp contains a direct implementation of textbook bubble sort and
swap.
o Timing was performed using std::chrono::high_resolution_clock.
o Arrays of size 10, 100, 500, and 1000 were initialized in descending order and
sorted.
o Compilation was done twice:
Without optimization (-O0).
With full optimization (-O3).
o Assembly code was generated using clang++ -S -O3.
MIPS Implementation:
o bubble_sort.asm implemented the same algorithm manually in MIPS assembly.
o Timing was measured using syscall 30 (the MARS simulator's system-time call, which reports milliseconds).
o Output printed the first 20 elements after sorting to verify correctness.
o Arrays of size 10, 100, 500, and 1000 were sorted.
o 10,000 elements were not tested on MIPS due to impractical simulation time (>20
minutes).
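// Build the worst-case input: N, N-1, ..., 1 (descending order).
std::vector<int> v(N);
for (int i = 0; i < N; ++i) v[i] = N - i;
// Time only the call to sort, not the array setup.
auto t0 = std::chrono::high_resolution_clock::now();
sort(v.data(), N);
auto t1 = std::chrono::high_resolution_clock::now();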
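The timing excerpt above calls a sort function whose body is not shown in this report. A minimal sketch of how the driver could look follows, assuming it mirrors the textbook's sort and swap routines from Section 2.13. The helper time_sort_us and its parameters are hypothetical names introduced for illustration, and the choice of microseconds as the reported unit is an assumption.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Exchange v[k] and v[k+1], following the structure of the textbook swap.
void swap(int v[], int k) {
    int temp = v[k];
    v[k] = v[k + 1];
    v[k + 1] = temp;
}

// Textbook-style exchange sort: for each i, move v[i] left past every
// larger neighbor by repeated adjacent swaps.
void sort(int v[], int n) {
    for (int i = 0; i < n; i += 1) {
        for (int j = i - 1; j >= 0 && v[j] > v[j + 1]; j -= 1) {
            swap(v, j);
        }
    }
}

// Hypothetical helper (not from the report): fill an array with
// N, N-1, ..., 1, sort it, and return the elapsed time in microseconds.
// If out is non-null, the sorted array is copied there for checking.
long long time_sort_us(int N, std::vector<int>* out = nullptr) {
    assert(N >= 0);
    std::vector<int> v(N);
    for (int i = 0; i < N; ++i) v[i] = N - i;

    auto t0 = std::chrono::high_resolution_clock::now();
    sort(v.data(), N);
    auto t1 = std::chrono::high_resolution_clock::now();

    if (out != nullptr) *out = v;
    return std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
}
```

Calling time_sort_us(1000) once per array size would reproduce the measurement loop described above. For the small sizes (10 and 100 elements) a single run may be shorter than the clock's resolution, so averaging over repeated runs would probably give steadier numbers.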
###############################################################################
# bubble_sort_demo.s — Textbook bubble-sort + swap, ready for MARS
# Source: Computer Organization & Design, 5e (§2.13) – adapted for I/O + timing
# Author : Mohamed Hiba
###############################################################################
.data
.align 2
array: .space 40000 # 10 000 words × 4 bytes
.text
.globl main
############################################################
# main — build descending array, call sort, time it, show results
############################################################
main:
# ---- read N -----------------------------------------------------------
li $v0, 4
la $a0, prompt
syscall
li $v0, 5 # read_int
syscall
move $s0, $v0 # s0 ← N
li $t3, 0 # counter
la $t4, array # ptr
print_loop:
li $t5, 20
beq $t3, $t5, print_time
lw $a0, 0($t4)
li $v0, 1 # print_int
syscall
li $v0, 4
la $a0, space
syscall
addi $t3, $t3, 1
addi $t4, $t4, 4
j print_loop
li $v0, 4
la $a0, acc_newline
syscall
############################################################
# swap(int v[], int k) — textbook leaf procedure (Fig 2-25)
############################################################
swap:
sll $t1, $a1, 2 # k*4
add $t1, $a0, $t1 # &v[k]
lw $t0, 0($t1) # temp = v[k]
lw $t2, 4($t1) # t2 = v[k+1]
sw $t2, 0($t1) # v[k] = t2
sw $t0, 4($t1) # v[k+1] = temp
jr $ra
############################################################
# sort(int v[], int n) — textbook bubble sort (Fig 2-27)
############################################################
sort:
addi $sp, $sp, -20
sw $ra, 16($sp)
sw $s3, 12($sp)
sw $s2, 8($sp)
sw $s1, 4($sp)
sw $s0, 0($sp)
Analysis:
MIPS vs C++:
o MIPS shows classic O(n^2) growth: runtime increases rapidly with N.
o C++ -O0 was much faster than the simulated MIPS program, but its runtime was still
significant because the compiler applied no optimization.
o C++ -O3 provided drastic improvements, showing the power of
modern compiler optimizations.
The 10,000-element sort was skipped on MIPS because it would require impractical simulation
time.
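The quadratic growth can be checked independently of any timer. For a descending input, every pair of elements is out of order, so the sort performs N(N-1)/2 exchanges. The sketch below counts them directly. The function name count_swaps_descending is a hypothetical helper, not part of the submitted files, and the loop structure assumes the textbook's algorithm.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: run the textbook-style exchange sort on the input
// n, n-1, ..., 1 and return how many adjacent exchanges it performs.
long long count_swaps_descending(int n) {
    assert(n >= 0);
    std::vector<int> v(n);
    for (int i = 0; i < n; ++i) v[i] = n - i;

    long long swaps = 0;
    for (int i = 0; i < n; ++i) {
        for (int j = i - 1; j >= 0 && v[j] > v[j + 1]; --j) {
            int temp = v[j];      // swap body written inline so the
            v[j] = v[j + 1];      // exchange can be counted here
            v[j + 1] = temp;
            ++swaps;
        }
    }
    return swaps;
}
```

For the tested sizes this gives 45, 4,950, 124,750, and 499,500 exchanges. At N = 10,000 the count is 49,995,000, roughly 100 times the N = 1,000 case, which is consistent with the 10,000-element run being impractical in the simulator.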
Observations and Takeaways
Optimization Matters:
C++ -O3 provided up to 4× faster execution compared to -O0 without modifying source
code.
Compiler Optimization:
In the C++ assembly (sort_O3.s), functions were inlined, loops were optimized, and
unnecessary instructions were eliminated.
Manual MIPS Improvement (Future Work):
Although not required for this project, inlining the swap function (which removes a jal/jr
call and return from every iteration of the inner loop) or unrolling loops would likely
improve the MIPS timing results further.
Scalability:
Bubble sort is not practical for large arrays; O(n log n) algorithms such as quicksort (in its
average case) would perform significantly better.
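The last two observations can be sketched in C++. The first function shows what manual inlining of swap looks like at the source level, which is the same transformation suggested for the MIPS code. The second uses std::sort as a stand-in for a faster algorithm. Note that std::sort is not necessarily plain quicksort (common implementations use a hybrid), but the standard requires O(n log n) comparisons. All function names here are hypothetical and are not taken from sort_driver.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Exchange sort with the swap body written inline, so the inner loop
// contains no procedure call.
void sort_inlined(int v[], int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        for (int j = i - 1; j >= 0 && v[j] > v[j + 1]; --j) {
            int temp = v[j];
            v[j] = v[j + 1];
            v[j + 1] = temp;
        }
    }
}

// Hypothetical helper: build the descending input n, n-1, ..., 1.
std::vector<int> make_descending(int n) {
    assert(n >= 0);
    std::vector<int> v(n);
    for (int i = 0; i < n; ++i) v[i] = n - i;
    return v;
}

// Library sort for comparison; takes a copy and returns it sorted.
std::vector<int> sorted_with_std(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    return v;
}
```

Both routines produce the same sorted array, so either could be dropped into the same timing harness to compare against the original sort at each array size.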