2 - Cpe410l2
Figure: Data path (ALU computing A + B)
• CISC: a computer architecture with long, complex instructions that are
general purpose and powerful but slower than RISC instructions; used for
general-purpose designs (microprogrammed/microcoded design)
Parallelism: instruction level and processor level
• Instruction level: parallelism is exploited within individual instructions to get
more instructions/sec out of the machine. This technique employs pipelining and
superscalar architectures.
A five-stage pipeline
The state of each stage as a function of time; nine clock cycles are illustrated.
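The timing diagram described in the captions above can be reproduced with a small sketch. It assumes an ideal pipeline (one new instruction enters stage 1 every clock cycle and nothing stalls) and a cycle time of 2 nsec chosen only for illustration. The function names are hypothetical, not part of the course material.

```python
from typing import List, Optional


def pipeline_schedule(n_stages: int, n_cycles: int) -> List[List[Optional[int]]]:
    """Return table[s][c]: the instruction number held by stage s+1 in cycle c+1.

    Assumption: ideal pipeline, one instruction issued per cycle, no stalls.
    None marks a stage that has not received an instruction yet.
    """
    table = []
    for stage in range(n_stages):
        row = []
        for cycle in range(n_cycles):
            # Instruction k is in stage s during cycle k + s - 1 (all 1-based).
            instr = cycle - stage + 1
            row.append(instr if instr >= 1 else None)
        table.append(row)
    return table


def format_schedule(table: List[List[Optional[int]]]) -> str:
    """Render the table as text, one row per stage, one column per clock cycle."""
    lines = []
    for stage, row in enumerate(table, start=1):
        cells = " ".join(f"{i:>2}" if i is not None else " ." for i in row)
        lines.append(f"S{stage}: {cells}")
    return "\n".join(lines)


def latency_ns(n_stages: int, cycle_time_ns: float) -> float:
    """Time for one instruction to pass through every stage of the ideal pipeline."""
    return n_stages * cycle_time_ns


def bandwidth_mips(cycle_time_ns: float) -> float:
    """One instruction completes per cycle: (10^9 / T) / 10^6 = 1000 / T MIPS."""
    return 1000.0 / cycle_time_ns


if __name__ == "__main__":
    T = 2.0  # assumed cycle time in nsec
    print(format_schedule(pipeline_schedule(n_stages=5, n_cycles=9)))
    print(f"latency   = {latency_ns(5, T)} nsec")
    print(f"bandwidth = {bandwidth_mips(T)} MIPS")
```

In this model the fifth stage first holds instruction 1 in cycle 5, so each instruction takes five cycles from start to finish even though one instruction completes every cycle once the pipeline is full.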
• Pipelining allows a trade-off between latency (how long it takes to
execute an instruction) and processor bandwidth (how many MIPS
the CPU has).
• Let the clock cycle time be T nsec. Since one instruction completes every
clock cycle and there are 10^9/T clock cycles/second, the number of
instructions executed per second is 10^9/T. For example, if T = 2 nsec,
500 million instructions are executed each second. To get the number of
MIPS, we divide the instruction execution rate by 1 million:
(10^9/T)/10^6 = 1000/T MIPS.
• Superscalar architecture uses multiple pipelines so that several
instructions are executed in parallel in each processing stage.
• Array computers use an array of processors to perform the same sequence of
instructions on different sets of data.
• Vector processor: a vector processor uses the concept of a vector register,
which consists of a set of conventional registers that can be loaded from
memory in a single instruction, which actually loads them from memory
serially. A vector addition instruction then performs the pairwise addition of
the elements of two such vectors by feeding them to a pipelined adder from
the two vector registers. The result from the adder is another vector, which
can either be stored into a vector register or used directly as an operand for
another vector operation.
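The vector-register behaviour in the last bullet can be sketched in software. This is only a model of the data flow, not of the hardware: the register length of 8, the flat list standing in for memory, and the function names are all assumptions made for illustration.

```python
from typing import List

VECTOR_LENGTH = 8  # assumed number of conventional registers per vector register


def vector_load(memory: List[int], base: int, length: int = VECTOR_LENGTH) -> List[int]:
    """Model a single vector-load instruction.

    It is one instruction to the programmer, but the elements are still
    fetched from memory one after another (serially).
    """
    register = []
    for offset in range(length):
        register.append(memory[base + offset])
    return register


def vector_add(va: List[int], vb: List[int]) -> List[int]:
    """Model a vector-add instruction: pairwise sums of two vector registers.

    In hardware the element pairs would stream through a pipelined adder;
    here a plain loop over the pairs stands in for that.
    """
    if len(va) != len(vb):
        raise ValueError("vector registers must have the same length")
    return [a + b for a, b in zip(va, vb)]


if __name__ == "__main__":
    memory = list(range(16))        # toy memory holding 0..15
    v1 = vector_load(memory, 0)     # elements 0..7
    v2 = vector_load(memory, 8)     # elements 8..15
    v3 = vector_add(v1, v2)         # result stored in another vector register
    v4 = vector_add(v3, v1)         # result reused as an operand of a second vector operation
    print(v3)
    print(v4)
```

The last two calls mirror the two options named in the bullet: the sum is first kept in a register (`v3`) and then fed directly into another vector operation.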
Processor-level parallelism: multiple CPUs work together on the same problem to obtain higher gains in throughput
than instruction-level parallelism can provide.