Coa Concept
INSTRUCTION SET ARCHITECTURE
Processors may include "complex" instructions in their instruction set. A single "complex"
instruction does something that may take many instructions on other computers. Such
instructions are typified by instructions that take multiple steps, control multiple functional
units, or otherwise appear on a larger scale than the bulk of simple instructions
implemented by the given processor. Some examples of "complex" instructions include:
- transferring multiple registers to or from memory (especially the stack) at once
- moving large blocks of memory (e.g. string copy or DMA transfer)
- complicated integer and floating-point arithmetic (e.g. square root, or
  transcendental functions such as logarithm, sine, cosine, etc.)
- SIMD instructions, in which a single instruction performs an operation on many
  homogeneous values in parallel, possibly in dedicated SIMD registers
- atomic test-and-set or other read-modify-write atomic instructions
- instructions that perform ALU operations with an operand from memory rather than a
  register
Complex instructions are more common in CISC instruction sets than in RISC instruction
sets, but RISC instruction sets may include them as well. RISC instruction sets generally do
not include ALU operations with memory operands, or instructions to move large blocks of
memory, but most RISC instruction sets include SIMD or vector instructions that perform the
same arithmetic operation on multiple pieces of data at the same time. SIMD instructions
can process large vectors and matrices in fewer steps than one-element-at-a-time code. SIMD
instructions allow easy parallelization of algorithms commonly involved in sound, image,
and video processing. Various SIMD implementations have been brought to market under
trade names such as MMX, 3DNow!, and AltiVec.
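The difference between scalar and SIMD execution can be illustrated with a minimal Python sketch. It only models instruction counts: the four-lane width (LANES) and the function names scalar_add and simd_add are hypothetical, and real SIMD hardware performs the lane operations in parallel rather than in a loop.

```python
LANES = 4  # assumed number of elements per SIMD register; real widths vary by ISA


def scalar_add(a, b):
    """Model one add instruction per element; return (result, instruction count)."""
    out = []
    count = 0
    for x, y in zip(a, b):
        out.append(x + y)
        count += 1
    return out, count


def simd_add(a, b):
    """Model one add instruction per group of LANES elements."""
    out = []
    count = 0
    for i in range(0, len(a), LANES):
        # A single modeled SIMD add handles up to LANES element pairs at once.
        out.extend(x + y for x, y in zip(a[i:i + LANES], b[i:i + LANES]))
        count += 1
    return out, count


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [10, 20, 30, 40, 50, 60, 70, 80]
scalar_result, scalar_count = scalar_add(a, b)
simd_result, simd_count = simd_add(a, b)
print(scalar_count, simd_count)
```

Both functions produce the same sums; under these assumptions the SIMD version issues one modeled instruction for every four elements.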
Instruction encoding
0-operand (zero-address machines), so-called stack machines: all arithmetic operations take place
using the top one or two positions on the stack: push a, push b, add, pop c.
    C = A+B needs four instructions. For stack machines, the terms "0-operand" and "zero-address"
    apply to arithmetic instructions, but not to all instructions, as 1-operand push and pop
    instructions are used to access memory.
1-operand (one-address machines), so-called accumulator machines, include early computers and many
small microcontrollers: most instructions specify a single right operand (that is, a constant, a
register, or a memory location), with the implicit accumulator as the left operand (and the
destination if there is one): load a, add b, store c.
    C = A+B needs three instructions.
2-operand: many CISC and RISC machines fall under this category:
    CISC: move a to c; then add b to c.
        C = A+B needs two instructions. This effectively 'stores' the result without an explicit
        store instruction.
    CISC: often machines are limited to one memory operand per instruction: load a,reg1; add
    b,reg1; store reg1,c. This requires a load/store pair for any memory movement, regardless of
    whether the result of the add is stored to a different place, as in C = A+B, or back to the
    same memory location, as in A = A+B.
        C = A+B needs three instructions.
    RISC: requiring explicit memory loads, the instructions would be: load a,reg1; load b,reg2;
    add reg1,reg2; store reg2,c.
        C = A+B needs four instructions.
3-operand, allowing better reuse of data:[4]
    CISC: it becomes either a single instruction: add a,b,c
        C = A+B needs one instruction.
    CISC: or, on machines limited to two memory operands per instruction: move a,reg1; add
    reg1,b,c.
        C = A+B needs two instructions.
    RISC: arithmetic instructions use registers only, so explicit 2-operand load/store
    instructions are needed: load a,reg1; load b,reg2; add reg1+reg2->reg3; store reg3,c.
        C = A+B needs four instructions.
        Unlike 2-operand or 1-operand, this leaves all three values a, b, and c in registers
        available for further reuse.
more operands: some CISC machines permit a variety of addressing modes that allow more than 3
operands (registers or memory accesses), such as the VAX "POLY" polynomial evaluation instruction.
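The instruction counts quoted for C = A+B can be checked with a small Python sketch of three of the machine styles. The interpreter functions (run_stack, run_accumulator, run_load_store) and the tuple-based program format are hypothetical and chosen only for illustration; they do not model any real instruction set.

```python
def run_stack(program, mem):
    """0-operand machine: arithmetic uses the top of the stack."""
    stack = []
    for op, *args in program:
        if op == "push":            # push addr: memory -> stack
            stack.append(mem[args[0]])
        elif op == "add":           # add: pops two values, pushes their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "pop":           # pop addr: stack -> memory
            mem[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown op {op!r}")
    return mem


def run_accumulator(program, mem):
    """1-operand machine: the accumulator is the implicit left operand."""
    acc = 0
    for op, addr in program:
        if op == "load":
            acc = mem[addr]
        elif op == "add":
            acc += mem[addr]
        elif op == "store":
            mem[addr] = acc
        else:
            raise ValueError(f"unknown op {op!r}")
    return mem


def run_load_store(program, mem):
    """3-operand RISC style: arithmetic works on registers only."""
    regs = {}
    for op, *args in program:
        if op == "load":            # load addr, reg
            addr, reg = args
            regs[reg] = mem[addr]
        elif op == "add":           # add src1, src2, dst
            src1, src2, dst = args
            regs[dst] = regs[src1] + regs[src2]
        elif op == "store":         # store reg, addr
            reg, addr = args
            mem[addr] = regs[reg]
        else:
            raise ValueError(f"unknown op {op!r}")
    return mem


stack_prog = [("push", "a"), ("push", "b"), ("add",), ("pop", "c")]
acc_prog = [("load", "a"), ("add", "b"), ("store", "c")]
risc_prog = [("load", "a", "r1"), ("load", "b", "r2"),
             ("add", "r1", "r2", "r3"), ("store", "r3", "c")]

stack_mem = run_stack(stack_prog, {"a": 2, "b": 3, "c": 0})
acc_mem = run_accumulator(acc_prog, {"a": 2, "b": 3, "c": 0})
risc_mem = run_load_store(risc_prog, {"a": 2, "b": 3, "c": 0})
print(len(stack_prog), len(acc_prog), len(risc_prog))
```

All three programs leave the same value in c; they differ only in how many instructions they need and in which operands are implicit.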
Due to the large number of bits needed to encode the three registers of a 3-operand
instruction, RISC architectures that have 16-bit instructions are invariably 2-operand
designs, such as the Atmel AVR, TI MSP430, and some versions of ARM Thumb. RISC
architectures that have 32-bit instructions are usually 3-operand designs, such as the
ARM, AVR32, MIPS, Power ISA, and SPARC architectures.
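The bit budget behind this trade-off can be worked through in a short Python sketch. The register counts and the field layout used by encode3 are illustrative assumptions, not the encoding of any of the architectures named above.

```python
def opcode_bits(word_bits, num_registers, num_operands):
    """Bits left for the opcode after encoding num_operands register fields."""
    reg_field = (num_registers - 1).bit_length()  # bits to name one register
    return word_bits - num_operands * reg_field


def encode3(opcode, rd, rs1, rs2, reg_field=5):
    """Pack a hypothetical 3-operand instruction: opcode | rd | rs1 | rs2."""
    return (opcode << (3 * reg_field)) | (rd << (2 * reg_field)) | (rs1 << reg_field) | rs2


# Assumed machine with 16 registers and 16-bit instructions.
bits_16_three = opcode_bits(16, 16, 3)   # three 4-bit fields leave 4 bits
bits_16_two = opcode_bits(16, 16, 2)     # two 4-bit fields leave 8 bits

# Assumed machine with 32 registers and 32-bit instructions.
bits_32_three = opcode_bits(32, 32, 3)   # three 5-bit fields leave 17 bits

word = encode3(opcode=1, rd=3, rs1=1, rs2=2)
print(bits_16_three, bits_16_two, bits_32_three, hex(word))
```

Under these assumptions, a 16-bit word with three register fields leaves room for very few opcodes, while a 32-bit word has bits to spare for opcodes, immediates, or addressing modes.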
Each instruction specifies some number of operands (registers, memory
locations, or immediate values) explicitly. Some instructions give one or both
operands implicitly, such as by being stored on top of the stack or in an implicit
register. If some of the operands are given implicitly, fewer operands need be
specified in the instruction. When a "destination operand" explicitly specifies
the destination, an additional operand must be supplied. Consequently, the
number of operands encoded in an instruction may differ from the
mathematically necessary number of arguments for a logical or arithmetic
operation (the arity). Operands are either encoded in the "opcode"
representation of the instruction, or else are given as values or addresses
following the opcode.
Instruction set implementation
Any given instruction set can be implemented in a variety of ways. All ways of
implementing a particular instruction set provide the same programming model,
and all implementations of that instruction set are able to run the same
executables. The various ways of implementing an instruction set give different
tradeoffs between cost, performance, power consumption, size, etc.
When designing the microarchitecture of a processor, engineers use blocks of
"hard-wired" electronic circuitry (often designed separately) such as adders,
multiplexers, counters, registers, ALUs, etc. Some kind of
register transfer language is then often used to describe the decoding and
sequencing of each instruction of an ISA using this physical microarchitecture.
There are two basic ways to build a control unit to implement this description
(although many designs use middle ways or compromises):
- Some computer designs "hardwire" the complete instruction set decoding and
  sequencing (just like the rest of the microarchitecture).
- Other designs employ microcode routines or tables (or both) to do this,
  typically as on-chip ROMs or PLAs or both (although separate RAMs and ROMs
  have been used historically). The Western Digital MCP-1600 is an older
  example, using a dedicated, separate ROM for microcode.
Some designs use a combination of hardwired design and microcode for the
control unit.
Some CPU designs use a writable control store: they compile the instruction
set to a writable RAM or flash inside the CPU (such as the Rekursiv processor
and the Imsys Cjip),[11] or an FPGA (reconfigurable computing).
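The microcoded approach can be sketched in Python as a table lookup. The control store contents (MICROCODE), the micro-operation names, and the accumulator datapath are all hypothetical, invented here for illustration; a real control unit drives hardware signals rather than updating a dictionary.

```python
# Hypothetical control store: each ISA opcode maps to a sequence of micro-operations.
MICROCODE = {
    "load":  ["addr_to_mar", "read_mem", "mdr_to_acc"],
    "add":   ["addr_to_mar", "read_mem", "alu_add"],
    "store": ["addr_to_mar", "acc_to_mdr", "write_mem"],
}


def execute(program, mem):
    """Run a program by stepping through the microprogram of each instruction."""
    state = {"acc": 0, "mar": None, "mdr": 0}
    micro_steps = 0
    for opcode, addr in program:
        for micro_op in MICROCODE[opcode]:
            if micro_op == "addr_to_mar":      # latch the operand address
                state["mar"] = addr
            elif micro_op == "read_mem":       # memory -> data register
                state["mdr"] = mem[state["mar"]]
            elif micro_op == "mdr_to_acc":     # data register -> accumulator
                state["acc"] = state["mdr"]
            elif micro_op == "alu_add":        # accumulator += data register
                state["acc"] += state["mdr"]
            elif micro_op == "acc_to_mdr":     # accumulator -> data register
                state["mdr"] = state["acc"]
            elif micro_op == "write_mem":      # data register -> memory
                mem[state["mar"]] = state["mdr"]
            else:
                raise ValueError(f"unknown micro-op {micro_op!r}")
            micro_steps += 1
    return mem, micro_steps


memory, steps = execute([("load", "a"), ("add", "b"), ("store", "c")],
                        {"a": 2, "b": 3, "c": 0})
print(memory["c"], steps)
```

In this model, a writable control store corresponds to editing the MICROCODE table at run time: the instruction set changes while the datapath code stays the same. A hardwired design would instead fix the per-opcode behavior directly in the decoding logic.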
An ISA can also be emulated in software by an interpreter. Naturally, due to
the interpretation overhead, this is slower than directly running programs on
the emulated hardware, unless the hardware running the emulator is an
order of magnitude faster. Today, it is common practice for vendors of new
ISAs or microarchitectures to make software emulators available to software
developers before the hardware implementation is ready.
Often the details of the implementation have a strong influence on the
particular instructions selected for the instruction set. For example, many
implementations of the instruction pipeline only allow a single memory load
or memory store per instruction, leading to a load–store architecture (RISC).
For another example, some early ways of implementing the
instruction pipeline led to a delay slot.
The demands of high-speed digital signal processing have pushed in the
opposite direction, forcing instructions to be implemented in a particular
way. For example, to compute digital filters fast enough, the MAC
(multiply–accumulate) instruction in a typical digital signal processor (DSP)
must use a kind of Harvard architecture that can fetch an instruction and two
data words simultaneously, and it requires a single-cycle multiply–accumulate
multiplier.
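To see why the MAC instruction dominates filter performance, consider a minimal Python sketch of a direct-form FIR filter. The function name fir_filter and the sample data are hypothetical; the sketch only counts multiply-accumulate steps and does not model cycle timing or memory buses.

```python
def fir_filter(samples, coeffs):
    """Return (outputs, mac_count); each inner step models one MAC instruction."""
    outputs = []
    mac_count = 0
    for n in range(len(samples)):
        acc = 0
        for k, h in enumerate(coeffs):
            if n - k >= 0:
                # One multiply-accumulate: acc <- acc + h * x[n - k].
                # Both data words (h and x[n - k]) are needed for this one step,
                # which is why a DSP fetches two operands per cycle.
                acc += h * samples[n - k]
                mac_count += 1
        outputs.append(acc)
    return outputs, mac_count


outputs, macs = fir_filter([1, 2, 3, 4], [1, 1])
print(outputs, macs)
```

Every output sample costs roughly one MAC per filter coefficient, so the filter's throughput is bounded by how quickly the processor can issue MAC steps and supply their two operands.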