FLYNN’S CLASSIFICATION
In 1966, Michael Flynn proposed a classification for computer architectures based on the
number of instruction streams and data streams (Flynn's taxonomy).
Flynn uses the stream concept to describe a machine's structure.
A stream simply means a sequence of items (data or instructions).
Flynn’s taxonomy:
The classification of computer architectures based on the number of instruction streams and data
streams.
SISD:
o A single instruction stream operates on a single data stream: the conventional uniprocessor.
SIMD:
o A single instruction stream is applied to many data streams in parallel, as in vector and array processors.
MISD:
o Multiple instruction streams operate on a single data stream; few practical machines of this type have been built.
MIMD:
o Each processor fetches its own instruction stream and operates on its own data stream: the conventional multiprocessor.
HARDWARE MULTITHREADING
A related concept to MIMD, especially from the programmer’s perspective, is hardware
multithreading.
While MIMD relies on multiple processes or threads to try to keep multiple processors
busy, hardware multithreading allows multiple threads to share the functional units of a
single processor in an overlapping fashion to try to utilize the hardware resources
efficiently.
A thread is a separate process with its own instructions and data. It may represent a
process that is part of a parallel program consisting of multiple processes, or it may
be an independent program on its own. In addition, the hardware must support the
ability to switch to a different thread relatively quickly.
A thread switch should be much more efficient than a process switch: a process switch
typically requires hundreds to thousands of processor cycles, whereas a thread switch can
be essentially instantaneous.
Two approaches to hardware multithreading:
o Fine-grained multithreading
o Coarse-grained multithreading
CSE/AJS/CS6303/UNIT-IV Page 5
CS6303 COMPUTER ARCHITECTURE
Fine-grained multithreading:
o It switches between threads on each instruction, resulting in interleaved execution
of multiple threads. This interleaving is often done in a round-robin fashion,
skipping any threads that are stalled at that clock cycle.
o To make fine-grained multithreading practical, the processor must be able to
switch threads on every clock cycle.
Coarse-grained multithreading:
o Coarse-grained multithreading was invented as an alternative to fine-grained
multithreading. It switches threads only on costly stalls, such as last-level
cache misses.
o This change relieves the need to have thread switching be extremely fast and is
much less likely to slow down the execution of an individual thread, since
instructions from other threads will only be issued when a thread encounters a
costly stall.
o Drawback: it is limited in its ability to overcome throughput losses from shorter
stalls. Since a switch empties or freezes the pipeline, the new thread must refill
the pipeline before instructions can complete; this start-up cost makes
coarse-grained multithreading useful mainly for high-cost stalls, where the refill
time is negligible compared to the stall time.
Simultaneous multithreading:
o Simultaneous multithreading (SMT) is a variation on hardware multithreading that
uses the resources of a multiple-issue, dynamically scheduled processor to exploit
thread-level parallelism at the same time it exploits instruction-level parallelism.
o The key insight that motivates SMT is that multiple-issue processors often have
more functional-unit parallelism available than most single threads can effectively
use.
In the figure comparing these approaches, the horizontal dimension represents the
instruction issue capability in each clock cycle, the vertical dimension represents a
sequence of clock cycles, and empty slots indicate that the corresponding issue slots are
unused in that clock cycle.
In the superscalar without hardware multithreading support, the use of issue slots is
limited by a lack of instruction-level parallelism. In addition, a major stall, such as an
instruction cache miss, can leave the entire processor idle.
In the coarse-grained multithreaded superscalar, the long stalls are partially hidden by
switching to another thread that uses the resources of the processor.
Although this reduces the number of completely idle clock cycles, the pipeline start-up
overhead still leads to idle cycles, and limitations in ILP mean that not all issue slots
will be used.
In the fine-grained case, the interleaving of threads mostly eliminates idle clock cycles.
Because only a single thread issues instructions in a given clock cycle, however,
limitations in instruction-level parallelism still lead to idle slots within some clock cycles.
In the SMT case, thread-level parallelism and instruction-level parallelism are both
exploited, with multiple threads using the issue slots in a single clock cycle.
Ideally, issue slot usage is limited only by imbalances in the resource needs and resource
availability across the multiple threads.
MULTICORE PROCESSOR
While hardware multithreading improved the efficiency of processors at modest cost, the
big challenge of the last decade has been to deliver on the performance potential of Moore’s
Law by efficiently programming the increasing number of processors per chip.
To simplify the task of rewriting old programs to run well on parallel hardware, the
solution was to provide a single physical address space that all processors can share, so that
programs need not concern themselves with where their data is, merely that programs may be
executed in parallel.
In this approach, all variables of a program can be made available at any time to any
processor. The alternative is to have a separate address space per processor, in which
case sharing must be explicit.
A shared memory multiprocessor (SMP) is one that offers the programmer a single
physical address space across all processors, which is nearly always the case for multicore
chips. It is also called a shared-address multiprocessor.
Single-address-space multiprocessors come in two styles: uniform memory access (UMA)
multiprocessors, in which the latency to a word in memory does not depend on which
processor asks for it, and nonuniform memory access (NUMA) multiprocessors, in which
some memory accesses are much faster than others depending on which processor asks for
which word. The programming challenges are harder for a NUMA multiprocessor than for a
UMA multiprocessor, but NUMA machines can scale to larger sizes and can have lower
latency to nearby memory.
As processors operating in parallel will normally share data, they also need to coordinate
when operating on shared data; otherwise, one processor could start working on data
before another is finished with it. This coordination is called synchronization.
When sharing is supported with a single address space, there must be a separate
mechanism for synchronization.
One approach uses a lock for a shared variable. Only one processor at a time can acquire
the lock, and other processors interested in shared data must wait until the original
processor unlocks the variable.
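As a minimal sketch of this idea (not from the original notes), a lock can be built from an
atomic test-and-set flag in C11; the names lock_acquire and lock_release are illustrative:

    #include <stdatomic.h>

    atomic_flag lock = ATOMIC_FLAG_INIT;    /* clear means unlocked */

    void lock_acquire(void) {
        /* Atomically set the flag and test its previous value: only the
           processor that saw it clear leaves the loop holding the lock. */
        while (atomic_flag_test_and_set(&lock))
            ;  /* other processors spin here until the variable is unlocked */
    }

    void lock_release(void) {
        atomic_flag_clear(&lock);           /* let one waiting processor acquire it */
    }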
Example
A Simple Parallel Processing Program for a Shared Address Space
The first step is to ensure a balanced load per processor, so we split the set of
numbers into subsets of the same size. We do not allocate the subsets to a
different memory space, since there is a single memory space for this machine;
we just give different starting addresses to each processor.
Pn is the number that identifies the processor, between 0 and 63.
All processors start the program by running a loop that sums their subset of
numbers:
sum[Pn] = 0;                            /* each processor clears its own partial sum */
for (i = 1000*Pn; i < 1000*(Pn+1); i += 1)
    sum[Pn] += A[i];                    /* sum the assigned area of A */
The next step is to add these 64 partial sums. This step is called a reduction,
where we divide to conquer.
Half of the processors add pairs of partial sums, and then a quarter add pairs of the
new partial sums, and so on until we have the single, final sum.
The figure illustrates the hierarchical nature of this reduction.
Reduction is a function that processes a data structure and returns a single value.
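Following this pattern, a sketch of the reduction loop each processor could run (synch()
stands for an assumed barrier primitive that makes every processor wait until all partial
sums of the previous round have been written):

    half = 64;                  /* 64 processors in the multiprocessor */
    do {
        synch();                /* wait for this round's partial sums to complete */
        if (half % 2 != 0 && Pn == 0)
            sum[0] += sum[half-1];  /* when half is odd, processor 0 picks up
                                       the element that would otherwise be lost */
        half = half / 2;        /* dividing line on who sums */
        if (Pn < half)
            sum[Pn] += sum[Pn + half];
    } while (half > 1);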
o A many-core processor is one in which the number of cores is large enough that
traditional multiprocessor techniques are no longer efficient, largely because of
congestion in supplying sufficient instructions and data to the many processors.
o The key architectural property is the uniform access time to all of the memory
from all of the processors.
o In a multichip version the shared cache would be omitted and the bus or
interconnection network connecting the processors to memory would run between
chips as opposed to within a single chip.
o Each processor shares the entire memory, although the access time to the local
memory attached to the core's chip will be much faster than the access time to
remote memories.
o Distributing the memory among the nodes has two major benefits:
It is a cost-effective way to scale the memory bandwidth if most of the
accesses are to the local memory in the node.
It reduces the latency for accesses to the local memory.