
Advanced Computer Architecture

Dr. Saima Farhan

Fall 2018 Semester


Course Outline
Parallel processing:
Basic concepts, Types and levels of parallelism, Classification of parallel architectures, Basic
parallel techniques

ILP processors:
Evolution, Dependencies, Scheduling, Preservation, Speed-up

Pipelined processors:
Basic concepts, Design space of pipelines, Overview of pipelined instruction processing

VLIW processors:
Architectures, Basic Principles

Superscalar Processors:
Introduction, Parallel decoding, Superscalar instruction issue, Register renaming, Parallel
execution, Preserving sequential consistency of instruction execution

Processing of control transfer instructions:


Introduction, Basic approaches to branch handling, Delayed branching and branch processing,
Multiway branching

Parallel computing and Cache coherence:


Why Parallel Architecture, Convergence of Parallel Architectures, Fundamental Design issues

Course Outline (continued)

Parallel Programs:
The Parallelization Process, Parallelization of an Example program

Shared memory Multiprocessor:


Cache Coherence, Memory consistency, Design Space for Snooping Protocols,
Synchronization

System Interconnect Architectures:


Network properties and routing, Static connection networks and dynamic connection networks,
Multiprocessor system interconnect

Data Parallel architecture:


Introduction, Connectivity

SIMD architectures:
Fine-grained SIMD, Coarse-grained architectures, Multithreaded architectures: Computational
models, Data flow architectures

Recent architectural trends:


Multi-core system organization, Multi-core memory issues

Introduction to Parallel Processing

• Basic concepts
• Types and levels of parallelism
• Classification of parallel architecture
• Basic parallel techniques

CH03
Basic concepts

• The concept of program

 – ordered set of instructions (programmer’s view)
 – executable file (operating system’s view)
The concept of process

• OS view, process relates to execution


• Process creation
 – setting up the process description
 – allocating an address space
 – loading the program into the allocated address space, and
 – passing the process description to the scheduler
• Process states
 – ready to run
 – running
 – wait
Process spawning (independent processes)

[Figure: process spawning tree — a parent process spawns independent child processes B and C, which in turn spawn further processes D and E.]
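A minimal POSIX sketch (an illustration added alongside the slide, not taken from it) of spawning independent processes: fork() makes the operating system set up a new process description, give the child its own copy of the address space, and pass it to the scheduler.

  #include <stdio.h>
  #include <sys/types.h>
  #include <sys/wait.h>
  #include <unistd.h>

  int main(void) {
      for (int i = 0; i < 2; i++) {        /* spawn two independent children */
          pid_t pid = fork();              /* create a new process */
          if (pid == 0) {                  /* child branch */
              printf("child %d: pid %d\n", i, (int)getpid());
              return 0;                    /* each child terminates independently */
          }
      }
      while (wait(NULL) > 0)               /* parent reaps both children */
          ;
      return 0;
  }

Each child runs in its own address space, which is why the spawned processes are independent in the sense used above.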
The concept of thread

• smaller chunks of code (lightweight)


• threads are created within, and belong to, a process
• for parallel thread processing, scheduling is performed on a per-thread basis
• finer grain, less overhead when switching from thread to thread

Single-thread process or multi-thread process (threads are dependent)

Thread tree

[Figure: thread tree — a process at the root, with the threads it creates (and the threads they in turn create) as branches.]
Three basic methods for creating and terminating threads

1. unsynchronized creation and unsynchronized termination
   • calling library functions: CREATE_THREAD, START_THREAD
2. unsynchronized creation and synchronized termination
   • FORK and JOIN (a POSIX-style sketch follows this list)
3. synchronized creation and synchronized termination
   • COBEGIN and COEND (a second sketch follows the black-box view below)
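A minimal sketch of method 2, assuming POSIX threads as a concrete stand-in for the generic FORK/JOIN primitives named above: pthread_create plays the role of FORK (unsynchronized creation), pthread_join the role of JOIN (synchronized termination).

  #include <pthread.h>
  #include <stdio.h>

  static void *worker(void *arg) {
      int id = *(int *)arg;                    /* thread-private copy of its id */
      printf("thread %d running\n", id);
      return NULL;
  }

  int main(void) {
      pthread_t t[2];
      int id[2] = {1, 2};
      for (int i = 0; i < 2; i++)              /* FORK: threads start whenever ready */
          pthread_create(&t[i], NULL, worker, &id[i]);
      for (int i = 0; i < 2; i++)              /* JOIN: wait for every thread to end */
          pthread_join(t[i], NULL);
      return 0;
  }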
Processes and threads in languages

• Black box view (T: thread):

[Figure: (a) FORK/JOIN — thread T0 forks threads T1 and T2 and later joins each of them; (b) COBEGIN/COEND — threads T1 … Tn are started together at COBEGIN and the construct completes at COEND once all of them have terminated.]
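Method 3 can be sketched with OpenMP sections (an assumed analogue, not the slide's own notation): the parallel block creates all section threads together and only completes once every section has terminated, mirroring COBEGIN/COEND. Compile with -fopenmp.

  #include <omp.h>
  #include <stdio.h>

  int main(void) {
      #pragma omp parallel sections        /* COBEGIN: synchronized creation */
      {
          #pragma omp section
          printf("T1 on OpenMP thread %d\n", omp_get_thread_num());
          #pragma omp section
          printf("T2 on OpenMP thread %d\n", omp_get_thread_num());
      }                                    /* COEND: implicit barrier, synchronized termination */
      return 0;
  }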
The concepts of concurrent and parallel execution

[Figure: N-client 1-server model with concurrent execution — two client-server timing diagrams over time t, contrasting the sequential nature (requests serviced one after another) with the simultaneous nature (servicing of requests overlapped in time).]


Main aspects of the scheduling policy

Scheduling policy

• Pre-emption rule: whether servicing a client can be interrupted and, if so, on what occasions
• Selection rule: how clients will be selected for service from among the competing clients

Basic pre-emption schemes

• Non pre-emptive
• Pre-emptive
 – Time-shared
 – Prioritized (priority-based)

[Figure: client-server timing diagrams for the non pre-emptive, time-shared, and prioritized pre-emption schemes.]


The concepts of concurrent and parallel execution
N-client N-server model

• Synchronous (lock-step)
• Asynchronous

[Figure: client-server diagrams for the synchronous and asynchronous N-client N-server models.]


Concurrent and parallel programming languages

Classification of programming languages

Languages        1-client    N-client    1-client    N-client
                 1-server    1-server    N-server    N-server
                 model       model       model       model

Sequential          +           -           -           -
Concurrent          -           +           -           -
Data-parallel       -           -           +           -
Parallel            -           +           -           +
Types and levels of parallelism

• Available and utilized parallelism


 – available: in the program or in the problem solution
 – utilized: during execution

• Types of available parallelism


 – Functional parallelism
    arises from the logic of a problem solution
 – Data parallelism
    arises from data structures
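A small sketch of data parallelism (an added illustration, here using OpenMP in C): the same operation is applied to every element of a data structure, so the iterations can be distributed over the available processors.

  #define N 1000000

  static double a[N], b[N];

  int main(void) {                         /* compile with -fopenmp */
      #pragma omp parallel for             /* each thread takes a chunk of the iterations */
      for (int i = 0; i < N; i++)
          b[i] = 2.0 * a[i] + 1.0;         /* identical computation on different data */
      return 0;
  }

Functional parallelism, by contrast, would run different computations (for example, different procedures of the program) side by side.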
Levels of available functional parallelism

• Parallelism at the instruction level (fine-grained parallelism)


• Parallelism at the loop level (middle-grained parallelism)
• Parallelism at the procedure level (middle-grained parallelism)
• Parallelism at the program level (coarse-grained parallelism)
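These levels can be pointed out in ordinary code; the fragment below (an added illustration, not from the slides) marks where each kind of available parallelism appears.

  #include <stdio.h>

  static double f(double x) { return x * x; }   /* a pure helper procedure */

  static void example(double *a, double *b, int n) {
      double u = a[0] * 2.0;       /* instruction level: these two statements  */
      double v = b[0] + 1.0;       /* are independent and could overlap        */

      for (int i = 0; i < n; i++)  /* loop level: iterations are independent   */
          a[i] = a[i] + b[i];

      double s1 = f(u);            /* procedure level: the two calls could run */
      double s2 = f(v);            /* concurrently as separate threads         */
      b[0] = s1 + s2;
  }

  int main(void) {
      double a[4] = {1, 2, 3, 4}, b[4] = {5, 6, 7, 8};
      example(a, b, 4);
      printf("%f\n", b[0]);
      return 0;
  }

Program-level (coarse-grained) parallelism would correspond to running whole independent programs at the same time.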
Available and utilized levels of functional parallelism

Available levels          Utilized levels

User (program) level      User level           2
Procedure level           Process level        2
Loop level                Thread level         1
Instruction level         Instruction level    1

1: exploited by architectures
2: exploited by means of operating systems
Utilization of functional parallelism

• Available parallelism can be utilized by

 – architecture
    instruction-level parallel architectures (ILP architectures)
 – compilers
    parallel optimizing compiler
 – operating system
    multitasking
Concurrent execution models

• User level --- Multiprogramming, time sharing
• Process level --- Multitasking
• Thread level --- Multi-threading

(levels listed in order of decreasing granularity)
Utilization of data parallelism

• by using data-parallel architecture


Classification of parallel architectures

• Flynn’s classification

 – SISD (Single Instruction Single Data)
 – SIMD (Single Instruction Multiple Data)
 – MISD (Multiple Instruction Single Data)
 – MIMD (Multiple Instruction Multiple Data)
Proposed Classification
Parallel architectures (PAs)

• Data-parallel architectures (DPs)
 – Vector architectures
 – Associative and neural architectures
 – SIMDs
 – Systolic architectures

• Function-parallel architectures
 – Instruction-level PAs (ILPs)
    · Pipelined processors
    · VLIWs
    · Superscalar processors
 – Thread-level PAs
 – Process-level PAs (MIMDs)
    · Distributed memory MIMD (multi-computers)
    · Shared memory MIMD (multiprocessors)
Basic parallel techniques

• Pipelining (time)
 – a number of functional units are employed in sequence to perform a single computation
 – a number of steps for each computation

• Replication (space)
 – a number of functional units perform multiple computations simultaneously
 – more processors
 – more memory
 – more I/O
 – more computers
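A worked comparison (added for illustration; the numbers are assumptions, not from the slide): suppose n = 100 computations each take k = 4 steps of duration τ. Sequential execution needs n·k·τ = 400τ. A 4-stage pipeline finishes in (k + n − 1)·τ = 103τ, a speed-up approaching k. Replicating p = 4 complete units finishes in (n/p)·k·τ = 100τ, a speed-up approaching p. Pipelining thus exploits parallelism in time, replication in space.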
Pipelining and replication in parallel computer architecture

                             Pipelining    Replication
Vector processors                +
Systolic arrays                  +              +
SIMD (array) processors                         +
Associative processors                          +
Pipelined processors             +
VLIW processors                                 +
Superscalar processors           +              +
Multi-threaded machines          +              +
Multicomputers                   +              +
Multiprocessors                                 +
