
CSE211 Computer Architecture
Modules 14 to 21
Multi-threading
Multithreading allows multiple threads to share a processor, enhancing
parallelism and resource utilization. It can be categorized into
fine-grain multithreading, which switches between threads every cycle,
and coarse-grain multithreading, which switches only on long stalls
such as cache misses.
Simultaneous multithreading (SMT) enables issuing instructions from
different threads into various functional units at the same time,
maximizing the use of processor resources.
SMT is a hardware technique that allows multiple threads to share the
execution resources of a single processor core. This is achieved by
interleaving the instruction execution of different threads.
By allowing multiple threads to share the execution resources, SMT can
increase the utilization of the processor and improve overall
performance.
While increasing the number of threads in SMT can enhance parallelism,
it is crucial to balance the number of threads with the architecture's
ability to manage resources effectively.
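Below is a minimal sketch, assuming a POSIX environment, of the kind of software threads whose instructions an SMT core can interleave in hardware; the thread count and workload are illustrative choices, not part of the lecture material.

```c
/* Minimal pthreads sketch: four software threads doing independent
 * work. On an SMT core, the hardware can issue instructions from
 * several runnable threads like these to keep functional units busy. */
#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 4

static void *worker(void *arg) {
    long id = (long)arg;
    long sum = 0;
    for (long i = 0; i < 1000000; i++)   /* independent compute work */
        sum += i * id;
    printf("thread %ld done (sum=%ld)\n", id, sum);
    return NULL;
}

int main(void) {
    pthread_t threads[NUM_THREADS];
    for (long i = 0; i < NUM_THREADS; i++)
        pthread_create(&threads[i], NULL, worker, (void *)i);
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
    return 0;
}
```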
Parallelism vs Synchronization
Parallel programming allows multiple programs or threads to run
simultaneously, which is essential for improving performance in
modern computer architectures.
Synchronization is crucial for coordinating communication
between concurrent processes, ensuring that shared resources
are accessed safely.
The producer-consumer model illustrates how one entity
produces data while another consumes it, highlighting the need
for effective communication and resource management.
Mutual exclusion ensures that only one processor accesses a
shared resource at a time, preventing conflicts and ensuring data
integrity. To implement it, we use strategies such as:
Exclusive Access
Lock Mechanisms
Avoiding Race Conditions
Synchronization
Producer-consumer problem
In a producer-consumer scenario, a producer generates
values while consumers read and process those values.
When there are two consumers, issues can arise if they
access shared data simultaneously.
Sequential consistency ensures that operations appear to
occur in a specific order, preventing reordering of reads
and writes, which is beneficial for maintaining data
integrity.
Producer:
Generates a data item.
Adds the item to the buffer.
If the buffer is full, the producer may be blocked until space becomes available.

Consumer:
Checks if the buffer is empty.
If the buffer is not empty, removes an item from the buffer and processes it.
If the buffer is empty, the consumer may be blocked until a new item is added.
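A minimal sketch of this bounded-buffer pattern using a POSIX mutex and condition variables; the buffer size and int item type are illustrative assumptions, not part of the problem statement.

```c
/* Bounded-buffer producer-consumer: the producer blocks when the
 * buffer is full, the consumer blocks when it is empty. */
#include <pthread.h>

#define BUF_SIZE 8

static int buffer[BUF_SIZE];
static int count = 0, in = 0, out = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t not_full  = PTHREAD_COND_INITIALIZER;
static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;

void produce(int item) {
    pthread_mutex_lock(&lock);
    while (count == BUF_SIZE)            /* buffer full: producer blocks */
        pthread_cond_wait(&not_full, &lock);
    buffer[in] = item;
    in = (in + 1) % BUF_SIZE;
    count++;
    pthread_cond_signal(&not_empty);     /* wake a waiting consumer */
    pthread_mutex_unlock(&lock);
}

int consume(void) {
    pthread_mutex_lock(&lock);
    while (count == 0)                   /* buffer empty: consumer blocks */
        pthread_cond_wait(&not_empty, &lock);
    int item = buffer[out];
    out = (out + 1) % BUF_SIZE;
    count--;
    pthread_cond_signal(&not_full);      /* wake a waiting producer */
    pthread_mutex_unlock(&lock);
    return item;
}
```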
Mutual exclusion
Understanding Mutual Exclusion
Mutual exclusion is essential for preventing multiple processes from
accessing shared resources simultaneously, which can lead to
inconsistencies.
Atomic operations are crucial for implementing mutual exclusion,
allowing operations to be completed without interruption from other
processes.
Atomic Operations and Their Implementation
The test-and-set operation is a fundamental atomic operation that
reads a memory location and sets it in a single indivisible step,
ensuring that no other operation interferes during this process.
More advanced atomic operations, such as compare-and-swap, enhance
functionality by allowing conditional updates based on the current value
in memory.
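A sketch of both primitives using C11 atomics; the spinlock and counter names are illustrative.

```c
/* Test-and-set spinlock and compare-and-swap update with C11 atomics. */
#include <stdatomic.h>

static atomic_flag lock_flag = ATOMIC_FLAG_INIT;

void spin_lock(void) {
    /* atomic_flag_test_and_set reads and sets the flag in one
     * indivisible step; spin until the previous value was clear. */
    while (atomic_flag_test_and_set(&lock_flag))
        ;  /* busy-wait */
}

void spin_unlock(void) {
    atomic_flag_clear(&lock_flag);
}

/* Compare-and-swap: update `counter` only if it still holds the
 * value we last read, retrying otherwise. */
void atomic_increment(atomic_int *counter) {
    int old = atomic_load(counter);
    while (!atomic_compare_exchange_weak(counter, &old, old + 1))
        ;  /* on failure, `old` is refreshed with the current value */
}
```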
Sequential consistency
Sequential Consistency ensures that the execution sequence of
instructions from all processors appears as a valid interleaving of
their individual instruction orders.
It is a strong model that guarantees that all processors see the
same order of operations, which is not typically implemented in
modern computers due to performance constraints.
Examples of Valid and Invalid Orders
Valid sequentially consistent orders can include various
interleavings, such as executing instructions from different
processors in a way that respects their individual order.
An invalid order occurs when the relative order of operations from
a single processor is violated, leading to inconsistencies in the
observed results.
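The classic store-buffering litmus test makes this concrete. The sketch below uses plain C as pseudocode for two processors' instruction streams; it is an illustration of the model, not portable synchronization code.

```c
/* Store-buffering litmus test. Under sequential consistency the
 * outcome r1 == 0 && r2 == 0 is impossible, because any valid
 * interleaving that respects each thread's program order must run
 * at least one store before both loads. Weaker hardware models can
 * produce it. */
int x = 0, y = 0;
int r1, r2;

void thread_1(void) {   /* runs on processor 1 */
    x = 1;
    r1 = y;
}

void thread_2(void) {   /* runs on processor 2 */
    y = 1;
    r2 = x;
}
```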
Issues in Sequential Consistency
Performance Overhead
Hardware Complexity
Programming Complexity
Practical Limitations
Distributed Systems
True sequential consistency is challenging to achieve,
especially with caches, as data visibility between
processors becomes a concern.
Race conditions and sequential consistency
Definition of Race Conditions: A race condition occurs when two or more threads or
processes access shared data and try to change it at the same time. The final outcome
depends on the timing of their execution, which can lead to unpredictable results.
Role of Sequential Consistency: Sequential consistency provides a model that ensures
all memory operations appear to occur in a specific order. This means that if a program
adheres to sequential consistency, the operations from different threads will be
interleaved in a way that respects the order of operations from each individual thread.
Prevention of Race Conditions: By enforcing a sequentially consistent memory model,
the likelihood of race conditions is reduced. Since all threads see the same order of
operations, it becomes easier to reason about the state of shared data and avoid
conflicts.
Simplified Reasoning: With sequential consistency, programmers can assume that
operations will execute in a predictable manner, making it easier to identify potential
race conditions and implement appropriate synchronization mechanisms.
Weak Models and Race Conditions: In contrast, weaker memory models may allow for
out-of-order execution and different visibility of operations, increasing the risk of race
conditions. Programmers must be more cautious and implement additional
synchronization to ensure correctness.
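A minimal race-condition sketch, assuming a POSIX environment; the counter and iteration counts are illustrative.

```c
/* Two threads increment a shared counter with no synchronization.
 * `counter++` is a load, an add, and a store; interleavings can lose
 * updates, so the final value is often less than 2000000. */
#include <pthread.h>
#include <stdio.h>

static long counter = 0;   /* shared, unprotected */

static void *work(void *arg) {
    (void)arg;
    for (int i = 0; i < 1000000; i++)
        counter++;         /* not atomic: read-modify-write */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, work, NULL);
    pthread_create(&t2, NULL, work, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld (expected 2000000)\n", counter);
    return 0;
}
```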
Locks
Locks (mutexes) allow mutual exclusion,
ensuring that only one process can execute a
critical section of code at any given time.
Mutual Exclusion: Locks ensure that only one
thread can access a critical section of code at
a time. This prevents race conditions where
multiple threads might try to read or write
shared data simultaneously.
Synchronization: By locking a resource, a
thread can safely perform operations without
interference from other threads, ensuring
data integrity.
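A sketch of lock-based mutual exclusion with a POSIX mutex; the shared counter is an illustrative stand-in for any critical section.

```c
/* Only one thread at a time can hold `counter_lock`, so the
 * increment in the critical section cannot be interleaved. */
#include <pthread.h>

static long shared_counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

void increment(void) {
    pthread_mutex_lock(&counter_lock);   /* enter critical section */
    shared_counter++;                    /* safe: no other thread here */
    pthread_mutex_unlock(&counter_lock); /* leave critical section */
}
```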
Semaphores provide a more flexible approach, allowing a specified
number of processes to enter a critical section concurrently, which
is useful in scenarios with multiple resources.

Semaphores
Controlled Access: Semaphores allow a
specified number of threads to access a
resource concurrently.
Flexibility: Unlike locks, which only allow
one thread at a time, semaphores can be
configured to permit a certain number of
threads (N) to enter a critical section.
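A sketch using a POSIX counting semaphore; the value of N is illustrative.

```c
/* A counting semaphore initialized to N lets up to N threads hold
 * the resource at once; further threads block in acquire(). */
#include <semaphore.h>

#define N 4
static sem_t resource_sem;

void init(void)    { sem_init(&resource_sem, 0, N); }
void acquire(void) { sem_wait(&resource_sem); }  /* blocks when N in use */
void release(void) { sem_post(&resource_sem); }
```

The third argument to sem_init is the initial count N, which is exactly what distinguishes a counting semaphore from a binary lock.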
Memory fences and models
Memory Fences and Their Importance
Memory fences (or barriers) are introduced to ensure that
certain memory operations are completed before others
begin, helping to maintain order and consistency.
Different types of memory fences exist, such as load fences and
store fences, which provide varying levels of control over the
ordering of memory operations.
Weak Memory Models
Most modern processors implement weaker memory models
rather than strict sequential consistency, allowing for
performance optimizations through reordering.
Examples of memory ordering models include total store
ordering, partial store ordering, and weak ordering, each with
specific rules about how loads and stores can be reordered.
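A sketch of a release/acquire handoff with explicit C11 fences on a weakly ordered machine; the flag-and-data pattern is an illustrative assumption.

```c
/* Fences order the data store before the flag store, and the flag
 * load before the data load, so the subscriber never reads stale data. */
#include <stdatomic.h>

static int data;
static atomic_int ready = 0;

void publisher(void) {
    data = 42;
    /* Release fence: the store to `data` completes before the
     * following store to `ready` becomes visible. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

void subscriber(void) {
    while (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
        ;  /* spin until published */
    /* Acquire fence: later loads cannot be reordered before the
     * load of `ready`, so `data` is guaranteed to be 42. */
    atomic_thread_fence(memory_order_acquire);
    int value = data;
    (void)value;
}
```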
Memory Bus
The memory bus is a type of computer bus, usually in the form of a
set of wires or conductors, which connects electrical components and
allows transfers of data and addresses from main memory to the
central processing unit (CPU) or a memory controller.
Bus-based multiprocessor
A bus-based multiprocessor system is a type of
parallel computing architecture where
multiple processors share a common bus to
communicate with each other and access
shared memory.
Key Components of a Bus-Based
Multiprocessor:
Processors: Multiple processors, each with
its own registers and local cache.
Shared Memory: A common memory area accessible to all processors.
Bus: A communication channel that connects the processors and the
shared memory.
Cache Memory: High-speed memory that stores frequently accessed data
for each processor.
Message passing
Shared Memory Architecture
In shared memory systems, one core can write data to a
memory address, and another core can read from that
address without needing to know which core will read it in
the future.
This model allows for implicit communication, but it often
requires locking mechanisms to ensure data consistency
between writes and reads.
Explicit Message Passing
Explicit message passing requires a sender to specify a
destination when sending data, using an API that includes
send and receive functions.
The receive function can be designed to accept data from any
source or a specific source, allowing for more controlled
communication.
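A sketch of explicit message passing in MPI; the ranks and tag are illustrative. As the section notes, the receiver can also pass MPI_ANY_SOURCE to accept a message from any sender.

```c
/* Rank 0 sends one int to rank 1: the sender names an explicit
 * destination, and the receiver names a specific source. */
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int value;
    if (rank == 0) {
        value = 42;
        /* Send to destination rank 1, tag 0. */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Receive from specific source rank 0, tag 0. */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}
```

Under an MPI implementation this would be built and launched with the usual wrappers, e.g. mpicc followed by mpirun with two processes.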
Memory in Multiprocessor Systems
Multi-core bus systems
By placing two or more processor cores on the same device, the system
can use shared components, such as common internal buses and processor
caches, more efficiently.
Shared memory vs message passing
Shared Memory
Communication Method: Implicit communication through loads and stores to
shared memory addresses.
Destination Knowledge: The sender does not need to know which core or process
will read the data.
Synchronization: Requires explicit synchronization mechanisms (like locks and flags)
to prevent race conditions.
Memory Access: Memory is shared among all cores or processes, allowing for easy
access to shared data structures.
Explicit Message Passing
Communication Method: Explicit communication using send and receive functions.
Destination Knowledge: The sender must specify the destination when sending
data.
Synchronization: Synchronization is built into the messaging model, as sending and
receiving messages inherently creates a producer-consumer relationship.
Memory Access: Memory is typically private to each process or core, meaning data
must be sent explicitly between them.
