2.1 Memory System (S.K)
Presented By
Prof. Dr. Sunil Karforma
Dept. of Computer Science, Burdwan University, West Bengal
Email: sunilkarforma@yahoo.com
Bit, Byte, Word
• Bits to bytes: 1 byte = 8 bits.
• Bytes to words: the relationship between bytes and words depends on the architecture:
• For a 16-bit architecture: 1 word = 2 bytes
• For a 32-bit architecture: 1 word = 4 bytes
• For a 64-bit architecture: 1 word = 8 bytes
• Summary: 1 byte = 8 bits; 1 word = n bytes, where n depends on the architecture (e.g., n = 2 for 16-bit, 4 for 32-bit, 8 for 64-bit).
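The relationships above can be sketched in a few lines of Python. The architectures listed are just the common examples from the summary; the actual word size of a machine is defined by its ISA.

```python
# Illustrative sketch of the bit/byte/word relationships described above.
BITS_PER_BYTE = 8

def word_size_bytes(architecture_bits):
    """Return the word size in bytes for an n-bit architecture."""
    return architecture_bits // BITS_PER_BYTE

for arch in (16, 32, 64):
    print(f"{arch}-bit architecture: 1 word = {word_size_bytes(arch)} bytes = {arch} bits")
```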
Introduction
⚫ Memory stores the programs and data required to perform the fundamental tasks of a computer system. For the CPU to operate at its maximum speed, it requires uninterrupted, high-speed access to these memories. Several criteria must be considered when deciding which memory to use: cost, speed, memory access time, data transfer rate, and reliability.
⚫ Generally, memory/storage is classified into two categories:
⚫ Volatile memory: loses its data when power is switched off.
⚫ Non-volatile memory: permanent storage that does not lose its data when power is switched off.
How Memories Attached to CPU
[Figure: the CPU connects to a cache memory and to main memory; an I/O processor links main memory to auxiliary memory (magnetic tapes and magnetic disks).]
Memory Hierarchy
⚫ An ideal memory would be fast, large, and inexpensive. In computer system design, the memory hierarchy is an enhancement that organizes memory so as to minimize access time. The design constraints on a computer's memory can be summed up by three questions: How much? How fast? How expensive? There is a trade-off among the three key characteristics of memory: capacity, access time, and cost. A variety of technologies are used to implement memory systems, and across this spectrum of technologies, the following relationships hold:
⚫ Faster access time, greater cost per bit
⚫ Greater capacity, smaller cost per bit
⚫ Greater capacity, slower access time
⚫ The designer would like to use memory technologies that provide large-capacity memory, both because the capacity is needed and because the cost per bit is low. However, to meet performance requirements, the designer must also use expensive, relatively lower-capacity memories with short access times. The solution does not rely on a single memory component or technology, but on employing a memory hierarchy. A typical hierarchy is illustrated in the next figure.
Memory Hierarchy Cont...
[Figure: the memory hierarchy, from processor registers at the top, through the primary cache (L1), secondary cache (L2), and main memory, down to secondary memory (magnetic disk). Moving down the hierarchy, size increases while speed and cost per bit decrease.]
Memory Hierarchy Cont...
We can infer the following characteristics of memory hierarchy design from the previous figure:
⚫ Capacity: the total volume of information the memory can store. As we move from top to bottom in the hierarchy, the capacity increases.
⚫ Access time: the time interval between a read/write request and the availability of the data. As we move from top to bottom in the hierarchy, the access time increases.
⚫ Cost per bit: as we move from bottom to top in the hierarchy, the cost per bit increases, i.e., internal memory is costlier than external memory.
⚫ Performance: in early computer systems designed without a memory hierarchy, the speed gap between the CPU registers and main memory grew because of the large difference in access time, lowering the performance of the system. An enhancement was therefore required, and it was made in the form of the memory hierarchy design, because of which the performance of the system increases.
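These characteristics can be made concrete with a small sketch. The capacities and access times below are rough, order-of-magnitude assumptions chosen for illustration, not measurements of any particular machine:

```python
# Illustrative (order-of-magnitude) figures for typical hierarchy levels,
# listed from top (fastest, smallest, costliest per bit) to bottom.
hierarchy = [
    # (level,         typical capacity, typical access time)
    ("Registers",     "~1 KB",          "<1 ns"),
    ("L1 cache",      "~64 KB",         "~1 ns"),
    ("L2 cache",      "~1 MB",          "~10 ns"),
    ("Main memory",   "~16 GB",         "~100 ns"),
    ("Magnetic disk", "~1 TB",          "~10 ms"),
]

for level, capacity, access in hierarchy:
    print(f"{level:<14} capacity {capacity:<8} access time {access}")
```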
CPU Memory Interaction
The connections between the CPU and memory are shown in the following figure. To illustrate how the three main components of the central processing unit interact, i.e., the Control Unit, the ALU, and memory, we will look at how a CPU accepts keyboard input, adds two numbers, and prints the result, in the next slide. The CPU and memory are connected through different types of buses, which are high-speed internal connections. There are three kinds of buses:
⚫ Address bus - carries memory addresses from the processor to other components
such as primary storage and input/output devices. The address bus is unidirectional.
⚫ Data bus - carries the data between the processor and other components. The data
bus is bidirectional.
⚫ Control bus - carries control signals from the processor to other components. The
control bus also carries the clock's pulses. The control bus is unidirectional.
CPU Memory Interaction Architecture
⚫ Bus Unit - Handles communication with devices external to the processor chip.
⚫ Code Prefetch Unit - Fetches instructions from memory before the processor actually requests them.
⚫ Instruction Decode Unit - Decodes the instruction prior to passing it to the
Execution Unit for execution.
⚫ Execution Unit - Executes the instruction.
⚫ Control Unit - Coordinates the steps necessary to complete each instruction, and
tells each part of the Execution unit what to do and when.
⚫ Protection Test Unit - Makes sure the operations performed by the Execution Unit
are legal.
⚫ Registers - Working memory for the execution unit.
⚫ Memory Management Unit (MMU) - Converts the virtual address of an instruction to a physical address.
CPU Memory Interaction (Example: Addition of Two Numbers)
⚫ Steps are as follows:
⚫ The Prefetch Unit asks the Bus Interface Unit to retrieve the instruction to add two numbers. At the same time, the segment and paging units convert the instruction's location from a virtual address to a physical address for the Bus Interface Unit.
⚫ The Bus Interface Unit gets the instruction from RAM and sends it to the prefetch unit.
⚫ The Prefetch Unit forwards the instruction to the Decode Unit which forwards it to the Execution Unit.
⚫ The Control Unit sends a virtual address of the first number to be added (stored in RAM) to the Protection Test
Unit.
⚫ The Protection Test Unit verifies that the control unit can access the first number and forwards it to the segment
and paging units, where the virtual address is translated into a physical address for the bus interface unit.
⚫ The Bus Interface unit retrieves the number stored at that address in RAM. This number then travels through the
Protection Test Unit to the Execution Unit and is stored in one of the chip's internal registers.
⚫ The Arithmetic Unit adds the second number retrieved from RAM and the first from the internal register.
⚫ The Control unit tells the Bus interface unit to store the number in RAM. The Memory Management Unit
translates the virtual address to a physical RAM address, completing the instruction.
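The sequence above can be condensed into a highly simplified software model. Everything here — the tuple instruction format, the fixed address offset standing in for the MMU, and the memory contents — is an illustrative assumption, not the actual micro-architecture described in the slides:

```python
# A toy model of the add sequence: fetch an instruction, translate virtual
# addresses to physical ones, load both operands, add them, and store the
# result back to RAM.

PAGE_OFFSET = 0x100  # made-up fixed offset standing in for the MMU's translation

def mmu_translate(virtual_address):
    """Stand-in for the segment and paging units: virtual -> physical."""
    return virtual_address + PAGE_OFFSET

ram = {mmu_translate(0x10): 7,    # first operand
       mmu_translate(0x14): 35}   # second operand

# "ADD [0x10], [0x14] -> [0x18]" encoded as a tuple for illustration
instruction = ("ADD", 0x10, 0x14, 0x18)

op, src1, src2, dst = instruction             # decode unit splits the instruction
register = ram[mmu_translate(src1)]           # first operand loaded into a register
result = register + ram[mmu_translate(src2)]  # arithmetic unit adds the second
ram[mmu_translate(dst)] = result              # bus unit stores the result in RAM

print(result)  # 7 + 35 = 42
```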
Cache Organization
⚫ Cache memory is organized to give memory speed approaching that of the fastest memories available. There is a relatively large and slow main memory together with a smaller, faster cache memory. When the processor attempts to read a word of memory, a check is made to determine whether the word is in the cache. If so, the word is delivered to the processor. If not, a block of main memory, consisting of some fixed number of words, is read into the cache and then the word is delivered to the processor. Because of the phenomenon of locality of reference, when a block is fetched into the cache to satisfy a single memory reference, it is likely that other words in the same block will be referenced soon.
[Figure: words transfer between the processor and the cache; blocks transfer between the cache and main memory.]
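The read procedure just described can be sketched as a minimal simulation. The block size, memory contents, and dictionary-based cache here are illustrative assumptions, not a model of any real cache hardware:

```python
# Minimal model of a cache read: on a miss, the whole block containing
# the requested word is copied from main memory into the cache.

BLOCK_SIZE = 4                    # words per block (assumed)
main_memory = list(range(64))     # word i holds value i, for illustration
cache = {}                        # block number -> list of words

def read_word(address):
    block = address // BLOCK_SIZE
    if block not in cache:                         # miss: block transfer
        start = block * BLOCK_SIZE
        cache[block] = main_memory[start:start + BLOCK_SIZE]
    return cache[block][address % BLOCK_SIZE]      # hit: word transfer

print(read_word(13))   # miss: loads block 3 (words 12-15), returns 13
print(read_word(14))   # hit: word 14 is already in the cached block
```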
Because there are fewer cache lines than main memory blocks, an algorithm is needed for mapping main memory blocks into cache lines, and for determining which main memory block currently occupies a cache line. Three techniques can be used: direct, associative, and set associative. We examine each of these in turn.
Direct Mapping
In direct mapping, a particular block of main memory can map only to a particular line of the cache. The cache line number to which a particular block can map is given by:
Cache line number = (Main Memory Block Address) modulo (Number of lines in cache)
Example: consider a cache memory divided into 'n' lines. Then block 'j' of main memory can map only to line number (j mod n) of the cache. In direct mapping, the physical address is divided as:

| Tag | Line Number | Block Offset |

[Figure: main memory blocks 0, 1, ..., j, j+1, j+2, ... mapping onto cache lines, with block j mapping to line (j mod n); each cache line stores a tag alongside its block.]
Direct Mapping Example

[Figure: direct-mapped address format with an r-bit line number and a w-bit block offset.]

Fully Associative Mapping
In fully associative mapping, a block of main memory can map to any line of the cache that is freely available at that moment. This makes fully associative mapping more flexible than direct mapping. Here, all the lines of the cache are freely available, so any block of main memory can map to any line of the cache. If all the cache lines are occupied, then one of the existing blocks has to be replaced.
In fully associative mapping, the physical address is divided as:

| Tag | Block Offset |

[Figure: any main memory block 0, 1, ..., j, j+1, j+2, ... can map to any cache line; each cache line stores a tag alongside its block.]
Associative Mapping Example
[Address format: s-bit tag, w-bit block offset.]

Example: the cache can hold 64 KBytes. Data are transferred between main memory and the cache in blocks of 4 bytes each. This means that the cache is organized as 16K = 2^14 lines of 4 bytes each. The main memory consists of 16 MBytes, with each byte directly addressable by a 24-bit address (2^24 = 16M). Thus, for mapping purposes, we can consider main memory to consist of 4M blocks of 4 bytes each.

For associative mapping:
⚫ Address length = (s + w) bits
⚫ Number of addressable units = 2^(s+w) words or bytes
⚫ Block size = line size = 2^w words or bytes
⚫ Number of blocks in main memory = 2^(s+w) / 2^w = 2^s
⚫ Number of lines in cache = undetermined
⚫ Size of tag = s bits

Solution:
Address length (s + w) = 24 bits
Block offset (w) = 2 bits
Size of tag = s = (s + w) − w = 24 − 2 = 22 bits
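The worked example can be verified with a short computation using the numbers given above:

```python
from math import log2

# Numbers from the worked example above.
cache_size  = 64 * 1024         # 64 KB cache
block_size  = 4                 # 4-byte blocks
memory_size = 16 * 1024 * 1024  # 16 MB main memory, byte-addressable

address_bits = int(log2(memory_size))      # s + w = 24
offset_bits  = int(log2(block_size))       # w = 2
tag_bits     = address_bits - offset_bits  # s = 22 (fully associative)
cache_lines  = cache_size // block_size    # 2^14 = 16K lines

print(address_bits, offset_bits, tag_bits, cache_lines)  # 24 2 22 16384
```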
Set Associative Mapping

In k-way set associative mapping, cache lines are grouped into sets, where each set contains k lines. A particular block of main memory can map to only one particular set of the cache. However, within that set, the memory block can map to any cache line that is freely available.
The set of the cache to which a particular block of the main memory can map is given by:
Cache set number = (Main Memory Block Address) modulo (Number of sets in cache)

In set associative mapping, the physical address is divided as:

| Tag (s − d bits) | Set Number (d bits) | Block Offset (w bits) |
Example: consider a 2-way set associative mapped cache of size 16 KB with a block size of 256 bytes. The size of main memory is 128 KB. Find the physical address division.
Given: set size (k) = 2, cache memory size = 16 KB, block size = frame size = line size = 256 bytes, and main memory size = 128 KB.
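A sketch of one way to work this example through, using the definitions above:

```python
from math import log2

# Given values from the example above.
k           = 2           # ways per set
cache_size  = 16 * 1024   # 16 KB
block_size  = 256         # bytes
memory_size = 128 * 1024  # 128 KB, byte-addressable

address_bits = int(log2(memory_size))         # 17-bit physical address
offset_bits  = int(log2(block_size))          # w = 8
n_sets       = cache_size // block_size // k  # 64 lines / 2 ways = 32 sets
set_bits     = int(log2(n_sets))              # d = 5
tag_bits     = address_bits - set_bits - offset_bits  # 17 - 5 - 8 = 4

print(f"tag = {tag_bits} bits, set = {set_bits} bits, offset = {offset_bits} bits")
```

So the 17-bit physical address divides into a 4-bit tag, a 5-bit set number, and an 8-bit block offset.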
Complex Instruction Set Computer (CISC): the main idea is that a single instruction performs all the loading, evaluating, and storing operations; for example, a multiplication command will load the data, evaluate it, and store the result, hence it is complex.
Characteristics of CISC:
• Complex instructions, hence complex instruction decoding
• Instructions are larger than one word in size
• An instruction may take more than a single clock cycle to execute
• Fewer general-purpose registers, as operations can be performed in memory itself
• Complex addressing modes
• More data types
Difference between RISC and CISC
⚫ RISC uses only a hardwired control unit; CISC uses both hardwired and microprogrammed control units.
⚫ RISC uses fixed-size instructions; CISC uses variable-size instructions.
⚫ RISC can perform only register-to-register arithmetic operations; CISC can perform register-to-register, register-to-memory, or memory-to-memory arithmetic operations.
⚫ RISC requires more registers, as multiple register sets are present; CISC requires fewer registers, as it has only a single register set.
⚫ In RISC, an instruction executes in a single clock cycle; in CISC, an instruction takes more than one clock cycle.
⚫ In RISC, the decoding of instructions is simple; in CISC, it is complex.
⚫ RISC focuses on software; CISC focuses on hardware.
Virtual Memory
In most modern computer systems, the physical main memory is not as large as the address space of the processor. If a program does not completely fit into the main memory, the parts of it not currently being executed are stored on a secondary storage device, typically a magnetic disk. As these parts are needed for execution, they must first be brought into the main memory, possibly replacing other parts that are already in the memory. These actions are performed automatically by the operating system, using a scheme known as virtual memory.
The binary addresses that the processor issues for either instructions or data are called
virtual or logical addresses. These addresses are translated into physical addresses by a
combination of hardware and software actions. If a virtual address refers to a part of the
program or data space that is currently in the physical memory, then the contents of the
appropriate location in the main memory are accessed immediately. Otherwise, the
contents of the referenced address must be brought into a suitable location in the
memory before they can be used.
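The translation just described can be sketched as a minimal page-table lookup. The page size, table contents, and page-fault handling here are illustrative assumptions; real MMUs use hardware structures such as multi-level page tables and TLBs:

```python
# Minimal sketch of virtual-to-physical address translation via a page table.
PAGE_SIZE = 4096                    # 4 KB pages (assumed)
page_table = {0: 5, 1: None, 2: 7}  # virtual page -> physical frame
                                    # (None = page currently on disk)

def translate(virtual_address):
    page, offset = divmod(virtual_address, PAGE_SIZE)
    frame = page_table.get(page)
    if frame is None:
        # Page fault: the OS would bring the page in from disk,
        # possibly evicting another page, then retry the access.
        raise RuntimeError(f"page fault at virtual address {virtual_address:#x}")
    return frame * PAGE_SIZE + offset

print(hex(translate(0x0123)))  # page 0 maps to frame 5 -> 0x5123
```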
Virtual Memory Organization

[Figure: virtual memory organization — the MMU translates virtual addresses issued by the processor into physical addresses in main memory.]

References
⚫ "Computer Organization and Design: The Hardware/Software Interface" by David A. Patterson and John L. Hennessy.
⚫ "Computer Organization and Embedded Systems" by Carl Hamacher et al.
⚫ http://www.virtualmv.com/virtualme/vme_mv/v2/v2com/v2kb/hw/hwcmin00.htm
⚫ http://www.sdcmsmzn.com/notes/mohit/MemoryOrganization.pdf
⚫ https://www.youtube.com/watch?v=leWKvuZVUE8&list=PL1A5A6AE8AFC187B7&ab_channel=nptelhrd
⚫ https://bob.cs.sonoma.edu/IntroCompOrg-RPi/sec-cpuinteract.html