Chapter 5

1. MEMORY TYPES
1. RAM (Random Access Memory)
 Volatile, i.e. retains data only as long as power is applied
 Temporarily stores data
 Used to store currently running programs

2. ROM (Read Only Memory)
 Non-volatile
 Permanent data storage
 Used for the computer BIOS (basic input/output system)
3. Hard Drive (HD)
 Introduced in 1956 with a capacity of 5 MB at a cost of $50,000 (today several TBs cost under $200)
 Mechanical drives that rely on a moving read/write head to magnetically read data
 Lifespan: theoretically unlimited read/write cycles

4. Solid State Drive (SSD)
 Introduced in 1976
 No mechanical parts (they can randomly access data)
 Costs more per GB than an HD (but typically worth it for high performance)
 Lifespan: ~10 years

Typical transfer rates: HD 0.1-1.7 MBps; SSD 50-250 MBps

ROM variants:
A. PROM: one-time field programmable.
B. EPROM: can be reprogrammed more than once by exposing it to UV rays for ~20 minutes.
C. EEPROM: electrically erasable programmable ROM, erased by applying a higher-than-normal voltage.

Note: Flash memory (USB drives/SD cards/SSDs) is a form of EEPROM that uses normal PC voltages for erasing and reprogramming.
SRAM vs DRAM:

Feature        SRAM (Static RAM)                       DRAM (Dynamic RAM)
Cell           Flip-flop: 6 transistors (less dense)   Capacitor & transistor (more dense)
Refreshing     Not required                            Required
Cost           More expensive                          Less expensive
Speed          Faster                                  Slower
Typical use    Cache (up to ~32 MB)                    Main memory (up to ~32 GB)
Virtual memory is a storage scheme that gives the user the illusion of having a very large main memory. This is done by treating a part of secondary memory as if it were main memory.

 More virtual RAM allows more/larger programs to run simultaneously
 Slower than having larger physical RAM, e.g. switching between applications takes more time
What happens when a large file is transferred between memories?
 The DMA (Direct Memory Access) Controller is a hardware device that allows I/O devices to access memory directly, with less participation of the processor.
 Burst Mode: the full data block is transmitted in one continuous sequence. Burst mode monopolizes the system bus, leaving the CPU inactive for a considerably long time.
 Cycle Stealing Mode: used in systems where the CPU cannot be disabled for the length of time required by burst mode. The DMA controller obtains access to the system bus using the BR (Bus Request) and BG (Bus Grant) signals; these two signals control the interface between the CPU and the DMA controller.
 Transparent Mode: takes the longest time to transfer a data block, but it is also the most efficient mode in terms of overall system performance. The DMA controller transfers data only when the CPU performs operations that do not use the system buses.
 The CPU never stops executing its programs
 The hardware must determine when the CPU is not using the system buses, which can be complicated.
2. MEMORY ADDRESSING
How are the various RAM chips addressed?
A memory chip's pins:
a) Address lines (AN…A0)
b) Data lines (OK…O0): typically 8 bits
c) Read/Write Enable
d) Chip Select (CS)
a) Address lines: ??
b) Data lines: ??
c) Read/Write Enable
d) Chip Select
Example: Assume a simple microprocessor with 10 address lines (1 KB memory) and we wish to implement all its memory space using 128x8 memory chips.
a) How many memory chips are needed? 8
b) How many address lines are needed to access any word in the total memory? 10
c) How many address lines are common to all chips? 7
d) How many address lines must be decoded for chip select? 3
e) What is the size of the decoder? 3x8

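The chip-count arithmetic in the example above can be sketched with a small helper (a Python illustration; the function name and structure are my own, not from the slides):

```python
import math

def chip_org(total_bytes, chip_words):
    """Derive the addressing answers for building total_bytes of memory
    from byte-wide chips holding chip_words bytes each."""
    chips = total_bytes // chip_words              # a) number of chips
    total_lines = int(math.log2(total_bytes))      # b) address lines for the whole space
    common_lines = int(math.log2(chip_words))      # c) lines wired to every chip
    select_lines = total_lines - common_lines      # d) lines decoded for chip select
    return chips, total_lines, common_lines, select_lines

# 1 KB space from 128x8 chips -> 8 chips, 10 lines, 7 common, 3 decoded (3x8 decoder)
print(chip_org(1024, 128))   # (8, 10, 7, 3)
```

The same helper reproduces the later 4 KB / 256x8 case: `chip_org(4096, 256)` gives (16, 12, 8, 4), matching the 4x16 decoder.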
Example: A computer system uses RAM chips of 256x8. The computer system requires 4K bytes of RAM.
a) How many memory chips are needed? 16
b) How many address lines are needed to access any word in the total memory? 12
c) How many address lines are common to all chips? 8
d) How many address lines must be decoded for chip select? 4
e) What is the size of the decoder? 4x16

Example: A computer system has 20 address bits. It is required to design a 64Kx8 ROM memory using 8Kx8 ROM chips (IC No.: 2764) for addresses F0000H to FFFFFH. (Slides)
a) How many ROM chips are needed? 8
b) How many address lines are common to all ROM chips? 13
c) How many total address lines for the required ROM? 16
d) What is the size of the decoder? 3x8

Example: A 16-data-bit and 18-address-bit computer system requires 128 KB of RAM and 64 KB of ROM, using 32 KB RAM chips and 16 KB ROM chips. (Slides)
a) How many RAM chips are needed? 4
b) How many address lines are common to all RAM chips? 15
c) How many ROM chips are needed? 4
d) How many address lines are common to all ROM chips? 14
e) What is the size of the decoder? 3x8
3. CACHE MEMORY
Memory hierarchy (speed and cost increase toward the top; the top levels sit inside or close to the CPU, the lower levels outside):

 Registers
 Cache (~MB)
 RAM (~GB)
 Hard Disk (~TB)
Why do we use a memory hierarchy in computers?

Because we want memory to be:
 Fast
 Large
 Cheap

So, for storage we use an SSD/HD, which is large and cheap, and for running programs we use the RAM, which is faster yet more expensive.
 Small amount of fast memory
 Sits between normal main memory and CPU
 May be located on CPU chip or module
 The principle of temporal locality says that if a
program accesses one memory address, there is a
good chance that it will access the same address
again (loops).

 The principle of spatial locality says that if a program accesses one memory address, there is a good chance that it will also access other nearby addresses (arrays).
 CPU requests contents of memory location
 Check cache for this data
 If present, get from cache (fast)
 If not present, read required block from main
memory to cache
 Then deliver from cache to CPU
 Cache includes tags to identify which block of
main memory is in each cache slot
 Addressing
 Size
 Mapping Function
 Replacement Algorithm
 Write Policy
 Block Size
 Number of Caches
 Cost
 More cache is expensive
 Speed
 More cache is faster (up to a point)
 Checking cache for data takes time
 Cache of 64 KByte
 Cache block of 4 bytes
 i.e. cache is 16K (2^14) lines of 4 bytes
 16 MBytes main memory
 24-bit address (2^24 = 16M)
 Each block of main memory maps to only one
cache line
 i.e. if a block is in cache, it must be in one specific
place
 Address is in two parts
 Least significant w bits identify a unique word
 Most significant s bits specify one memory block
 The MSBs are split into a cache line field r and a tag of s-r bits (most significant)

Address format: | Tag: s-r = 8 bits | Line/Slot: r = 14 bits | Word: w = 2 bits |

 24 bit address
 2 bit word identifier (4 byte block)
 22 bit block identifier
 8 bit tag (=22-14)
 14 bit slot or line
 No two blocks in the same line have the same Tag field
 Check contents of cache by finding line and checking Tag
Cache line   Main memory blocks held
0            0, m, 2m, 3m, …, 2^s - m
1            1, m+1, 2m+1, …, 2^s - m + 1
…
m-1          m-1, 2m-1, 3m-1, …, 2^s - 1
 Address length = (s + w) bits
 Number of addressable units = 2^(s+w) words or bytes
 Block size = line size = 2^w words or bytes
 Number of blocks in main memory = 2^(s+w) / 2^w = 2^s
 Number of lines in cache = m = 2^r
 Size of tag = (s - r) bits
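The direct-mapped tag/line/word split for the 24-bit example (w = 2 word bits, r = 14 line bits, tag = 8 bits) can be sketched as follows (a Python illustration with hypothetical names, not from the slides):

```python
def split_direct(addr, w=2, r=14):
    """Split a 24-bit address into (tag, line, word) for the direct-mapped example."""
    word = addr & ((1 << w) - 1)          # least significant w bits: word within block
    line = (addr >> w) & ((1 << r) - 1)   # next r bits: cache line index
    tag = addr >> (w + r)                 # remaining s - r = 8 bits: tag
    return tag, line, word

# The block at address 0xFFFFFF maps to line 0x3FFF (16383) with tag 0xFF (255):
print(split_direct(0xFFFFFF))   # (255, 16383, 3)
```

A hit occurs when the tag stored at cache line `line` equals `tag`; only then is word `word` of that line delivered to the CPU.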
 Simple
 Inexpensive
 Fixed location for given block
 If a program accesses 2 blocks that map to the same
line repeatedly, cache misses are very high.
 A main memory block can be loaded into any
line of cache
 Memory address is interpreted as tag and word
 Tag uniquely identifies block of memory
 Every line’s tag is examined for a match
 Cache searching gets expensive
Address format: | Tag: 22 bits | Word: 2 bits |

 22-bit tag stored with each 32-bit block of data
 Compare tag field with the tag entry in cache to check for a hit
 Least significant 2 bits of the address identify which 8-bit word is required from the 32-bit data block
 Address length = (s + w) bits
 Number of addressable units = 2^(s+w) words or bytes
 Block size = line size = 2^w words or bytes
 Number of blocks in main memory = 2^(s+w) / 2^w = 2^s
 Size of tag = s bits
 Cache is divided into a number of sets
 Each set contains a number of lines
 A given block maps to any line in a given set
 e.g. Block B can be in any line of set i
 e.g. 2 lines per set
 2 way associative mapping
 A given block can be in one of 2 lines in only one set
Address format: | Tag: 9 bits | Set: 13 bits | Word: 2 bits |

 Use the set field to determine which cache set to look in
 Compare the tag field to see if we have a hit
 No choice
 Each block only maps to one line
 Replace that line
 Least Recently Used (LRU)
 e.g. in 2-way set associative: which of the 2 blocks is LRU?
 First In First Out (FIFO)
 replace the block that has been in cache longest
 Least Frequently Used (LFU)
 replace the block which has had the fewest hits
 Random
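The LRU policy can be sketched for a single 2-way set (a Python illustration; the class and method names are my own, not a hardware description from the slides):

```python
from collections import OrderedDict

class TwoWaySet:
    """One cache set with 2 lines and LRU replacement (illustrative sketch)."""
    def __init__(self, ways=2):
        self.ways = ways
        self.lines = OrderedDict()       # tag -> block; insertion order = LRU order

    def access(self, tag):
        if tag in self.lines:            # hit: mark this line most recently used
            self.lines.move_to_end(tag)
            return True
        if len(self.lines) >= self.ways: # miss with a full set: evict the LRU line
            self.lines.popitem(last=False)
        self.lines[tag] = "block data"   # fetch block from main memory (stubbed)
        return False

s = TwoWaySet()
print([s.access(t) for t in [1, 2, 1, 3, 2]])   # [False, False, True, False, False]
```

In the access sequence above, the second access to tag 1 hits; tag 3 then evicts tag 2 (the LRU line), so the final access to tag 2 misses again.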
Write-through:
 All writes go to main memory as well as cache
 Multiple CPUs can monitor main memory traffic to keep the local (to CPU) cache up to date
 Drawbacks: lots of traffic; slows down writes
Write-back:
 Updates are initially made in cache only
 An update (dirty) bit for the cache slot is set when an update occurs
 If a block is to be replaced, write it to main memory only if its update bit is set
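The write-back bookkeeping can be sketched like this (a minimal Python illustration of the dirty-bit idea; names are hypothetical, not from the slides):

```python
class CacheLine:
    def __init__(self, tag, data):
        self.tag, self.data = tag, data
        self.dirty = False               # update bit: clear when the line is filled

def write(line, value):
    line.data = value
    line.dirty = True                    # update made in cache only

def evict(line, memory):
    if line.dirty:                       # write back only if the line was modified
        memory[line.tag] = line.data

memory = {}
line = CacheLine(tag=7, data=0)
write(line, 42)                          # main memory still untouched here
evict(line, memory)
print(memory)                            # {7: 42}
```

A line that was only read keeps its update bit clear, so evicting it costs no memory write; this is why write-back generates less traffic than write-through.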
 Increased block size will increase hit ratio at first
 the principle of locality
 Hit ratio will decrease as block becomes even bigger
 Reduce number of blocks that fit in cache
 Data overwritten shortly after being fetched
 Each additional word is less local so less likely to be needed
 No definitive optimum value has been found
 8 to 64 bytes seems reasonable
 For HPC systems, 64- and 128-byte most common
 High logic density enables caches on chip
 Faster than bus access
 Frees bus for other transfers
 Common to use both on and off chip cache
 L1 on chip, L2 off chip in static RAM
 L2 access much faster than DRAM or ROM
 L2 often uses separate data path
 L2 may now be on chip
 Resulting in L3 cache
▪ Bus access or now on chip…
 One cache for data and instructions or two, one
for data and one for instructions
 Advantages of unified cache
 Higher hit rate
▪ Balances load of instruction and data fetch
▪ Only one cache to design & implement
 Advantages of split cache
 Eliminates cache contention between instruction
fetch/decode unit and execution unit
▪ Important in pipelining
Thank you all for your attention 
