Unit III
Memory is organized as a collection of cells, each of which can be identified
by a unique number called its address.
Each cell recognizes control signals such as “read” and “write”,
generated by the CPU when it wants to read from or write to an address.
Whenever the CPU executes a program, instructions must be transferred
from memory to the CPU, because the program resides in memory.
To access an instruction, the CPU generates a memory request.
Memory Request: Memory request contains the address along with the control
signals.
For example, when data is pushed onto a stack, each block consumes memory
(RAM), and the number of memory cells is determined by the capacity of the
memory chip.
Given the number of cells, the number of address lines required to enable one cell
can be determined.
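As an illustrative sketch (not part of the original notes), the number of address lines needed to select one of N cells is the ceiling of the base-2 logarithm of N:

```python
import math

def address_lines(num_cells: int) -> int:
    """Number of address lines needed to uniquely select one of num_cells cells."""
    return math.ceil(math.log2(num_cells))

# A 1K (1024-cell) chip needs 10 address lines; a 128-cell chip needs 7.
print(address_lines(1024))  # -> 10
print(address_lines(128))   # -> 7
```

A 128-cell chip therefore needs 7 address lines, which matches the 128 * 8 RAM chip discussed later in this unit.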
Word Size: It is the maximum number of bits that a CPU can process at a time
and it depends upon the processor.
Word size is a fixed size piece of data handled as a unit by the instruction set or
the hardware of a processor.
Word size varies across processor architectures with generation and
current technology; it can be as low as 4 bits or as high as 64 bits, depending on
what a particular processor can handle.
Word size is used for a number of concepts like Addresses, Registers, Fixed-
point numbers, Floating-point numbers.
Memory Hierarchy
The hierarchical arrangement of storage in current computer architectures is called the memory
hierarchy.
It is designed to take advantage of memory locality in computer programs.
Each level of the hierarchy is of higher speed and lower latency, and is of smaller size, than
lower levels.
The following diagram shows the memory hierarchy in a modern computer system.
Typically, a memory unit can be classified into two categories:
The memory unit that establishes direct communication with the CPU is
called Main Memory.
The main memory is often referred to as RAM (Random Access Memory).
The memory units that provide backup storage are called Auxiliary Memory.
For instance, magnetic disks and magnetic tapes are the most commonly used
auxiliary memories.
Apart from the basic classifications of a memory unit, the memory hierarchy
consists of all the storage devices available in a computer system, ranging from the
slow but high-capacity auxiliary memory to the relatively faster main memory.
The following image illustrates the components in a typical memory
hierarchy.
Auxiliary Memory
Auxiliary memory is known as the lowest-cost, highest-capacity and slowest-
access storage in a computer system.
Auxiliary memory provides storage for programs and data that are kept for long-
term storage or when not in immediate use.
The most common examples of auxiliary memories are magnetic tapes and
magnetic disks.
A magnetic disk is a digital computer memory that uses a magnetization process to
write, rewrite and access data.
For example, hard drives, zip disks, and floppy disks.
Magnetic tape is a storage medium that allows for data archiving, collection, and
backup for different kinds of data.
Main Memory
The main memory in a computer system is often referred to as Random Access
Memory (RAM).
This memory unit communicates directly with the CPU and with auxiliary
memory devices through an I/O processor.
The programs that are not currently required in the main memory are transferred
into auxiliary memory to provide space for currently used programs and data.
I/O Processor
The primary function of an I/O Processor is to manage the data transfers between
auxiliary memories and the main memory.
Cache Memory
The data or contents of the main memory that are used frequently by CPU are
stored in the cache memory so that the processor can easily access that data in a
shorter time.
Whenever the CPU needs to access memory, it first checks for the required data
in the cache memory.
If the data is found in the cache memory, it is read from the fast memory.
Otherwise, the CPU moves onto the main memory for the required data.
The cache memory is used for storing segments of programs currently being
executed in the CPU.
The I/O processor manages data transfer between auxiliary memory and main
memory.
The auxiliary memory has a large storage capacity and is relatively inexpensive, but
has a low access speed compared to main memory.
The cache memory is very small, relatively expensive, and has very high access
speed.
The CPU has direct access to both cache and main memory but not to auxiliary
memory.
Multiprogramming
Multiprogramming is a computer architecture technique where multiple programs
are concurrently executed by a computer's operating system.
The primary goal of multiprogramming is to maximize CPU utilization and overall
system efficiency.
In a multiprogramming system, several programs are loaded into the computer's
memory simultaneously, and the CPU is switched between them based on certain
scheduling algorithms.
Multiprogramming allows for better utilization of computer resources and
improved system throughput.
It also helps in keeping the CPU busy by allowing it to work on another program
when one program is waiting for external events such as user input or data from
storage.
Main Memory
The main memory acts as the central storage unit in a computer system.
It is a relatively large and fast memory which is used to store programs and data
during the run time operations.
The primary technology used for the main memory is based on semiconductor
integrated circuits.
The integrated circuits for the main memory are classified into two major units: RAM chips and ROM chips.
RAM integrated circuit
The following points describe a typical 128 * 8 RAM chip:
•The 8-bit bidirectional data bus allows the transfer of data either from memory to CPU during a read operation or from CPU to
memory during a write operation.
•The read and write inputs specify the memory operation, and the two chip select (CS) control inputs are for enabling the chip
only when the microprocessor selects it.
•The output generated by three-state buffers can be placed in one of the three possible states which include a signal equivalent to
logic 1, a signal equal to logic 0, or a high-impedance state.
The following function table specifies the operations of a 128 * 8 RAM chip.
From the functional table, we can conclude that the unit is in operation only when CS1 = 1 and CS2 = 0.
The bar on top of the second select variable indicates that this input is enabled when it is equal to 0.
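The chip-select behaviour in the function table can be sketched in a few lines of Python. This is a toy model of my own (names and structure are assumptions, not from the original material): the chip responds only when CS1 = 1 and CS2 = 0, and the high-impedance state is modelled as None.

```python
class RamChip:
    """Toy model of a 128 x 8 RAM chip with two chip-select inputs.

    The chip is enabled only when CS1 = 1 and CS2 = 0 (CS2 is active-low,
    matching the bar over the second select input). When not selected,
    the data bus is in a high-impedance state, modelled here as None."""

    def __init__(self):
        self.cells = [0] * 128          # 128 words of 8 bits each

    def access(self, cs1, cs2, rd, wr, address, data_in=0):
        if not (cs1 == 1 and cs2 == 0): # chip not selected: high impedance
            return None
        if wr:                          # write: store the low 8 bits
            self.cells[address] = data_in & 0xFF
            return None
        if rd:                          # read: drive the data onto the bus
            return self.cells[address]
        return None

chip = RamChip()
chip.access(1, 0, 0, 1, 5, 0xAB)        # write 0xAB to address 5
print(chip.access(1, 0, 1, 0, 5))       # read back -> 171 (0xAB)
print(chip.access(0, 0, 1, 0, 5))       # chip not selected -> None
```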
ROM integrated circuit
The primary component of the main memory is RAM integrated circuit chips, but a
portion of memory may be constructed with ROM chips.
A ROM memory is used for keeping programs and data that are permanently
resident in the computer.
Apart from the permanent storage of data, the ROM portion of main memory is
needed for storing an initial program called a bootstrap loader.
The primary function of the bootstrap loader program is to start the computer
software operating when power is turned on.
ROM chips are also available in a variety of sizes and are also used as per the
system requirement.
The following block diagram demonstrates the chip interconnection in a 512 * 8 ROM chip.
•A ROM chip has a similar organization as a RAM chip. However, a ROM can only perform read
operation; the data bus can only operate in an output mode.
•The 9-bit address lines in the ROM chip specify any one of the 512 bytes stored in it.
•The value for chip select 1 and chip select 2 must be 1 and 0 for the unit to operate. Otherwise,
the data bus is said to be in a high-impedance state.
Memory Address Map
The interconnection between memory and processor is then established from
knowledge of the size of memory needed and the type of RAM and ROM chips
available.
The addressing of memory can be established by means of a table that specifies the
memory address assigned to each chip.
The table called Memory address map, is a pictorial representation of assigned
address space for each chip in the system.
The component column specifies whether a RAM or a ROM chip is used. The hexadecimal address column assigns a range of
hexadecimal equivalent addresses for each chip.
The address bus lines are listed in the third column. Although there are 16 lines in the address bus, the table shows only 10 lines
because the other 6 are not used in this example and are assumed to be zero.
The small x's under the address bus lines designate those lines that must be connected to the address inputs in each chip. The RAM
chips have 128 bytes and need seven address lines.
The ROM chip has 512 bytes and needs 9 address lines. The x's are always assigned to the low-order bus lines: lines 1 through 7
for the RAM and lines 1 through 9 for the ROM.
It is now necessary to distinguish between four RAM chips by assigning to each a different address. For this particular example we
choose bus lines 8 and 9 to represent four distinct binary combinations.
Note that any other pair of unused bus lines can be chosen for this purpose. The table clearly shows that the nine low-order bus
lines constitute a memory space for RAM equal to 2^9 = 512 bytes.
The distinction between a RAM and ROM address is done with another bus line. Here we choose line 10 for this purpose. When
line 10 is 0, the CPU selects a RAM, and when this line is equal to 1, it selects the ROM.
The equivalent hexadecimal address for each chip is obtained from the information under the address bus assignment. The address
bus lines are subdivided into groups of four bits each so that each group can be represented with a hexadecimal digit.
The first hexadecimal digit represents lines 13 to 16 and is always 0. The next hexadecimal digit represents lines 9 to 12, but lines
11 and 12 are always 0.
The range of hexadecimal addresses for each component is determined from the x's associated with it. These x's represent a binary
number that can range from an all-0's to an all-1's value.
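The hexadecimal ranges in the address map can be reproduced with a short Python sketch (my own illustration, not from the notes). Bit positions follow this example: lines 1-7 index the byte inside a RAM chip, lines 8-9 select one of the four RAM chips, and line 10 distinguishes RAM from ROM.

```python
# RAM chip i (0..3): line 10 = 0, lines 8-9 select the chip, lines 1-7 the byte.
def ram_range(chip):                    # chip = 0, 1, 2, 3
    base = chip << 7                    # lines 8-9 shifted above the 7 index bits
    return base, base + 127             # 128 bytes per RAM chip

def rom_range():
    base = 1 << 9                       # line 10 = 1 selects the ROM
    return base, base + 511             # 512 bytes in the ROM

for i in range(4):
    lo, hi = ram_range(i)
    print(f"RAM {i + 1}: {lo:04X}-{hi:04X}")
lo, hi = rom_range()
print(f"ROM:   {lo:04X}-{hi:04X}")
```

This yields 0000-007F through 0180-01FF for the four RAM chips and 0200-03FF for the ROM, matching the memory address map described above.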
Memory Connection to CPU
RAM and ROM chips are connected to a CPU through the data and address buses.
The low-order lines in the address bus select the byte within the chips and other lines in the address bus select a
particular chip through its chip select inputs.
The connection of memory chips to the CPU is shown in Fig. 4. This configuration gives a memory capacity of 512
bytes of RAM and 512 bytes of ROM. It implements the memory map of Table 1. Each RAM receives the seven low-
order bits of the address bus to select one of 128 possible bytes.
The particular RAM chip selected is determined from lines 8 and 9 in the address bus. This is done through a 2 x 4
decoder whose outputs go to the CS1 inputs in each RAM chip.
Thus, when address lines 8 and 9 are equal to 00, the first RAM chip is selected. When 01, the second RAM chip is
selected, and so on. The RD and WR outputs from the microprocessor are applied to the inputs of each RAM chip.
The selection between RAM and ROM is achieved through bus line 10. The RAMs are selected when the bit in this
line is 0, and the ROM when the bit is 1.
The other chip select input in the ROM is connected to the RD control line for the ROM chip to be enabled only
during a read operation. Address bus lines 1 to 9 are applied to the input address of ROM without going through the
decoder.
This assigns addresses 0 to 511 to RAM and 512 to 1023 to ROM. The data bus of the ROM has only an output
capability, whereas the data bus connected to the RAMs can transfer information in both directions .
The example just shown gives an indication of the interconnection complexity that can exist between memory chips
and the CPU.
The more chips that are connected, the more external decoders are required for selection among the chips . The
designer must establish a memory map that assigns addresses to the various chips from which the required
connections are determined.
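The decoding logic just described (line 10 choosing RAM vs. ROM, lines 8-9 feeding the 2 x 4 decoder) can be sketched as a small routing function. The function name and return format are my own assumptions for illustration:

```python
def decode(address):
    """Route a 16-bit CPU address to a chip, per the example memory map.

    Bit 9 (bus line 10) chooses RAM vs ROM; for RAM, bits 7-8 (lines 8-9)
    play the role of the 2 x 4 decoder; the low bits select the byte."""
    if (address >> 9) & 1:                     # line 10 = 1 -> ROM
        return "ROM", address & 0x1FF          # lines 1-9 index inside the ROM
    chip = (address >> 7) & 0b11               # lines 8-9 -> decoder output
    return f"RAM{chip + 1}", address & 0x7F    # lines 1-7 index inside the RAM

print(decode(0x0000))   # ('RAM1', 0)
print(decode(0x0085))   # ('RAM2', 5)
print(decode(0x0200))   # ('ROM', 0)
```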
Auxiliary Memory
• It is where programs and data are kept for long-term storage or when not in
immediate use.
• The most common examples of auxiliary memories are magnetic tapes and
magnetic disks.
Magnetic Disks
A magnetic disk is a circular plate constructed of metal or plastic coated with magnetized material. Often both sides
of the disk are used and several disks may be stacked on one spindle with read/write heads available on each surface.
All disks rotate together at high speed and are not stopped or started for access purposes. Bits are stored in the
magnetized surface in spots along concentric circles called tracks.
The tracks are commonly divided into sections called sectors. In most systems, the minimum quantity of
information which can be transferred is a sector. The subdivision of one disk surface into tracks and sectors is shown
in Fig. 5.
Some units use a single read/write head for each disk surface. In this type of unit, the track address bits are used by a
mechanical assembly to move the head into the specified track position before reading or writing.
In other disk systems, separate read/write heads are provided for each track in each surface. The address bits can
then select a particular track electronically through a decoder circuit.
This type of unit is more expensive and is found only in very large computer systems. Permanent timing tracks are
used in disks to synchronize the bits and recognize the sectors.
A disk system is addressed by address bits that specify the disk number, the disk surface, the sector number and the
track within the sector. After the read/write heads are positioned in the specified track, the system has to wait until the
rotating disk reaches the specified sector under the read/write head.
A track in a given sector near the circumference is longer than a track near the center of
the disk. If bits are recorded with equal density, some tracks will contain more recorded
bits than others.
To make all the records in a sector of equal length, some disks use a variable recording
density with higher density on tracks near the center than on tracks near the circumference.
This equalizes the number of bits on all tracks of a given sector.
Disks that are permanently attached to the unit assembly and cannot be removed by the
occasional user are called hard disks. A disk drive with removable disks is called a floppy
disk. The disks used with a floppy disk drive are small removable disks made of plastic
coated with magnetic recording material.
There are two sizes commonly used, with diameters of 5.25 and 3.5 inches. The 3.5-inch
disks are smaller and can store more data than the 5.25-inch disks.
Floppy disks are extensively used in personal computers as a medium for distributing
software.
Magnetic Tape
A magnetic tape transport consists of the electrical, mechanical, and electronic components to
provide the parts and control mechanism for a magnetic-tape unit.
The tape itself is a strip of plastic coated with a magnetic recording medium. Bits are recorded as
magnetic spots on the tape along several tracks.
Usually, seven or nine bits are recorded simultaneously to form a character together with a parity bit.
Read/write heads are mounted one in each track so that data can be recorded and read as a sequence of
characters.
Magnetic tape units can be stopped, started to move forward or in reverse, or can be rewound.
However, they cannot be started or stopped fast enough between individual characters.
For this reason, information is recorded in blocks referred to as records. Gaps of unrecorded tape are
inserted between records where the tape can be stopped.
The tape starts moving while in a gap and attains its constant speed by the time it reaches the next
record. Each record on tape has an identification bit pattern at the beginning and end.
By reading the bit pattern at the beginning, the tape control identifies the record number.
By reading the bit pattern at the end of the record, the control recognizes the beginning of a gap.
A tape unit is addressed by specifying the record number and the number of characters in the record.
Records may be of fixed or variable length.
Associative Memory
• An associative memory can be considered as a memory unit whose stored data can
be identified for access by the content of the data itself rather than by an address
or memory location.
• When a word is written into an associative memory, no address is given; the
memory itself finds an empty, unused location in which to store the word.
• On the other hand, when a word is to be read from an associative memory, the
content of the word, or part of the word, is specified. The words which match the
specified content are located by the memory and are marked for reading.
Hardware Organization
The block diagram of an associative memory is shown in Fig. 6. It consists of a memory array and
logic for m words with n bits per word. The argument register A and key register K each have n bits, one
for each bit of a word.
The match register M has m bits, one for each memory word. Each word in memory is compared in
parallel with the content of the argument register. The words that match the bits of the argument register
set a corresponding bit in the match register.
After the matching process, those bits in the match register that have been set indicate the fact that
their corresponding words have been matched.
Reading is accomplished by a sequential access to memory for those words whose corresponding bits
in the match register have been set.
The key register provides a mask for choosing a particular field or key in the argument word.
The entire argument is compared with each memory word if the key register contains all 1's.
Otherwise, only those bits in the argument that have 1's in their corresponding position of the key
register are compared.
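The masked comparison can be sketched in one line of Python per word. This is a minimal illustration of my own: the argument register A is compared with every memory word in parallel, but only at bit positions where the key register K holds a 1.

```python
def match_words(memory, argument, key):
    """Build the match register M for an associative search.

    memory: list of m words; argument: register A; key: register K (mask).
    A word matches when it agrees with A at every position where K has a 1;
    XOR exposes differing bits, and masking with K ignores masked positions."""
    return [1 if (word ^ argument) & key == 0 else 0 for word in memory]

memory = [0b101111000, 0b100110100, 0b101000001]
A = 0b101111100
K = 0b111000000                       # compare only the three leftmost bits
print(match_words(memory, A, K))      # -> [1, 0, 1]
```

Words 1 and 3 match because their three leftmost bits equal those of the argument; word 2 differs in a masked-in position, so its match bit stays 0.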
The relation between the memory array and external registers in an associative memory
is shown in Fig. 7.
The cells present inside the memory array are marked by the letter C with two subscripts.
The first subscript gives the word number and the second specifies the bit position in the
word. For instance, the cell Cij is the cell for bit j in word i.
A bit Aj in the argument register is compared with all the bits in column j of the array
provided that Kj = 1. This process is done for all columns j = 1, 2, 3......, n.
If a match occurs between all the unmasked bits of the argument and the bits in word i, the
corresponding bit Mi in the match register is set to 1.
If one or more unmasked bits of the argument and the word do not match, Mi is cleared to 0.
Cache Memory
The data or contents of the main memory that are used frequently by CPU are
stored in the cache memory so that the processor can easily access that data in a
shorter time.
Whenever the CPU needs to access memory, it first checks the cache memory.
If the data is not found in cache memory, then the CPU moves into the main
memory.
Cache memory is placed between the CPU and the main memory.
The block diagram for a cache memory can be represented as:
The main memory can store 32K words of 12 bits each. The cache is capable of storing
512 of these words at any given time. For every word stored in cache, there is a duplicate copy
in main memory. The CPU communicates with both memories. It first sends a 15-bit
address to the cache. If there is a hit, the CPU accepts the 12-bit data from the cache. If there
is a miss, the CPU reads the word from main memory and the word is then transferred
to the cache.
• When a read request is received from the CPU, the contents of a block of memory words
containing the specified location are transferred into the cache.
• When the program references any of the locations in this block, the contents are read
from the cache.
• The number of blocks in the cache is smaller than the number of blocks in main memory.
• The correspondence between main memory blocks and those in the cache is specified by
a mapping function.
• Assume the cache is full and a memory word not in the cache is referenced.
• Control hardware decides which block is to be removed from the cache to create space
for the new block containing the referenced word from memory.
Read/ Write operations on cache
• Cache Hit Operation
• CPU issues Read/Write requests using addresses that refer to locations in main
memory
• Cache control circuitry determines whether requested word currently exists in
cache
• If it does, the Read/Write operation is performed on the appropriate location in cache
(Read/Write Hit)
Read/Write operations on cache in case of Hit
• In Read operation main memory is not involved.
• In Write operation two things can happen.
1. Cache and main memory locations are updated simultaneously (“Write Through”) OR
2. Update only the cache location and mark it with a “Dirty or Modified Bit”, and update the
main memory location at the time of cache block removal (“Write Back” or “Copy Back”)
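The two write-hit policies can be contrasted with a small, hypothetical sketch (class and method names are my own, not from the notes): write-through updates main memory on every write, while write-back only marks the line dirty and defers the memory update until the block is removed.

```python
class Cache:
    """Minimal sketch contrasting write-through and write-back on a write hit."""

    def __init__(self, main_memory, policy="write-through"):
        self.main = main_memory
        self.policy = policy
        self.lines = {}                 # address -> (value, dirty bit)

    def write(self, addr, value):       # assumes a write hit for simplicity
        if self.policy == "write-through":
            self.lines[addr] = (value, False)
            self.main[addr] = value     # main memory updated at the same time
        else:                           # write-back: mark dirty, defer update
            self.lines[addr] = (value, True)

    def evict(self, addr):              # main memory updated only on removal
        value, dirty = self.lines.pop(addr)
        if dirty:
            self.main[addr] = value

main = {0: 10}
wb = Cache(main, policy="write-back")
wb.write(0, 99)
print(main[0])      # still 10: write-back defers the memory update
wb.evict(0)
print(main[0])      # 99 after the dirty block is written back
```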
Read/Write operations on cache in case of Miss
Read Operation
• When addressed word is not in cache Read Miss occurs there are two ways this
can be dealt with
1. Entire block of words that contain the requested word is copied from main
memory to cache and the particular word requested is forwarded to CPU from the
cache ( Load Through ) (OR)
2. The requested word from memory is sent to CPU first and then the cache is
updated ( Early Restart )
Write Operation
• If the addressed word is not in cache, a Write Miss occurs
• If the write-through protocol is used, the information is directly written into main memory
• In the write-back protocol, the block containing the word is first brought into the cache, and the
desired word is then overwritten.
Mapping Functions
• Correspondence between main memory blocks and those in the cache
is specified by a memory mapping function
1. Direct Mapping
2. Associative Mapping
3. Set Associative Mapping
Direct Mapping
Associative memories are expensive compared to random-access memories
because of the added logic associated with each cell
The possibility of using a random-access memory for the cache is illustrated in the
figure above.
The CPU address of 15 bits is divided into two fields. The nine least significant bits
constitute the index field and the remaining six bits form the tag field.
The figure shows that main memory needs an address that includes both the tag and
the index bits.
The number of bits in the index field is equal to the number of address bits
required to access the cache memory.
In the general case, there are 2^k words in cache memory and 2^n words in main
memory.
The n-bit memory address is divided into two fields: k bits for the index field and
n - k bits for the tag field.
The direct mapping cache organization uses the n-bit address to access the main
memory and the k-bit index to access the cache.
The internal organization of the words in the cache memory is shown in the figure below.
• Each word in cache consists of the data word and its associated tag.
• When a new word is first brought into the cache, the tag bits are stored alongside the data bits.
• When the CPU generates a memory request, the index field is used for the address to access the cache.
• The tag field of the CPU address is compared with the tag in the word read from the cache.
• If the two tags match, there is a hit and the desired data word is in cache.
• If there is no match, there is a miss and the required word is read from main memory. It is then stored in
the cache together with the new tag, replacing the previous value.
• The disadvantage of direct mapping is that the hit ratio can drop considerably if two or more words
whose addresses have the same index but different tags are accessed repeatedly.
• However, this possibility is minimized by the fact that such words are relatively far apart in the address
range (multiples of 512 locations in this example.)
• To see how the direct-mapping organization operates, consider the numerical example shown in Fig.
13. The word at address zero is presently stored in the cache (index = 000, tag = 00, data = 1220).
• Suppose that the CPU now wants to access the word at address 02000. The index address is 000, so it is
used to access the cache. The two tags are then compared.
• The cache tag is 00 but the address tag is 02, which does not produce a match. Therefore, the main
memory is accessed and the data word 5670 is transferred to the CPU.
• The cache word at index address 000 is then replaced with a tag of 02 and data of 5670.
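The numerical example above (15-bit address, 9-bit index, 6-bit tag) can be traced with a short Python sketch of my own; the octal literals mirror the addresses and data words in the example.

```python
TAG_BITS, INDEX_BITS = 6, 9             # 15-bit address = 6-bit tag + 9-bit index

def split(address):
    """Split a 15-bit address into (tag, index), as in the example."""
    return address >> INDEX_BITS, address & ((1 << INDEX_BITS) - 1)

cache = {}                              # index -> (tag, data word)
main_memory = {0o00000: 0o1220, 0o02000: 0o5670}

def read(address):
    tag, index = split(address)
    if index in cache and cache[index][0] == tag:
        return cache[index][1], "hit"
    data = main_memory[address]         # miss: fetch word from main memory
    cache[index] = (tag, data)          # replace the cache word with new tag
    return data, "miss"

print(read(0o00000))   # miss: index 000 filled with tag 00, data 1220
print(read(0o00000))   # hit: tags match
print(read(0o02000))   # miss: same index 000, tag 02 replaces tag 00
```

Both addresses map to index 000, so they can never reside in the cache at the same time, which is exactly the direct-mapping weakness described above.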
• The direct-mapping example just described uses a block size of one word. The same organization but
using a block size of B words is shown in Fig. 14.
The index field is now divided into two parts: the block field and the word field.
The block number is specified with a 6-bit field and the word within the block is
specified with a 3-bit field.
The tag field stored within the cache is common to all eight words of the same block.
Every time a miss occurs, an entire block of eight words must be transferred from main
memory to cache memory.
Although this takes extra time, the hit ratio will most likely improve with a larger
block size because of the sequential nature of computer programs.
Set-Associative Mapping
In this method, blocks of cache are grouped into sets, and the mapping allows a block of main memory to
reside in any block of a specific set.
• It was mentioned previously that the disadvantage of direct mapping is that two words with the same
index in their address but with different tag values cannot reside in cache memory at the same time.
• A third type of cache organization, called set-associative mapping, is an improvement over the direct
mapping organization in that each word of cache can store two or more words of memory under the same
index address.
• Each data word is stored together with its tag and the number of tag-data items in one word of cache is
said to form a set.
• An example of a set-associative cache organization for a set size of two is shown in Fig. 15.
• Each index address refers to two data words and their associated tags.
Each tag requires six bits and each data word has 12 bits, so the word length is 2(6 + 12) = 36 bits.
An index address of nine bits can accommodate 512 words.
Thus the size of cache memory is 512 x 36.
• It can accommodate 1024 words of main memory since each word of cache contains two data words.
• In general, a set-associative cache of set size k will accommodate k words of main memory in each word of cache.
The octal numbers listed in Fig. 15 are with reference to the main memory contents illustrated earlier.
The words stored at addresses 01000 and 02000 of main memory are stored in cache memory at index address 000.
Similarly, the words at addresses 02777 and 00777 are stored in cache at index address 777.
When the CPU generates a memory request, the index value of the address is used to access the cache. The tag field
of the CPU address is then compared with both tags in the cache to determine if a match occurs.
The comparison logic is done by an associative search of the tags in the set similar to an associative memory search:
thus the name "set-associative."
The hit ratio will improve as the set size increases because more words with the same index but
different tags can reside in cache.
However, an increase in the set size increases the number of bits in words of cache and requires
more complex comparison logic.
When a miss occurs in a set-associative cache and the set is full, it is necessary to replace one of
the tag-data items with a new value.
The most common replacement algorithms used are: random replacement, first-in, first-out
(FIFO), and least recently used (LRU).
With the random replacement policy the control chooses one tag-data item for replacement at
random. The FIFO procedure selects for replacement the item that has been in the set the longest.
The LRU algorithm selects for replacement the item that has been least recently used by the
CPU.
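A set-associative cache with LRU replacement can be sketched as follows (a hypothetical illustration of my own, using the example's 512 sets and set size 2; an OrderedDict tracks recency within each set):

```python
from collections import OrderedDict

class SetAssociativeCache:
    """Sketch of a k-way set-associative cache with LRU replacement."""

    def __init__(self, num_sets=512, ways=2):
        self.num_sets, self.ways = num_sets, ways
        self.sets = [OrderedDict() for _ in range(num_sets)]  # tag -> data

    def access(self, address, memory):
        index = address % self.num_sets     # low-order bits: index field
        tag = address // self.num_sets      # high-order bits: tag field
        s = self.sets[index]
        if tag in s:                        # hit: mark as most recently used
            s.move_to_end(tag)
            return s[tag], "hit"
        if len(s) == self.ways:             # set full: evict least recently used
            s.popitem(last=False)
        s[tag] = memory[address]            # miss: bring the word into the set
        return s[tag], "miss"

mem = {0o01000: 0o3450, 0o02000: 0o5670, 0o00000: 0o1220}
c = SetAssociativeCache()
print(c.access(0o01000, mem))  # miss: first way of set 000 filled
print(c.access(0o02000, mem))  # miss: second way of the same set filled
print(c.access(0o01000, mem))  # hit: both tags coexist under index 000
print(c.access(0o00000, mem))  # miss: set full, LRU item is evicted
```

Unlike the direct-mapped case, the words at 01000 and 02000 share index 000 yet reside in the cache simultaneously; replacement only occurs once the set is full.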