Unit 4 Vlsi
UNIT – IV
DESIGNING ARCHITECTURE BUILDING BLOCKS
Arithmetic Building Blocks: Data Paths, Adders, Multipliers, Shifters, ALUs, power and speed tradeoffs,
Case Study: Design as a tradeoff.
Designing Memory and Array structures: Memory Architectures and Building Blocks, Memory Core,
Memory Peripheral Circuitry.
Data path circuits pass data from one segment to another for processing or storage.
The data path is the core of a processor, where all computations are performed.
It is generally defined for a general-purpose digital processor, as shown in the figure.
Data is applied at one port and the output is obtained at a second port.
The data path block consists of arithmetic operations, logical operations, shift operations and
temporary storage of operands.
UNIT-IV –EC6601 VLSI DESIGN
Data paths are arranged in a bit sliced organization.
Instead of operating on single bit digital signals, the data in a processor are arranged in a
word based fashion.
Bit slices are either identical or resemble a similar structure for all bits.
The data path consists of a number of bit slices (equal to the word length), each
operating on a single bit; hence the term bit-sliced.
Draw the structure of ripple carry adder and explain its operation. (Nov 2017)
Explain the operation of a basic 4 bit adder. (Nov 2016)
Figure: Full adder circuit (inputs A, B, Ci; internal node X; outputs S, Co)
Explain the operation and design of Carry lookahead adder (CLA). (May 2017, Nov 2016)
How the drawback in ripple carry adder overcome by carry look ahead adder and
discuss. (Nov 2017)
Explain the concept of carry lookahead adder and discuss its types. (April 2018)
A carry-lookahead adder (CLA) is a type of adder used in digital circuits.
A carry-lookahead adder improves speed by reducing the amount of time required to
determine carry bits.
In a ripple carry adder, the carry bit is calculated along with the sum bit.
Each bit must wait until the previous carry is calculated before it can begin calculating its own
result and carry bits.
The carry-lookahead adder calculates one or more carry bits before the sum, which reduces
the wait time to calculate the result of the higher-order bits.
A ripple-carry adder starts at the rightmost (LSB) digit position: the two
corresponding digits are added and a result is obtained. There may be a carry out of this digit
position.
Accordingly, all digit positions other than the LSB need to take into account the possibility of
adding an extra 1 from a carry that has come in from the next position to the right.
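The rippling behaviour described above can be sketched in a few lines of Python. This is only an illustrative model, not a circuit description: the function names `full_adder` and `ripple_carry_add` are made up for this example, and bits are assumed to be listed LSB first.

```python
def full_adder(a, b, c_in):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ c_in
    c_out = (a & b) | (c_in & (a ^ b))
    return s, c_out


def ripple_carry_add(a_bits, b_bits, c_in=0):
    """Add two equal-length bit lists given LSB first.

    Each stage uses the carry produced by the stage before it,
    which is why the carry has to ripple through all positions.
    """
    carry = c_in
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


# 4-bit example: 0110 (6) + 0111 (7), bits listed LSB first
sum_bits, c_out = ripple_carry_add([0, 1, 1, 0], [1, 1, 1, 0])
print(sum_bits, c_out)  # sum 1101 (13) listed LSB first, carry out 0
```

The loop makes the dependency explicit: stage i cannot finish until stage i-1 has delivered its carry.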
Carry lookahead depends on two things:
Calculating, for each digit position, whether that position is going to propagate a carry
if one comes in from the right.
Combining these calculated values to be able to realize quickly whether, for each
group of digits, that group is going to propagate a carry.
Theory of operation:
Carry lookahead logic uses the concept of generating and propagating carry.
The addition of two 1-digit inputs A and B is said to generate if the addition will
carry, regardless of whether there is an input carry.
Generate:
In binary addition, A + B generates if and only if both A and B are 1.
If we write G(A,B) to represent the binary predicate that is true if and only if A + B
generates, we have:
G(A,B) = A . B
Propagate:
The addition of two 1-digit inputs A and B is said to propagate if the addition will carry
whenever there is an input carry.
In binary addition, A + B propagates if and only if at least one of A or B is 1.
If we write P(A,B) to represent the binary predicate that is true if and only if A + B
propagates, we have:
P(A,B) = A + B
These adders are used to overcome the latency which is introduced by the rippling effect
of carry bits.
Write carry look-ahead expressions in terms of the generate g_i and propagate p_i signals.
The general form of the carry signal thus becomes
c_{i+1} = a_i.b_i + c_i.(a_i + b_i) = g_i + c_i.p_i
If a_i.b_i = 1, then c_{i+1} = 1; write the generate term as g_i = a_i.b_i
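As a rough illustration of the carry expression, the sketch below computes every carry directly from the generate and propagate signals instead of waiting for the previous stage. It assumes the OR form of propagate (p = a OR b) and LSB-first bit lists; the function names are hypothetical.

```python
def lookahead_carries(a_bits, b_bits, c0=0):
    """Return the carries [c0, c1, ..., cn] for bit lists given LSB first."""
    g = [a & b for a, b in zip(a_bits, b_bits)]  # generate signals
    p = [a | b for a, b in zip(a_bits, b_bits)]  # propagate signals (OR form)
    carries = [c0]
    for i in range(len(g)):
        # Expanded form of c_{i+1} = g_i + c_i.p_i:
        # c_{i+1} = g_i + p_i.g_{i-1} + ... + p_i...p_0.c_0
        c_next = g[i]
        prod = p[i]
        for j in range(i - 1, -1, -1):
            c_next |= prod & g[j]
            prod &= p[j]
        c_next |= prod & c0
        carries.append(c_next)
    return carries


def cla_add(a_bits, b_bits, c0=0):
    """Sum bits use the precomputed carries: s_i = a_i XOR b_i XOR c_i."""
    carries = lookahead_carries(a_bits, b_bits, c0)
    sum_bits = [a ^ b ^ c for a, b, c in zip(a_bits, b_bits, carries)]
    return sum_bits, carries[-1]
```

In hardware each expanded carry term is a separate gate working in parallel, so the inner loop here stands for logic depth, not for time steps.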
The Manchester carry chain is a variation of the carry-lookahead adder that uses shared
logic to lower the transistor count.
A Manchester carry chain generates the intermediate carries by tapping off nodes in the
gate that calculates the most significant carry value.
Dynamic logic can support shared logic, as can transmission gate logic.
One of the major drawbacks of the Manchester carry chain is that the propagation delay increases
quickly with the length of the chain.
For this reason, a Manchester-carry-chain section generally does not exceed 4 bits.
In this adder, the basic equation is c_{i+1} = g_i + c_i.p_i
where p_i = a_i + b_i and g_i = a_i.b_i
******************************************************************************
V. HIGH SPEED ADDERS:
Design a carry bypass adder and discuss its features. (May 2016)
Design a carry select adder and discuss its features. (May 2016)
Carry save adder is similar to the full adder. It is used when adding multiple numbers.
All the bits of a carry save adder work in parallel.
In carry save adder, the carry does not propagate. So, it is faster than carry propagate adder.
It has three inputs and produces two outputs. The carry-out is saved rather than used immediately to
find the final sum value.
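The idea of saving the carries can be sketched as follows. This is a minimal model on Python integers; `carry_save_add` is a name chosen for this example. Each bit position is handled independently, and one ordinary addition at the end combines the sum word and the saved carry word.

```python
def carry_save_add(x, y, z):
    """Reduce three non-negative integers to a sum word and a carry word.

    Every bit position works in parallel: the sum word is the bitwise
    XOR of the inputs, and the carry word holds the majority bits,
    shifted left by one because each carry has the weight of the next
    higher position.
    """
    sum_word = x ^ y ^ z
    carry_word = ((x & y) | (y & z) | (x & z)) << 1
    return sum_word, carry_word


s, c = carry_save_add(5, 6, 7)
total = s + c  # one final carry-propagate addition
print(s, c, total)  # 4 14 18
```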
Explain the design and operation of 4 x 4 multiplier circuit. (Apr. 2016, 2017, Nov 2016, 2018)
Design a multiplier for 5 bit by 3 bit. Explain its operation and summarize the numbers
of adders. Discuss it over Wallace multiplier. (Nov 2017, April 2018)
A study of computer arithmetic processes will reveal that the most common requirements
are for addition and subtraction.
There is also a significant need for a multiplication capability.
Basic operations in multiplication are given below.
0 x 0 = 0, 0 x 1 = 0, 1 x 0 = 0, 1x1=1
      1 0 1 0 1 0      Multiplicand
x         1 0 1 1      Multiplier
      1 0 1 0 1 0
    1 0 1 0 1 0        Partial products
  0 0 0 0 0 0
1 0 1 0 1 0
1 1 1 0 0 1 1 1 0      Result
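The worked example can be checked with a short shift-and-add sketch. The function name `partial_products` is made up for this illustration; each multiplier bit selects either the multiplicand or zero, shifted to the weight of that bit.

```python
def partial_products(multiplicand, multiplier):
    """Return the shifted partial products, one per multiplier bit (LSB first)."""
    products = []
    shift = 0
    while multiplier >> shift:
        bit = (multiplier >> shift) & 1
        # A multiplier bit of 1 copies the multiplicand, a 0 gives all zeros
        row = multiplicand if bit else 0
        products.append(row << shift)
        shift += 1
    return products


pps = partial_products(0b101010, 0b1011)
result = sum(pps)
print([bin(p) for p in pps], bin(result))  # result is 0b111001110
```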
If two different 4-bit numbers (x0, x1, x2, x3 and y0, y1, y2, y3) are multiplied, then each partial product is obtained by ANDing the bits of one number with one bit of the other.
Multiplication by shifting:
If x = (0010)2 = (2)10
If it is to be multiplied by 2, then we can shift x to the left: x = (0100)2 = (4)10
If it is to be divided by 2, then we can shift x to the right: x = (0001)2 = (1)10.
So, a shift register can be used for multiplication or division by 2.
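A small sketch of the same idea, assuming a 4-bit register so that bits shifted past the MSB are dropped (the helper names are hypothetical):

```python
def shift_left(x, n_bits=4):
    """Multiply by 2; a bit shifted out of the n-bit register is lost."""
    return (x << 1) & ((1 << n_bits) - 1)


def shift_right(x):
    """Divide by 2, discarding the remainder."""
    return x >> 1


x = 0b0010
print(format(shift_left(x), "04b"), format(shift_right(x), "04b"))  # 0100 0001
```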
Figure: 4 x 4 array multiplier using full adders (FA), half adders (HA) and AND gates.
Each row ANDs X3 X2 X1 X0 with one multiplier bit (Y0 to Y3). Three adder rows (HA FA FA HA, then
FA FA FA HA twice) add the partial products, and the array produces the product bits Z0 to Z7.
In Booth multiplication, however, partial product generation is based on a recoding scheme,
e.g. radix-2 encoding.
Bits of multiplicand (Y) are grouped from left to right and corresponding operation on
multiplier (X) is done in order to generate the partial product.
In radix-2 booth multiplication partial product generation is done based on encoding
which is as given by Table.
RADIX-2 PROCEDURE:
1) Append a 0 to the right of the LSB of the multiplier and pair the bits from right to
left, as shown in the figure.
Each group of binary digits is recoded according to the Modified Booth Encoding Table into
one of the numbers from the set (-2, -1, 0, 1, 2).
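Since the encoding table itself is not reproduced here, the sketch below assumes the standard radix-2 Booth rule: after a 0 is appended to the right of the LSB, each overlapping pair (b_i, b_{i-1}) is recoded as 00 -> 0, 01 -> +1, 10 -> -1, 11 -> 0. The function names are hypothetical, and the multiplier is taken as an n-bit two's-complement value.

```python
def booth_radix2_digits(multiplier, n_bits):
    """Recode an n-bit two's-complement multiplier into digits in {-1, 0, +1}.

    Digits are returned LSB first. Each digit equals b_{i-1} - b_i,
    which is the pair rule 00 -> 0, 01 -> +1, 10 -> -1, 11 -> 0.
    """
    bits = [(multiplier >> i) & 1 for i in range(n_bits)]
    digits = []
    prev = 0  # the 0 appended to the right of the LSB
    for b in bits:
        digits.append(prev - b)
        prev = b
    return digits


def booth_multiply(multiplicand, multiplier, n_bits):
    """Sum the partial products selected by the recoded digits."""
    digits = booth_radix2_digits(multiplier, n_bits)
    return sum(d * (multiplicand << i) for i, d in enumerate(digits))


print(booth_radix2_digits(7, 4))  # [-1, 0, 0, 1], i.e. 8 - 1
```

A run of ones such as 0111 is replaced by one subtraction and one addition, which is the point of the recoding.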
******************************************************************************
VII. DIVIDERS
There are two types of dividers, Serial divider and Parallel divider. Serial divider is
slow and parallel divider is fast in performance.
Generally division is done by repeated subtraction. If 10/3 is to be performed then,
10 - 3 = 7 (divisor is 3, dividend is 10)
7 - 3 = 4
4 - 3 = 1
Here, repeated subtraction has been done, after 3 subtractions, the remainder is 1. It is
less than divisor. So now the subtraction is stopped.
Let us see an example of binary division using the 1’s complement method:
1010 (10d) / 0011 (3d)
Step1: Find the 1’s complement of the divisor.
Step2: Add it to the dividend.
Step3: If the carry is 1, add it to the result to get the difference output.
Step4: Repeat the same procedure until the carry is 0.
Step5: Then the process is stopped.
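Steps 1 to 5 can be traced with the sketch below, which assumes unsigned 4-bit operands; the function name is made up for this example. One detail is an assumption added here and not part of the steps: when the running remainder equals the divisor, the 1's complement sum is all ones with carry 0, so the loop would stop one subtraction early, and the final check handles that case.

```python
def divide_ones_complement(dividend, divisor, n_bits=4):
    """Unsigned division by repeated 1's complement subtraction.

    Returns (quotient, remainder). The divisor must be non-zero and
    both operands must fit in n_bits.
    """
    if divisor == 0:
        raise ValueError("divisor must be non-zero")
    mask = (1 << n_bits) - 1
    divisor_comp = ~divisor & mask        # Step 1: 1's complement of divisor
    remainder = dividend & mask
    quotient = 0
    while True:
        total = remainder + divisor_comp  # Step 2: add to the dividend
        carry = total >> n_bits
        if carry == 0:                    # Steps 4 and 5: stop when carry is 0
            break
        remainder = (total & mask) + 1    # Step 3: add the carry back in
        quotient += 1
    # Assumption (not in the steps above): equal values give carry 0,
    # so one last subtraction is applied here.
    if remainder == divisor:
        quotient += 1
        remainder = 0
    return quotient, remainder


print(divide_ones_complement(0b1010, 0b0011))  # (3, 1)
```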
Y0 Y1 Y2 Y3 are complemented and given to the 4-bit adder block (figure shown below).
X0 X1 X2 X3 are given to the MUXs, and the MUX outputs are given to D flip-flops. The select signal
of the MUX is high. It is also connected to the clear input of the counter.
The carry output of the adder is connected to the clock enable pin of the counter. It is also given
to an OR gate, whose output drives the clock enable signal of the flip-flops.
The other input of the OR gate is tied to the select signal of the MUX.
If X > Y, C0 of adder is high.
After first subtraction, the counter output is incremented by 1.
For each subtraction, the counter output is incremented.
If C0 of adder is low, then clock of counter and FF is disabled. Counting is stopped.
Q3 Q2 Q1 Q0 is the counter output (Quotient)
R3 R2 R1 R0 is the flipflop output (remainder)
******************************************************************************
VIII. SHIFT REGISTERS:
Design 4 input and 4 output barrel shifter using NMOS logic. (NOV 2018).
An n-bit rotation is specified by the control word R0-n, and the L/R bit defines a left or
right shift.
For example, y3 y2 y1 y0 = a3 a2 a1 a0
If it is rotated 1 bit to the left, we get y3 y2 y1 y0 = a2 a1 a0 a3
If it is rotated 1 bit to the right, we get y3 y2 y1 y0 = a0 a3 a2 a1
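The rotation example can be reproduced with a short sketch. The word is modelled as a list written MSB first, and `rotate` is a name chosen for this illustration.

```python
def rotate(word, n, left=True):
    """Rotate a word (list written MSB first) by n positions; no bits are lost."""
    n %= len(word)
    if not left:
        # A right rotation by n is a left rotation by (length - n)
        n = (len(word) - n) % len(word)
    return word[n:] + word[:n]


word = ["a3", "a2", "a1", "a0"]
print(rotate(word, 1, left=True))   # ['a2', 'a1', 'a0', 'a3']
print(rotate(word, 1, left=False))  # ['a0', 'a3', 'a2', 'a1']
```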
Barrel Shifter:
A barrel shifter is a digital circuit that can shift a data word by a specified number of bits in
one clock cycle.
It can be implemented as a sequence of multiplexers (MUX), and in such an implementation
the output of one MUX is connected to the input of the next MUX in a way that depends on
the shift distance.
For example, take a four-bit barrel shifter, with inputs A, B, C and D. The shifter can cycle
the order of the bits ABCD as DABC, CDAB, or BCDA; in this case, no bits are lost.
That is, it can shift all of the outputs up to three positions to the right (thus make any cyclic
combination of A, B, C and D).
The barrel shifter has a variety of applications, including being a useful component in
microprocessors (alongside the ALU).
Logarithmic Shifter:
A shifter with a maximum shift width of M consists of log2(M) stages, where the i-th stage
either shifts over 2^i bits or passes the data unchanged.
A shifter with a maximum shift value of seven bits is shown in the figure. To shift over five bits, the first stage is
set to shift mode, the second to pass mode and the last again to shift mode.
The speed of the logarithmic shifter depends on the shift width in a logarithmic way: an M-bit
shifter requires log2(M) stages.
The series connection of pass transistors slows the shifter down for larger shift values.
The advantage of the logarithmic shifter is that it is more effective for larger shift values in terms of both area
and speed.
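The stage-by-stage behaviour can be sketched as below. This is a behavioural model only, assuming a logical right shift and three stages (shifts of 1, 2 and 4, so a maximum shift of seven); the function name and the "shift"/"pass" labels are chosen for this example.

```python
def log_shift_right(word, shift, n_stages=3):
    """Logical right shift built from stages that shift by 2**i or pass.

    Returns the shifted word and the mode of each stage, first stage first.
    """
    if not 0 <= shift < (1 << n_stages):
        raise ValueError("shift does not fit in the control word")
    settings = []
    for i in range(n_stages):
        if (shift >> i) & 1:
            word >>= 1 << i          # stage i shifts over 2**i bits
            settings.append("shift")
        else:
            settings.append("pass")  # stage i passes the data unchanged
    return word, settings


print(log_shift_right(0b11100000, 5))  # (7, ['shift', 'pass', 'shift'])
```

The binary digits of the shift amount act directly as the stage controls: 5 = 101 in binary selects the 1-bit and 4-bit stages.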
******************************************************************************
IX. SPEED AND AREA TRADE OFF:
Discuss the details about speed and area trade off. (May 2017)
Adder:
The tradeoff in terms of power and performance is shown below.
The performance is represented in terms of the delay(speed).
The area estimations for each of the delays are given based on the fact that area is in
relation to the power consumption.
The area of a carry lookahead adder is larger than the area of a ripple carry for a
particular delay.
This is because the computations performed in a carry lookahead adder are parallel,
which requires a larger number of gates and also results in a larger area.
CLA – Carry lookahead adder; RC, R – Ripple carry adder
Figure: Area Vs Delay for 8 bit adder
Figure: Area Vs Delay for 16 bit adder
Figure: Area Vs Delay for 32 bit adder
Figure: Area Vs Delay for 64 bit adder
Figure: Delay Vs Area for all adders
Figure: Area Vs Delay for all multipliers
The above figures show that the delay of the ripple carry adder increases much faster
than that of the carry lookahead adder as the number of bits is increased.
In the carry lookahead adder, the cost is in terms of area because the computations are
performed in parallel, and therefore more power is consumed for a specific delay.
Memory Architecture and Building Blocks:
Explain the memory architecture and its control circuits in detail. (April 2018)
When an n x m memory is implemented, the n memory words are arranged in a linear fashion.
One word is selected at a time by using a select line.
Suppose we want to implement an 8 x 8 memory, with n = 8 words and m = 8 bits per word.
Then we need 8 select signals (one for each word).
By using a decoder, we can reduce the number of select signals.
With a 3-to-8 decoder, 3 address inputs are enough to generate the 8 select signals.
If n = 2^20, then only 20 inputs need to be given to the decoder.
If the basic storage cell is approximately square, this linear arrangement is extremely tall and the design is
extremely slow, because the vertical wires that connect the storage cells to the I/O become excessively long.
So, memory arrays are organized in such a way that the vertical and horizontal dimensions
are about the same.
Multiple words are stored in a single row. These words are selected simultaneously.
The column decoder is used to route the correct word to the I/O terminals.
The row address is used to select one row of the memory, and the column address is used to
select a particular word from that selected row.
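The decoder and the row/column split can be sketched together. This is an illustrative model: the function names are hypothetical, and it assumes the upper address bits form the row address and the lower bits form the column address.

```python
def decode(value, n_inputs):
    """n-to-2^n decoder: returns a one-hot list of select lines."""
    return [1 if i == value else 0 for i in range(1 << n_inputs)]


def split_address(address, row_bits, col_bits):
    """Split a word address into (row, column) fields.

    Assumption: the upper bits select the row and the lower bits
    select the word within that row.
    """
    if not 0 <= address < (1 << (row_bits + col_bits)):
        raise ValueError("address out of range")
    row = address >> col_bits
    column = address & ((1 << col_bits) - 1)
    return row, column


print(decode(5, 3))                  # [0, 0, 0, 0, 0, 1, 0, 0]
print(split_address(0b10110, 3, 2))  # (5, 2)
```

With 3 inputs the decoder drives 8 select lines, of which exactly one is active for any address.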
Word line: The horizontal select line which is used to select the single row of cell is
known as word line.
Bit line: The wire which connects the cell in a single column to the input/output circuit is
known as bit line.
Sense amplifier: It amplifies the reduced internal swing to the full rail-to-rail
amplitude.
Block address: The memory is divided into various small blocks.
The address which is used to select one of the small blocks to be read or written is known
as the block address.
Advantages:
Access time is fast
Power saving is good, because blocks not activated are in power saving mode.
Programming ROM
The transistor in the intersection of row and column is OFF when the associated word
line is LOW. In this condition, we get logic 1 output.
EEPROM (E2PROM):
Electrically Erasable Programmable ROM. Here a floating gate tunneling oxide
(FLOTOX) transistor is used.
It is similar to the floating gate device, except that a portion of the floating gate is separated from
the channel by an oxide with a thickness of 10 nm or less.
If 10 V is applied, electrons travel to and from the floating gate through Fowler-Nordheim
tunneling.
Erasing is done by reversing the voltage that was applied for writing.
Figure: (a) Erase (b) Write (c) Read operation of NOR flash memory
4.9.3.3 RAM – Random Access Memory
Explain about static and dynamic RAM.
Construct 6T based SRAM cell. Explain its read and write operations. (NOV 2018)
Figure: Three transistor dynamic memory cell
To read the cell, the read word line (RWL) is raised. Transistor M2 is either ON or OFF,
depending on the stored value.
The bit line BL2 is either connected to VDD or precharged to VDD or VDD-Vt.
When logic 1 is stored, the series combination of M2 and M3 pulls the BL2 line low.
If logic 0 is stored, the BL2 line stays high.
To refresh the cell, the stored data is first read, then the inverse of the value read is placed on BL1 and the WWL
line is asserted.
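The inverting read and the refresh sequence can be captured in a small behavioural model. This is only a logic-level sketch: the class name is made up, charge leakage is not modelled, and BL2 is assumed to be precharged high before every read.

```python
class ThreeTransistorCell:
    """Logic-level model of the three transistor dynamic memory cell."""

    def __init__(self):
        self.stored = 0  # value held on the storage node

    def write(self, bl1):
        """Assert WWL: the value on BL1 is stored."""
        self.stored = bl1

    def read(self):
        """Assert RWL and return the value seen on BL2.

        BL2 starts precharged high and is pulled low only when a 1
        is stored, so BL2 carries the inverse of the stored value.
        """
        bl2 = 1                # precharged
        if self.stored == 1:
            bl2 = 0            # M2 and M3 in series pull BL2 low
        return bl2

    def refresh(self):
        """Read, then write the inverse of the value read back through BL1."""
        bl2 = self.read()
        self.write(1 - bl2)


cell = ThreeTransistorCell()
cell.write(1)
print(cell.read())  # 0, because the read is inverting
```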
One transistor DRAM:
In this cell, to write logic 1, the value is placed on the bit line and the word line is asserted high.
The capacitor is charged or discharged depending on the data. Before performing a read
operation, the bit line is precharged.
In the compare mode, the stored data are compared with the data on the bit lines. The match line is
connected to all CAM blocks in a row, and it is initially precharged to VDD.
If a match occurs, the internal row is discharged. If even one bit in a row is
mismatched, the match line goes low.
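The match-line behaviour in compare mode can be sketched as follows. This is a logic-level illustration with hypothetical function names: the line starts precharged high and any single mismatching bit pulls it low.

```python
def match_line(stored_bits, search_bits):
    """Return 1 if the match line stays high, 0 if it is pulled low."""
    line = 1  # precharged to VDD
    for stored, searched in zip(stored_bits, search_bits):
        if stored != searched:
            line = 0  # one mismatching bit is enough to pull the line low
    return line


def cam_search(rows, search_bits):
    """Return the indices of all rows whose match line stays high."""
    return [i for i, row in enumerate(rows) if match_line(row, search_bits)]


rows = [[1, 0, 1, 1], [0, 0, 1, 1], [1, 0, 1, 0]]
print(cam_search(rows, [0, 0, 1, 1]))  # [1]
```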
*****************************************************************************
Column Decoder
It should match the bit line pitch of the memory array.
In column decoder, decoder outputs are connected to nMOS pass transistors.
By using this circuit, we can selectively drive one out of m pass transistors.
Only one nMOS pass transistor is ON at a time.
Amplification:
In memory structures such as the 1T DRAM, amplification is required for proper
functionality.
Delay Reduction:
The amplifier compensates for the fan-out driving capability of the memory cell by
detecting and amplifying small transitions on the bit line to large signal output
swings.
Power reduction:
Reducing the signal swing on the bit lines can eliminate a large part of the power
dissipation related to charging and discharging the bit lines.
(iii) Drivers/ Buffers
The length of the word and bit lines increases with increasing memory size.
A large portion of the read and write access time can be attributed to the
wire delays.
A major part of the memory-periphery area is allocated to the drivers (address
buffers and I/O drivers).
******************************************************************************
4.11: Low Power Memory design:
Discuss about Low power memory design.
Figure: (a) Insertion of low threshold device (b) Reducing supply Voltage
******************************************************************************
UNIT-V EC8095-VLSI DESIGN