
Unit 4 Vlsi

The document discusses arithmetic building blocks and data path circuits used in VLSI design. It describes: 1) Data path circuits pass data between processing segments. They contain arithmetic, logical, and shift operations along with temporary storage. Data paths are arranged in a bit-sliced organization to operate on word-based data. 2) Ripple carry adders calculate carries sequentially from least to most significant bits, causing delay that grows with bit number. Carry lookahead adders reduce delay by calculating carry bits in parallel using generate and propagate signals. 3) A 4-bit ripple carry adder is shown using AOI full adders with delay proportional to bit number. Carry lookahead adders

Uploaded by

hmpudur1968
Copyright
© © All Rights Reserved

UNIT-IV –EC 8095 VLSI DESIGN

UNIT – IV
DESIGNING ARCHITECTURE BUILDING BLOCKS

Arithmetic Building Blocks: Data Paths, Adders, Multipliers, Shifters, ALUs, power and speed tradeoffs,
Case Study: Design as a tradeoff.
Designing Memory and Array structures: Memory Architectures and Building Blocks, Memory Core,
Memory Peripheral Circuitry.

Design of Data path circuits:

Discuss about data path circuits.

 Data path circuits pass data from one segment to another for processing or storage.
 The data path is the core of a processor, where all computations are performed.
 It is generally described as part of a general digital processor, as shown in the figure.

Figure: General digital processor


 Showing only the data path and its communication gives the following view.

 In this view, data is applied at one port and the data output is obtained at a second port.

 The data path block consists of arithmetic operations, logical operations, shift operations and temporary storage of operands.
UNIT-IV –EC6601 VLSI DESIGN
  Data paths are arranged in a bit sliced organization.
 Instead of operating on single bit digital signals, the data in a processor are arranged in a
 word based fashion.
  Bit slices are either identical or resemble a similar structure for all bits.
 The data path consists of the number of bit slices (equal to the word length), each
operating on a single bit. Hence the term is bit-sliced.

Figure: Bit-sliced datapath organization


******************************************************************************

II. Ripple Carry Adder:

  Draw the structure of ripple carry adder and explain its operation. (Nov 2017)
 Explain the operation of a basic 4 bit adder. (Nov 2016)

Architecture of Ripple Carry Adder:


  AOI Full adder circuit (AND OR INVERT)
 An AOI algorithm for static CMOS logic circuit can be obtained by using the equation.
$c_{i+1} = a_i b_i + c_i (a_i + b_i)$

$s_i = (a_i + b_i + c_i)\,\overline{c_{i+1}} + a_i b_i c_i$

Figure: AOI Full adder


 If n bits are added, we get an n-bit sum and a carry-out Cn. Ci is the carry bit from the previous column.
 An n-bit ripple carry adder needs n full adders, where stage i produces the carry-out bit Ci+1.

Figure: Ripple carry adder
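The rippling of the carry can be sketched in a few lines of Python. This is only a behavioural model, not the CMOS circuit: it assumes the AOI full-adder equations carry = a.b + c.(a + b) and sum = (a + b + c).NOT(carry) + a.b.c, and the helper names (`full_adder`, `ripple_carry_add`, `to_bits`, `from_bits`) are made up for illustration.

```python
def full_adder(a, b, c):
    """One bit slice, modelled with the AOI equations (inputs are 0 or 1)."""
    carry = (a & b) | (c & (a | b))
    total = ((a | b | c) & (1 - carry)) | (a & b & c)
    return total, carry


def ripple_carry_add(a_bits, b_bits, c0=0):
    """Add two equal-length bit lists (LSB first); returns (sum_bits, carry_out)."""
    carry = c0
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        # Each stage has to wait for the carry of the previous stage.
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(value, n):
    """n-bit list of a non-negative integer, LSB first."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Integer value of an LSB-first bit list."""
    return sum(bit << i for i, bit in enumerate(bits))
```

The loop makes the serial dependency visible: stage i cannot produce its outputs before stage i-1 has delivered its carry.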


 The overall delay depends on the characteristics of the full adder circuit. Different CMOS implementations produce different delays.
 Let tdi be the worst-case delay through the i-th stage. The total delay of the 4-bit adder is
$t_{4b} = t_{d3} + t_{d2} + t_{d1} + t_{d0}$, with $t_{d0} = t_d(a_0, b_0 \to c_1)$
 td0 is the time for the inputs to produce the carry-out bit.
 $t_{d1} = t_{d2} = t_d(c_{in} \to c_{out})$
 $t_{d3} = t_d(c_{in} \to S_3)$
 $t_{4b} = t_d(c_{in} \to S_3) + 2\,t_d(c_{in} \to c_{out}) + t_d(a_0, b_0 \to c_1)$
 Extended to N bits, the worst-case delay is linear in the number of bits:
$t_d = O(N)$
$t_{adder} = (N-1)\,t_{carry} + t_{sum}$
  The figure below shows 4-bit adder/subtractor circuit.
 In this, if add/sub=0, then sum is a+b. If add/sub=1, then the output is a-b.

Figure: 4-bit adder/subtractor circuit
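A minimal behavioural sketch of such an adder/subtractor is given here. It assumes the usual arrangement, which the figure is not available to confirm: each b bit passes through an XOR gate controlled by the add/sub line, and the same line feeds the carry-in, so that a - b is computed as a + NOT(b) + 1. The function name `add_sub_4bit` is hypothetical.

```python
def add_sub_4bit(a, b, sub):
    """a, b in 0..15; sub = 0 gives a + b, sub = 1 gives a - b (two's complement).

    Returns (4-bit result, carry-out).
    """
    carry = sub                       # the control line doubles as carry-in
    result = 0
    for i in range(4):
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ sub     # XOR gate inverts b when subtracting
        s = ai ^ bi ^ carry
        carry = (ai & bi) | (carry & (ai ^ bi))
        result |= s << i
    return result, carry
```

In subtract mode a carry-out of 1 indicates that a >= b, so the 4-bit result is the true difference.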


 Sum and carry expressions are implemented in static CMOS.
 The circuit requires 28 transistors, which leads to a large area, and the circuit is slow.

Sum, $S = A B C_i + \overline{C_o}\,(A + B + C_i)$ and Carry, $C_o = A B + B C_i + A C_i$
Drawbacks:
 The circuit is slow.
 In a ripple carry adder, the carry bit is calculated along with the sum bit. Each bit must wait for the previous carry to be calculated.

Figure: Complementary Static CMOS Full Adder


******************************************************************************
III. Carry Look Ahead Adder (CLA):

 Explain the operation and design of Carry lookahead adder (CLA). (May 2017, Nov 2016)
 How is the drawback of the ripple carry adder overcome by the carry lookahead adder? Discuss. (Nov 2017)
 Explain the concept of carry lookahead adder and discuss its types. (April 2018)
 A carry-lookahead adder (CLA) is a type of adder used in digital circuits.
 A carry-lookahead adder improves speed by reducing the time required to determine the carry bits.
 In a ripple carry adder, the carry bit is calculated along with the sum bit.
 Each bit must wait until the previous carry is calculated before it can begin calculating its own result and carry bits.
 The carry-lookahead adder calculates one or more carry bits before the sum, which reduces the wait time to calculate the result of the higher-order bits.
 A ripple-carry adder starts at the rightmost (LSB) digit position: the two corresponding digits are added and a result is obtained. There may be a carry out of this digit position.
 Accordingly, all digit positions other than the LSB need to take into account the possibility of adding an extra 1 from a carry that has come in from the next position to the right.
 Carry lookahead depends on two things:
 Calculating, for each digit position, whether that position is going to propagate a carry if one comes in from the right.
 Combining these calculated values to determine quickly whether, for each group of digits, that group is going to propagate a carry.
 Theory of operation:
  Carry lookahead logic uses the concept of generating and propagating carry.
 The addition of two 1-digit inputs A and B is said to generate if the addition will
 carry, regardless of whether there is an input carry.
 Generate:
  In binary addition, A + B generates if and only if both A and B are 1.
 If we write G(A,B) to represent the binary predicate that is true if and only if A + B
generates, we have:
G(A,B) = A . B
 Propagate:
 The addition of two 1-digit inputs A and B is said to propagate if the addition will carry
 whenever there is an input carry.
  In binary addition, A + B propagates if and only if at least one of A or B is 1.
 If we write P(A,B) to represent the binary predicate that is true if and only if A + B
propagates, we have:
$P(A,B) = A + B$
 These adders are used to overcome the latency introduced by the rippling of carry bits.
 Write the carry look-ahead expressions in terms of the generate gi and propagate pi signals. The general form of the carry signal thus becomes
$c_{i+1} = a_i b_i + c_i (a_i \oplus b_i) = g_i + c_i p_i$
 If $a_i b_i = 1$, then $c_{i+1} = 1$, so the generate term is $g_i = a_i b_i$.

 The propagate term is $p_i = a_i \oplus b_i$. For the carry alone, the OR form $a_i + b_i$ gives the same result; the XOR form is used because it also yields the sum.

 Sum and carry expressions are written as
$s_i = a_i \oplus b_i \oplus c_i = p_i \oplus c_i$
$c_1 = g_0 + p_0 c_0$
$c_2 = g_1 + p_1 c_1 = g_1 + p_1 g_0 + p_1 p_0 c_0$
$c_3 = g_2 + p_2 c_2 = g_2 + p_2 g_1 + p_2 p_1 g_0 + p_2 p_1 p_0 c_0$
$c_4 = g_3 + p_3 c_3 = g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0 + p_3 p_2 p_1 p_0 c_0$

Figure: Symbol and truth table of generate & propagate
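The lookahead carry expressions can be checked with a short behavioural sketch. It assumes generate g = a AND b and propagate p = a XOR b for each bit, and it writes every carry as a two-level function of g, p and c0 so that no carry depends on another carry. The function name `cla_4bit` is hypothetical.

```python
def cla_4bit(a, b, c0=0):
    """4-bit carry lookahead addition of a, b in 0..15; returns (sum, c4)."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(4)]   # generate
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(4)]   # propagate

    # Fully expanded carries: each depends only on g, p and c0.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
          | (p[2] & p[1] & p[0] & c0))
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0])
          | (p[3] & p[2] & p[1] & p[0] & c0))

    carries = [c0, c1, c2, c3]
    total = sum((p[i] ^ carries[i]) << i for i in range(4))   # s_i = p_i XOR c_i
    return total, c4
```

Because c1 to c4 are all computed directly from the inputs, the four sum bits can be formed in parallel once the carries are available.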



Figure – Logic network for 4-bit CLA carry bits

Figure – Sum calculation using the CLA network


 The array is symmetric (mirrored), which allows a more structured layout at the physical design level.

Figure – MODL carry circuit


 MODL: Multiple Output Domino Logic.
 MODL is a non-inverting logic family and a dynamic circuit technique.
 Its limitations are
i. Clocking is mandatory.
ii. The output is subject to charge leakage and charge sharing.
iii. Series-connected nFET chains can give long discharge times.
******************************************************************************

IV. Manchester Carry Chain Adder:

Discuss about Manchester Carry Chain Adder.

 The Manchester carry chain is a variation of the carry-lookahead adder that uses shared logic to lower the transistor count.
 A Manchester carry chain generates the intermediate carries by tapping off nodes in the gate that calculates the most significant carry value.
 Dynamic logic can support shared logic, as can transmission-gate logic.
 One of the major drawbacks of the Manchester carry chain is that the propagation delay increases with the length of the chain.
 For this reason, a Manchester carry chain section generally does not exceed 4 bits.
 In this adder, the basic equation is $c_{i+1} = g_i + c_i p_i$,
where $p_i = a_i \oplus b_i$ and $g_i = a_i b_i$.

 Carry kill bit: $k_i = \overline{a_i + b_i} = \overline{a_i} \cdot \overline{b_i}$

 If ki = 1, then pi = 0 and gi = 0. Hence ki is known as the carry kill bit.

Table

Figure – switch level circuit

 In the circuit shown, the carry-in Ci is used as an input. If pi = 0, then M3 is ON and M4 is OFF.

 If gi = 0, then M1 is ON and M2 is ON.
 If gi = 1, then M2 is OFF, M4 is ON and the output is equal to zero.
 If pi = 1, the case is more complicated.

 In the dynamic circuit figure:

 If ϕ = 0, precharge occurs and the output is 1.
 If ϕ = 1, evaluation occurs.

Figure dynamic circuit


 The dynamic Manchester carry chain for the carry bits up to C4 is shown below. C1, C2, C3 and C4 are taken through inverters. The carry input is C0.

******************************************************************************
V. HIGH SPEED ADDERS:

  Discuss about different types of high speed adders. (Apr. 2016)


 Describe the different approaches of improving the speed of the adder. (Nov 2016)



(i) Carry Skip(bypass) Adder:

Design a carry bypass adder and discuss its features. (May 2016)

 The carry skip adder is a high-speed adder. It consists of an adder, an AND gate and an OR gate.

 When all propagate signals are 1, an incoming carry Ci,0 = 1 propagates through the complete adder chain and produces an outgoing carry C0,3 = 1.
 In other words, if (P0P1P2P3 = 1) then C0,3 = Ci,0; otherwise a DELETE or GENERATE occurred within the block.
 This observation can be used to speed up the adder, as shown in fig (b) below.

Figure: Carry Skip Adder.


 When BP = P0P1P2P3 = 1, the incoming carry is forwarded immediately to the next block. Hence the name carry bypass adder or carry skip adder.
 Idea: if (P0 and P1 and P2 and P3) = 1 then C0,3 = Ci,0, else "kill" or "generate".

Figure: (a) Carry propagation (b) Adding a bypass


 The figure below shows an N-bit carry skip adder built from M-bit blocks.

Figure: 16-bit carry bypass adder with four blocks (bits 0–3, 4–7, 8–11, 12–15), each with setup, carry propagation and sum stages; M bits per block

$t_{adder} = t_{setup} + M\,t_{carry} + (N/M - 1)\,t_{bypass} + (M-1)\,t_{carry} + t_{sum}$ (worst case)

$t_{setup}$: overhead time to create the G, P, D signals
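To get a feel for the numbers, the worst-case bypass delay can be compared with the ripple carry delay (N-1)tcarry + tsum. The sketch below simply evaluates the two formulas; the unit delay values used in the note after it are arbitrary assumptions, not measured data, and the function names are hypothetical.

```python
def ripple_delay(n, t_carry, t_sum):
    """Worst-case ripple carry delay: (N - 1) * t_carry + t_sum."""
    return (n - 1) * t_carry + t_sum


def bypass_delay(n, m, t_setup, t_carry, t_bypass, t_sum):
    """Worst-case carry bypass delay for N bits split into M-bit blocks."""
    blocks = n // m
    return (t_setup
            + m * t_carry              # carry ripples through the first block
            + (blocks - 1) * t_bypass  # then skips over the remaining blocks
            + (m - 1) * t_carry        # and ripples inside the last block
            + t_sum)
```

With every delay normalised to 1 unit, `bypass_delay(16, 4, 1, 1, 1, 1)` gives 12 units against 16 units for `ripple_delay(16, 1, 1)`; the gap widens as N grows because only the bypass term scales with N.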

Figure: Manchester carry-chain implementation of bypass adder

(ii) Carry Select Adder:

Design a carry select adder and discuss its features. (May 2016)

 A carry-select adder is a particular way to implement an adder, which is a logic element that computes the (n+1)-bit sum of two n-bit numbers.

 The carry-select adder is simple but rather fast, having a gate-level depth of O(√n).
 The carry-select adder generally consists of two ripple carry adders and a multiplexer.
 Adding two n-bit numbers with a carry-select adder is done with two adders in order to perform the calculation twice:
 once with the assumption that the carry-in is zero and once assuming it is one.
 After the two results are calculated, the correct sum and the correct carry-out are selected with the multiplexer once the correct carry-in is known.
 The number of bits in each carry-select block can be uniform or variable.

 In the uniform case, the optimal delay occurs for a block size of √n.

 The O(√n) delay is derived from uniform sizing, where the ideal number of full-adder elements per block is equal to the square root of the number of bits being added.

 The propagation delay P grows as $\sqrt{2N}$, where N is the number of bits of the adder.
 Below is the basic building block of a carry-select adder, where the block size is 4.
 Two 4-bit ripple carry adders are multiplexed together, where the resulting carry and sum bits are selected by the carry-in.


Figure: Building blocks of a carry-select adder


Uniform-sized adder:
 A 16-bit carry-select adder with a uniform block size of 4 can be created with three of these blocks and a 4-bit ripple carry adder.
 Since the carry-in is known at the beginning of the computation, a carry-select block is not needed for the first four bits.
 The delay of this adder will be four full adder delays plus three MUX delays.
 $t_{adder} = t_{setup} + M\,t_{carry} + (N/M)\,t_{mux} + t_{sum}$

Figure: general structure of 16 bit adder


Disadvantage: hardware cost is increased.
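The select mechanism of the uniform 16-bit adder can be sketched behaviourally as follows. As a simplification, each 4-bit ripple block is modelled with ordinary integer addition rather than gate by gate; the names `ripple_block` and `carry_select_add` are made up for this example.

```python
def ripple_block(a, b, cin, width):
    """Behavioural stand-in for a width-bit ripple carry adder; returns (sum, cout)."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width


def carry_select_add(a, b, n=16, block=4, cin=0):
    """Add two n-bit values with uniform carry-select blocks; returns (sum, cout)."""
    mask = (1 << block) - 1
    result, carry = 0, cin
    for shift in range(0, n, block):
        a_blk, b_blk = (a >> shift) & mask, (b >> shift) & mask
        if shift == 0:
            # The carry-in is known from the start, so one adder is enough.
            s, carry = ripple_block(a_blk, b_blk, carry, block)
        else:
            s0, c0 = ripple_block(a_blk, b_blk, 0, block)   # assume carry-in = 0
            s1, c1 = ripple_block(a_blk, b_blk, 1, block)   # assume carry-in = 1
            s, carry = (s1, c1) if carry else (s0, c0)      # 2:1 multiplexer
        result |= s << shift
    return result, carry
```

Both candidate results of a block are ready before its real carry-in arrives, so each later block adds only one multiplexer delay; the duplicated adders are the hardware cost mentioned above.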

(iii) Carry Save Adder:

 A carry save adder is similar to a full adder. It is used when adding multiple numbers.
 All the bits of a carry save adder work in parallel.
 In a carry save adder, the carry does not propagate, so it is faster than a carry propagate adder.
 It has three inputs and produces two outputs. The carry-out is saved rather than used immediately to find the final sum value.

Figure: Carry Save Adder
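A small sketch of the idea, working on whole words at once: every bit position acts as an independent full adder, producing a sum word and a carry word whose total equals the total of the three inputs. The function names are hypothetical, and Python integers stand in for fixed-width registers.

```python
def carry_save_add(x, y, z):
    """3-input, 2-output step: returns (sum_word, carry_word)
    such that x + y + z == sum_word + carry_word."""
    sum_word = x ^ y ^ z                          # per-bit sum, no propagation
    carry_word = ((x & y) | (y & z) | (x & z)) << 1   # saved carries, one weight up
    return sum_word, carry_word


def add_many(values):
    """Add a list of non-negative integers using carry-save steps."""
    s, c = 0, 0
    for v in values:
        s, c = carry_save_add(s, c, v)
    return s + c    # a single carry-propagate addition at the very end
```

Only the final `s + c` needs a carry-propagate adder, which is why carry save adders are attractive when many operands (for example partial products) are summed.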


******************************************************************************
V. ACCUMULATOR:
Briefly discuss about accumulators.

 The accumulator is part of the ALU and is identified as register A. The result of an operation performed in the ALU is stored in the accumulator.
 It is used to hold the data for manipulation (arithmetic and logical).
 Arithmetic functions are very important in VLSI. Ex: multiplication.
 A half adder circuit has two inputs and two outputs: $S = x \oplus y$, $C = x \cdot y$.

Figure: Half adder and Truth table


 Full adder circuit has three inputs and two outputs

Figure : Full adder and truth table


CPL --- Complementary Pass Logic

Figure : CPL Full adder design


******************************************************************************
VI. MULTIPLIERS:


 Explain the design and operation of 4 x 4 multiplier circuit. (Apr. 2016, 2017, Nov 2016, 2018)
 Design a multiplier for 5 bit by 3 bit. Explain its operation and summarize the numbers
of adders. Discuss it over Wallace multiplier. (Nov 2017, April 2018)



 A study of computer arithmetic processes will reveal that the most common requirements
 are for addition and subtraction.
  There is also a significant need for a multiplication capability.
 Basic operations in multiplication are given below.
0 x 0 = 0, 0 x 1 = 0, 1 x 0 = 0, 1x1=1

        1 0 1 0 1 0    Multiplicand
x           1 0 1 1    Multiplier
  -----------------
        1 0 1 0 1 0
      1 0 1 0 1 0
    0 0 0 0 0 0        Partial products
  1 0 1 0 1 0
  -----------------
  1 1 1 0 0 1 1 1 0    Result
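The worked example (101010 x 1011) can be reproduced with a short shift-and-add sketch for unsigned operands. The function name `shift_add_multiply` is hypothetical; the routine also returns the shifted partial products so they can be compared with the rows above.

```python
def shift_add_multiply(multiplicand, multiplier):
    """Unsigned multiplication by shift-and-add.

    Returns (product, partials), where partials[i] is the partial product
    for multiplier bit i, already shifted left by i positions.
    """
    product = 0
    partials = []
    shift = 0
    while multiplier >> shift:
        bit = (multiplier >> shift) & 1
        # Multiplying by a single bit gives either 0 or the multiplicand.
        partial = (multiplicand if bit else 0) << shift
        partials.append(partial)
        product += partial
        shift += 1
    return product, partials
```

For the example, `shift_add_multiply(0b101010, 0b1011)` returns the product 0b111001110 (462 in decimal) with partial products 42, 84, 0 and 336.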

 If two different 4-bit numbers (x0, x1, x2, x3 and y0, y1, y2, y3) are multiplied, then

Multiplication by shifting:
 Let x = (0010)2 = (2)10.
 To multiply by 2, shift x left by one position: x = (0100)2 = (4)10.
 To divide by 2, shift x right by one position: x = (0001)2 = (1)10.
 So a shift register can be used for multiplication or division by 2.

 A practical implementation is based on this sequence: the product is obtained by successive addition and shift-right operations.
(i) Array multiplier:

Figure: General block diagram of multiplier


  Array multiplier uses an array of cells for calculation.
 The multiplier circuit is based on a repeated addition and shifting procedure. Each partial product is generated by multiplying the multiplicand by one multiplier digit.
 The partial products are shifted according to their bit positions and then added.
 N-1 adders are required, where N is the number of multiplier bits.
 The method is simple, but using ripple carry adders in the array multiplier gives a high delay and a large area. The product expression is given below.

Figure: 4 x 4 array multiplier


 This multiplier can accept all the inputs at the same time. An array multiplier for n-bit words needs n(n-2) full adders, n half adders and n² AND gates.

Figure: 4 x 4 array multiplier using full adders, half adders and AND gates (inputs X3..X0 and Y3..Y0, product bits Z7..Z0; adder rows HA FA FA HA, FA FA FA HA, FA FA FA HA)

(iv) Booth (encoding) multiplier:


 Booth's algorithm gives an efficient hardware implementation of a digital circuit that multiplies two binary numbers in two's complement notation.

 Booth multiplication allows for smaller, faster multiplication circuits by recoding the numbers that are multiplied.
 Booth multipliers are widely used in ASIC-oriented products due to their higher computing speed and smaller area.
 In the binary number system, the digits, called bits, are limited to the set {0, 1}.
 The result of multiplying any binary number by a single bit is either 0 or the original number.
 This makes the formation of partial products simple and efficient.
 Adding all these partial products, however, is the time-consuming task for any binary multiplier.
 The entire process consists of three steps: partial product generation, partial product reduction and addition of partial products, as shown in the figure.

Figure: Block diagram of Booth multiplier

 In Booth multiplication, partial product generation is based on a recoding scheme, e.g. radix-2 encoding.
 Bits of the multiplicand (Y) are grouped from left to right and the corresponding operation on the multiplier (X) is done in order to generate the partial product.
 In radix-2 Booth multiplication, partial product generation is based on the encoding given in the table.

Table: Booth encoding table with RADIX-2

  RADIX-2 PROCEDURE:
1) Append a 0 to the right of the LSB of the multiplier and pair the bits in overlapping groups of 2 from right to left, as shown in the figure.

Figure: 2- Bit pairing as per Booth recoding using Radix- 2.

2) 00 and 11: do nothing, according to the encoding table.
3) 01: marks the end of a string of 1s; add the multiplicand to the partial product.
4) 10: marks the beginning of a string of 1s; subtract the multiplicand from the partial product.
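The radix-2 procedure can be sketched in Python as follows. This is a behavioural illustration only: it uses Python's signed integers instead of fixed-width registers, takes the multiplier as an n-bit two's complement value, and the function name `booth_radix2` is hypothetical.

```python
def booth_radix2(multiplicand, multiplier, n):
    """Multiply using radix-2 Booth recoding of an n-bit two's complement multiplier."""
    x = multiplier & ((1 << n) - 1)   # n-bit pattern of the multiplier
    product = 0
    prev = 0                          # the 0 appended to the right of the LSB
    for i in range(n):
        cur = (x >> i) & 1
        if (cur, prev) == (0, 1):     # pair 01: end of a string of 1s, add
            product += multiplicand << i
        elif (cur, prev) == (1, 0):   # pair 10: beginning of a string of 1s, subtract
            product -= multiplicand << i
        # pairs 00 and 11: do nothing
        prev = cur
    return product
```

For example, with n = 4 the multiplier 7 (0111) is handled as one subtraction at bit 0 and one addition at bit 3, i.e. 8 - 1, instead of three additions.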

Modified Booth Multiplier using Radix -4:


 The disadvantage of the radix-2 Booth multiplier is the large number of partial products.
 The modified Booth multiplier with radix-4 reduces the number of partial products by half.
 Modified Booth multiplication is a technique that allows for smaller, faster circuits by recoding the numbers that are multiplied.
 In radix-4, the multiplicand is encoded based on the multiplier bits, which are compared 3 bits at a time with an overlapping technique.
 Grouping starts from the LSB; the first block contains only two bits of the multiplier and assumes a zero for the third bit.

Figure. Grouping of 3-bit as per booth recoding

 Each group of binary digits is recoded according to the Modified Booth Encoding Table into one of the numbers from the set (-2, 2, 0, 1, -1).

Table: Booth encoding table with RADIX-4


  RADIX-4 PROCEDURE:
1) Append a 0 to the right of the LSB of the multiplier.
2) Extend the sign bit by 1 position if necessary to ensure that n is even.
3) According to the value of each vector, each partial product is one of the multiples (-2, 2, 0, 1, -1) of the multiplicand.
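A sketch of the radix-4 recoding is given below. The lookup table is the standard modified Booth encoding (digit = -2.x(i+1) + x(i) + x(i-1) for each overlapping 3-bit group); it is written out here as an assumption because the table itself is not reproduced in the text, and the names `RADIX4_TABLE`, `booth_radix4_digits` and `booth_radix4` are hypothetical.

```python
# Standard modified Booth table: (x[i+1], x[i], x[i-1]) -> recoded digit
RADIX4_TABLE = {
    (0, 0, 0): 0, (0, 0, 1): 1, (0, 1, 0): 1, (0, 1, 1): 2,
    (1, 0, 0): -2, (1, 0, 1): -1, (1, 1, 0): -1, (1, 1, 1): 0,
}


def booth_radix4_digits(multiplier, n):
    """Recode an n-bit two's complement multiplier (n even) into n/2 digits,
    least significant group first."""
    x = (multiplier & ((1 << n) - 1)) << 1   # append a 0 to the right of the LSB
    digits = []
    for i in range(0, n, 2):
        group = ((x >> (i + 2)) & 1, (x >> (i + 1)) & 1, (x >> i) & 1)
        digits.append(RADIX4_TABLE[group])
    return digits


def booth_radix4(multiplicand, multiplier, n):
    """Sum the n/2 partial products, each a multiple of the multiplicand
    weighted by a power of 4."""
    return sum(d * multiplicand * 4 ** k
               for k, d in enumerate(booth_radix4_digits(multiplier, n)))
```

A 4-bit multiplier thus yields only two partial products; for instance 7 (0111) recodes to the digits -1 and 2, i.e. 2 x 4 - 1.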

(v) Wallace tree Multiplier:

 A Wallace tree is an efficient hardware implementation of a digital circuit that multiplies two integers.
 The Wallace tree multiplier has three steps:
 (a) Multiply each bit of one of the arguments by each bit of the other, yielding n² results.
 (b) Reduce the number of partial products to two by layers of full and half adders.
 (c) Group the wires into two numbers and add them with a conventional adder.
 The second step works as follows:
(a) Take any three wires with the same weight and input them into a full adder. The result is an output wire of the same weight and an output wire with a higher weight for each three input wires.
(b) If there are two wires of the same weight left, input them into a half adder.
(c) If there is just one wire left, connect it to the next layer.
 The Wallace tree multiplier has a tree-style output structure. It reduces the number of components and reduces the area.
 The architecture of a 4 x 4 Wallace tree multiplier is shown in the figure.
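The three steps can be modelled in Python by keeping, for every bit weight, a list of the wires that carry that weight. This is a simplified sketch for unsigned operands, not a gate-accurate netlist, and the function name `wallace_multiply` is hypothetical.

```python
def wallace_multiply(x, y, n):
    """Multiply two unsigned n-bit numbers with a Wallace-style reduction."""
    # Step 1: n*n AND terms, collected per column (bit weight).
    columns = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            columns[i + j].append(((x >> i) & 1) & ((y >> j) & 1))

    # Step 2: layers of full and half adders until each column has at most 2 wires.
    while max(len(col) for col in columns) > 2:
        nxt = [[] for _ in range(len(columns) + 1)]
        for w, col in enumerate(columns):
            k = 0
            while len(col) - k >= 3:                 # full adder on three wires
                a, b, c = col[k:k + 3]
                nxt[w].append(a ^ b ^ c)
                nxt[w + 1].append((a & b) | (b & c) | (a & c))
                k += 3
            if len(col) - k == 2:                    # half adder on two wires
                a, b = col[k:k + 2]
                nxt[w].append(a ^ b)
                nxt[w + 1].append(a & b)
            elif len(col) - k == 1:                  # single wire goes to next layer
                nxt[w].append(col[k])
        columns = nxt

    # Step 3: group the wires into two numbers and add them conventionally.
    row_a = sum(col[0] << w for w, col in enumerate(columns) if len(col) >= 1)
    row_b = sum(col[1] << w for w, col in enumerate(columns) if len(col) == 2)
    return row_a + row_b
```

Every adder preserves the total weighted value of its wires, so the final two-row addition yields the product regardless of how the wires are grouped within a column.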

******************************************************************************
VII. DIVIDERS

Explain in detail about the design and procedure for dividers.

 There are two types of dividers: serial dividers and parallel dividers. A serial divider is slow and a parallel divider is fast.
 Generally, division is done by repeated subtraction. If 10/3 is to be performed, then
10 - 3 = 7 (divisor is 3, dividend is 10)
7 - 3 = 4
4 - 3 = 1
 Here, after 3 subtractions the remainder is 1, which is less than the divisor, so the subtraction stops. The quotient is 3, the number of subtractions.
 Consider an example of binary division using the 1's complement method:
1010 (10d) / 0011 (3d)
Step 1: Find the 1's complement of the divisor.
Step 2: Add it to the dividend.
Step 3: If the carry is 1, add it to the result to get the difference.
Step 4: Repeat the same procedure until the carry is 0.
Step 5: Then the process is stopped.
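The five steps can be sketched as below for unsigned n-bit operands. One detail goes beyond the steps as written and is an assumption of this sketch: when the current value equals the divisor, the 1's complement addition gives all ones with no carry (a "negative zero"), so that case is counted as one more subtraction with remainder 0. The function name `divide_ones_complement` is hypothetical.

```python
def divide_ones_complement(dividend, divisor, n=4):
    """Divide by repeated subtraction using 1's complement addition.

    Both operands are assumed to fit in n bits. Returns (quotient, remainder).
    """
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    mask = (1 << n) - 1
    comp = ~divisor & mask               # Step 1: 1's complement of the divisor
    remainder, quotient = dividend & mask, 0
    while True:
        total = remainder + comp         # Step 2: add to the current dividend
        if total >> n:                   # carry = 1: remainder > divisor
            remainder = (total & mask) + 1   # Step 3: end-around carry
            quotient += 1
        elif total == mask:              # all ones: remainder == divisor (assumed handling)
            remainder = 0
            quotient += 1
        else:                            # Steps 4-5: carry = 0, stop
            break
    return quotient, remainder
```

For 1010 / 0011 the loop performs three subtractions (10 to 7 to 4 to 1) and returns quotient 3 and remainder 1.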

 The basic building blocks of the serial divider are given below.


1. 4 bit adder
2. 4 bit binary up counter
3. 2:1 MUX (4 MUXs are used)
4. D flipflop

 Y0 Y1 Y2 Y3 are complemented and given to the 4-bit adder block (figure shown below).
 X0 X1 X2 X3 are given to the MUXs, and the MUX outputs are given to the D flip-flops. The select signal of the MUX is high. It is connected to the clear input of the counter.
 The carry output of the adder is connected to the clock enable pin of the counter. The same signal is given to an OR gate, whose output drives the clock enable of the flip-flops.
 The other input of the OR gate is tied to the select signal of the MUX.
 If X > Y, C0 of the adder is high.
 After the first subtraction, the counter output is incremented by 1.
 For each subtraction, the counter output is incremented.
 If C0 of the adder is low, the clocks of the counter and the flip-flops are disabled and counting stops.
 Q3 Q2 Q1 Q0 is the counter output (quotient).
 R3 R2 R1 R0 is the flip-flop output (remainder).
******************************************************************************
VIII. SHIFT REGISTERS:

Design 4 input and 4 output barrel shifter using NMOS logic. (NOV 2018).

 An n-bit rotation is specified by the control word R0-n, and the L/R bit defines a left or right shift.

 For example, let y3 y2 y1 y0 = a3 a2 a1 a0.
If it is rotated 1 bit to the left, we get y3 y2 y1 y0 = a2 a1 a0 a3.
If it is rotated 1 bit to the right, we get y3 y2 y1 y0 = a0 a3 a2 a1.
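The rotation example can be expressed as a short behavioural sketch. The function name `rotate` and its parameter names are hypothetical; `left=True` plays the role of the L/R bit.

```python
def rotate(word, r, left, n=4):
    """Rotate an n-bit word by r positions; left=True rotates towards the MSB."""
    mask = (1 << n) - 1
    word &= mask
    r %= n
    if not left:
        r = (n - r) % n    # a right rotation by r equals a left rotation by n - r
    # Bits shifted out at the top re-enter at the bottom, so no bits are lost.
    return ((word << r) | (word >> (n - r))) & mask
```

With a3 a2 a1 a0 = 1000, a 1-bit left rotation gives 0001 (a3 moves to y0), and rotating 0001 right by 1 bit gives 1000 again.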
Barrel Shifter:
 A barrel shifter is a digital circuit that can shift a data word by a specified number of bits in
 one clock cycle.
 It can be implemented as a sequence of multiplexers (MUX), and in such an implementation
the output of one MUX is connected to the input of the next MUX in a way that depends on
the shift distance.

 For example, take a four-bit barrel shifter, with inputs A, B, C and D. The shifter can cycle
 the order of the bits ABCD as DABC, CDAB, or BCDA; in this case, no bits are lost.
 That is, it can shift all of the outputs up to three positions to the right (thus make any cyclic
 combination of A, B, C and D).
 The barrel shifter has a variety of applications, including being a useful component in
microprocessors (alongside the ALU).

Figure: 8 X 4 barrel shifter


 The general symbol for the barrel shifter is shown in the figure. The outputs are y3 y2 y1 y0.
 S0, S1, S2, S3 are known as shift lines.
 A barrel shifter is often implemented as a cascade of parallel 2×1 multiplexers.
 For an 8-bit barrel shifter, two intermediate signals are used, which shift by four and two bits or pass the same data, based on the values of S[2] and S[1].
 This signal is then shifted by another multiplexer, which is controlled by S[0].
 A common usage of a barrel shifter is in the hardware implementation of floating-point
arithmetic.

Figure: Barrel Shifter


 A floating-point add or subtract operation requires shifting the smaller number to the right, increasing its exponent, until it matches the exponent of the larger number.
 This is done by using the barrel shifter to shift the smaller number to the right by the difference, in one cycle.
 If a simple shifter were used, shifting by n bit positions would require n clock cycles.
 The disadvantages of the FET array barrel shifter are the threshold voltage drop and the parasitic-limited switching time.
 The figure shows an 8 x 4-bit barrel shifter circuit.

Logarithmic Shifter:
 A shifter with a maximum shift width of M consists of $\log_2 M$ stages, where the i-th stage either shifts over $2^i$ or passes the data unchanged.
 A shifter with a maximum shift value of seven bits is shown in the figure. To shift over five bits, the first stage is set to shift mode, the second to pass mode and the last again to shift mode.
 The speed of the logarithmic shifter depends on the shift width in a logarithmic way: an M-bit shifter requires $\log_2 M$ stages.
 The series connection of pass transistors slows the shifter down for larger shift values.
 Advantage: the logarithmic shifter is more effective for larger shift values in terms of both area and speed.
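The stage structure of the logarithmic shifter can be sketched as follows. The sketch assumes a logical right shift and a power-of-two width (the text does not fix the direction), and the function name `log_shift_right` is hypothetical.

```python
def log_shift_right(word, shift, width=8):
    """Logical right shift built from log2(width) shift-or-pass stages."""
    stages = width.bit_length() - 1      # log2(width) for a power-of-two width
    word &= (1 << width) - 1
    for i in range(stages):
        if (shift >> i) & 1:             # stage i in shift mode: move by 2**i
            word >>= 1 << i
        # otherwise stage i is in pass mode and leaves the data unchanged
    return word
```

A shift of five (binary 101) sets stage 0 to shift by 1, stage 1 to pass, and stage 2 to shift by 4, matching the description above.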

******************************************************************************
IX. SPEED AND AREA TRADE OFF:

Discuss the details about speed and area trade off. (May 2017)

Adder:
  The tradeoff in terms of power and performance is shown below.
  The performance is represented in terms of the delay(speed).
 The area estimations for each of the delays are given based on the fact that area is in
 relation to the power consumption.
 The area of a carry lookahead adder is larger than the area of a ripple carry for a
particular delay.
 This is because the computations performed in a carry lookahead adder are parallel,
which requires a larger number of gates and also results in a larger area.
CLA –Carry Lookahead Adder, RC, R – Ripple carry adder

Figure: Area Vs Delay for 8 bit adder Figure: Area Vs Delay for 16 bit adder

Figure: Area Vs delay for 32 bit adder Figure: Area Vs delay for 64 bit adder

Figure: Delay Vs Area for all adders Figure: Area Vs Delay for all multiplier
 The above figures show that the delay of the ripple carry adder increases much faster than that of the carry lookahead adder as the number of bits is increased.
 In the carry lookahead adder, the cost is in terms of area because the computations are done in parallel, and therefore more power is consumed for a specific delay.
Memory Architecture and Building Blocks:

Explain the memory architecture and its control circuits in detail. (April 2018)
When an n x m memory is implemented, the n memory words are arranged in a linear fashion.
One word is selected at a time by using a select line.
To implement an 8 x 8 memory, n = 8 (number of words) and m = 8 (number of bits per word).
Then we need 8 select signals (one for each word).
By using a decoder we can reduce the number of select signals.
With a 3-to-8 decoder, 3 address inputs produce the 8 select signals.
If n = 2^20, only 20 address inputs need to be given to the decoder.

Figure: Array structured memory organization

 If the basic storage cell is approximately square, the linear arrangement is extremely slow: the vertical wires that connect the storage cells to the I/O become excessively long.
 So memory arrays are organized in such a way that the vertical and horizontal dimensions are about the same.
 Multiple words are stored in a row. These words are selected simultaneously.
 The column decoder is used to route the correct word to the I/O terminals.
 The row address is used to select one row of memory, and the column address is used to select a particular word from that selected row.
 Word line: the horizontal select line used to select a single row of cells.
 Bit line: the wire that connects the cells in a single column to the input/output circuit.
 Sense amplifier: amplifies the small internal swing to full rail-to-rail amplitude.
 Block address: the memory is divided into several small blocks. The address used to select one of the small blocks to be read or written is known as the block address.
  Advantages:
Access time is fast
Power saving is good, because blocks not activated are in power saving mode.
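The row/column addressing of the array-structured memory can be illustrated with a small sketch. It assumes, purely for illustration, that the low-order address bits form the column address and the high-order bits the row address; the function names `split_address` and `read_word` are hypothetical.

```python
def split_address(address, row_bits, col_bits):
    """Split an address into (row, column) fields.

    Assumption: the column address occupies the low-order bits.
    """
    col = address & ((1 << col_bits) - 1)
    row = (address >> col_bits) & ((1 << row_bits) - 1)
    return row, col


def read_word(array, address, row_bits, col_bits):
    """Read one word from a 2-D array of words."""
    row, col = split_address(address, row_bits, col_bits)
    selected_row = array[row]      # row decoder: the word line selects a whole row
    return selected_row[col]       # column decoder: routes one word to the I/O
```

With 3 row bits and 2 column bits, 32 words are stored as 8 rows of 4 words, so the row decoder drives only 8 word lines instead of 32 select lines.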

Figure: Hierarchical memory architecture

4.9.3 Memory Core


Discuss the memory core and its types in detail.

4.9.3.1 Read Only Memory (ROM):


ROM is a memory where code is written only one time.
Diode ROM:
 It is simple: the presence of a diode between the bit line and the word line is considered logic 1 and the absence of a diode logic 0.
 Disadvantages: it is suitable only for small memories, and there is no isolation between the word line and the bit line.

Figure: Diode ROM


MOS ROM:
 The diode is replaced by the gate-source connection of an nMOS transistor. The drain is connected to VDD.
 The charging and discharging of the word line capacitance is taken care of by the word line driver.

 The absence of a transistor between the word line and the bit line means logic 1 is stored; if a transistor is present, logic 0 is stored.

Figure: 4 x 4 OR ROM cell array Figure: 4 x 4 MOS NOR ROM

Programming ROM
 The transistor at the intersection of a row and a column is OFF when the associated word line is LOW. In this condition, we get a logic 1 output.

Figure: 4 x 4 MOS NAND ROM


Advantage: the basic cell consists only of a transistor. No connection to any of the supply voltages is needed.
Disadvantage: as it uses pseudo-nMOS, it is ratioed logic and consumes static power.

 To overcome this, a precharged MOS NOR ROM logic circuit is used.

 This eliminates the static dissipation and the ratioed logic requirement.
4.9.3.2 Non-Volatile READ-WRITE Memory:
 It consists of array of transistors. We can write the program by enabling or disabling these
devices
 selectively.
 To reprogram, the programmed values to be erased, then the new programming is started.

Floating gate transistor:

• It is used in almost all reprogrammable memories.
• In a floating gate transistor, an extra polysilicon strip, known as the floating gate, is inserted between the gate and the channel.
• The floating gate doubles the gate oxide thickness; hence the device transconductance is reduced and the threshold voltage is increased.
• The threshold voltage is programmable.
• If a high voltage (>10 V) is applied between the source and the gate-drain terminals, a high electric field is generated and avalanche injection occurs.
• Electrons acquire enough energy to become hot and traverse the first oxide insulator. They get trapped on the floating gate.
• The floating gate transistor is known as the floating gate avalanche injection MOS (FAMOS) transistor.
Disadvantage: A high programming voltage is needed.

Figure (a) floating gate transistor (b) symbol


EPROM – Erasable Programmable Read Only Memory:

• Erasing is done by shining UV light on the cells through a transparent window.
• This process takes from a few seconds to several minutes, depending on the intensity of the UV source.
• Programming takes 5-10 microseconds per word.
• During programming, the chip is removed from the board and placed in an EPROM programmer.
• Advantages: The cell is simple, and large memories can be fabricated at low cost.
Disadvantages:
• The number of erase/program cycles is limited to about 1000.
• Reliability is not good.
• The threshold voltage of the device may vary with repeated programming.

EEPROM – E²PROM:
• Electrically Erasable Programmable ROM. Here a floating gate tunneling oxide (FLOTOX) transistor is used.
• It is similar to the floating gate transistor, except that a portion of the dielectric separating the floating gate from the channel is reduced to a thickness of 10 nm or less.
• If about 10 V is applied, electrons travel to and from the floating gate through Fowler-Nordheim tunneling.
• Erasing is done by reversing the voltage applied during writing.

Figure: FLOTOX transistor


Advantage: High versatility; about 10^5 erase/write cycles are possible.
Disadvantages: Larger than the FAMOS transistor; costly; repeated programming causes a drift in the threshold voltage.

Flash Memory – Flash Electrically Erasable Programmable ROM

• It combines the density of EPROM with the versatility of EEPROM.
• The avalanche hot-electron injection mechanism is used.
• Erasing is done by Fowler-Nordheim tunneling. Here erasing is done in bulk.

Figure: ETOX device

• It is similar to the FAMOS gate.
• It has a very thin tunneling oxide layer (10 nm thick).
• Erase operation: The gate is connected to ground and the source is connected to 12 V.
• Write operation: A high-voltage pulse is applied to the gate of the selected device. Logic 1 is applied to the drain, and hot electrons are injected into the floating gate.
• Read operation: To select a cell, its word line is connected to 5 V. This causes a conditional discharge of the bit line.
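The erase, write, and read operations can be pictured with a small behavioural model in which programming raises a cell's threshold voltage above the read voltage. The two threshold values and the class name are assumptions made for this sketch; only the 5 V read voltage comes from the text.

```python
V_READ = 5.0          # word-line voltage during a read (from the text)
VT_ERASED = 1.0       # assumed threshold of an erased cell, in volts
VT_PROGRAMMED = 7.0   # assumed threshold after hot-electron injection

class NorFlashBlock:
    def __init__(self, n_cells):
        self.vt = [VT_ERASED] * n_cells

    def erase(self):
        """Erase is done in bulk: every cell in the block loses its stored charge."""
        self.vt = [VT_ERASED] * len(self.vt)

    def program(self, index):
        """Trapped electrons on the floating gate raise the cell's threshold."""
        self.vt[index] = VT_PROGRAMMED

    def bit_line_discharges(self, index):
        """The bit line discharges only if the selected cell conducts at V_READ."""
        return V_READ > self.vt[index]
```

The model returns whether the bit line discharges rather than a logic value, because the mapping from discharge to 0 or 1 depends on the read circuitry.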

Figure: (a) Erase (b) Write (c) Read operation of NOR flash memory

4.9.3.3 RAM – Random Access Memory
Explain about static and dynamic RAM.
Construct 6T based SRAM cell. Explain its read and write operations. (NOV 2018)

4.9.3.3.1 Static RAM:

• An SRAM cell needs 6 transistors per bit.
• Transistors M5 and M6 are shared between the read and write operations.
• Both the bit line (BL) and the inverse bit line signals are used, which improves the noise margin during read and write operations.
Read operation:
• Assume that logic 1 is stored at Q, and that BL and inverse BL are precharged to 2.5 V before the read operation starts.
• The read cycle is started by asserting the word line, which enables transistors M5 and M6.
• After a small initial word line delay, the values stored at Q and inverse Q are transferred to the bit lines: BL is left at 2.5 V, and inverse BL is discharged through M5 and M1.

Figure: CMOS SRAM cell


Write operation:
• Assume that Q = 1 and that logic 0 is to be written into the cell.
• Inverse BL is set to 1 and BL is set to 0.
• The gate of M1 is at VDD and the gate of M4 is at ground as long as switching has not commenced.
• The inverse Q node cannot be pulled high enough to write logic 1 from that side; its voltage is kept below 0.4 V.
• The new value of the cell is therefore written through M6.
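The read and write sequences can be summarised in a purely behavioural sketch. It idealises the read as a full discharge of one bit line (in practice a sense amplifier detects a small swing) and ignores transistor sizing; the class name is hypothetical.

```python
V_PRE = 2.5  # bit-line precharge voltage used in the text

class Sram6T:
    def __init__(self, q=0):
        self.q = q

    def read(self):
        """Return (BL, BL_bar) after the word line is asserted."""
        bl, bl_bar = V_PRE, V_PRE
        if self.q == 1:
            bl_bar = 0.0   # inverse Q = 0 discharges the inverse bit line
        else:
            bl = 0.0       # Q = 0 discharges BL on the other side of the cell
        return bl, bl_bar

    def write(self, value):
        """Drive BL = value and inverse BL = its complement, then assert the word line."""
        self.q = value
```

A stored 1 leaves BL at its precharge level and pulls the inverse bit line down; writing a 0 and reading again swaps the two.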

4.9.3.3.2 Dynamic RAM:


Three transistors DRAM
• The cell has no load to replenish charge lost through leakage, so its content must be rewritten periodically; this is called the refresh operation.
• The refresh occurs every 1-4 ms.
• For example, to write logic 1, BL1 is driven high and the write word line (WWL) is asserted.
• The data is retained as charge on the capacitor once WWL goes low.

Figure: Three transistor dynamic memory cell
• To read the cell, the read word line (RWL) is raised. Transistor M2 is either ON or OFF, depending on the stored value.
• The bit line BL2 is either connected to VDD or precharged to VDD or VDD-Vt.
• When logic 1 is stored, the series combination of M2 and M3 pulls BL2 low.
• If logic 0 is stored, BL2 stays high.
• To refresh the cell, the stored data is first read, its inverse is placed on BL1, and WWL is asserted.
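Because the 3T cell read is inverting, the refresh step has to write back the inverse of what appears on BL2. A minimal behavioural sketch, with a hypothetical class name and ideal logic levels:

```python
class Dram3T:
    def __init__(self):
        self.stored = 0   # logic level held as charge on the storage node

    def write(self, bl1):
        """Assert WWL: the value on BL1 is copied onto the storage node."""
        self.stored = bl1

    def read(self):
        """Assert RWL with BL2 precharged high; return the level left on BL2."""
        # A stored 1 turns M2 on, so M2 and M3 pull BL2 low: the read is inverting.
        return 0 if self.stored == 1 else 1

    def refresh(self):
        """Read the cell, then write back the inverse of what appeared on BL2."""
        bl2 = self.read()
        self.write(1 - bl2)
```

After `refresh()` the stored value is unchanged, which is the point of inverting the read data before placing it on BL1.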
One transistor DRAM:
• To write logic 1 into this cell, the value is placed on the bit line and the word line is asserted high.
• The capacitor is charged or discharged depending on the data. Before a read operation is performed, the bit line is precharged.
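When the 1T cell is read, the storage capacitor shares charge with the precharged bit line. The notes do not give the formula; the sketch below uses the standard charge-sharing relation ΔV = (V_cell − V_pre) · C_S / (C_S + C_BL), and the capacitance and voltage values in the usage note are assumed, not taken from the notes.

```python
def bit_line_swing(v_cell, v_pre, c_s, c_bl):
    """Voltage change on the bit line after charge sharing with the cell.

    v_cell: voltage stored on the cell capacitor
    v_pre:  bit-line precharge voltage
    c_s:    storage capacitance, c_bl: bit-line capacitance (same units)
    """
    return (v_cell - v_pre) * c_s / (c_s + c_bl)
```

With an assumed 30 fF cell, a 300 fF bit line, and a 1.25 V precharge, a cell holding 2.5 V moves the bit line by roughly 0.11 V, which is why amplification is needed.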

Figure: One transistor DRAM

4.9.3.3.3 CAM – Content Addressable or Associative Memory

Explain about CAM.

• It supports 3 operating modes:
  • Read
  • Write
  • Match
• In this memory, all the stored data can be compared in parallel with the incoming data. It is not power efficient.
• The figure shows a possible implementation of a CAM array.
• The cell combines a traditional 6T RAM storage cell (M4-M9) with additional circuitry that performs a 1-bit digital comparison (M1-M3).
• When the cell is to be written, complementary data is forced onto the bit lines while the word line is enabled, as in a standard SRAM cell.

• In the compare mode, the stored data are compared with the incoming data on the bit lines. The match line is connected to all CAM cells in a row and is initially precharged to VDD.
• If a bit matches, the internal node of that cell is discharged and the match line is left high. If even one bit in a row is mismatched, the match line is pulled low.
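The match operation can be sketched as a parallel comparison in which each row's match line starts high and is pulled low by any mismatching cell. The function name and the list-of-bits representation are assumptions for illustration.

```python
def match_lines(stored_rows, search_word):
    """Return one flag per row; True means the match line stayed high."""
    results = []
    for row in stored_rows:
        match_line = True                 # precharged to VDD
        for stored_bit, search_bit in zip(row, search_word):
            if stored_bit != search_bit:  # any mismatching cell pulls the line low
                match_line = False
        results.append(match_line)
    return results
```

In hardware all rows are compared at the same time; the loop here only models the result.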

Figure: CAM cell

*****************************************************************************

4.10 Memory peripheral (control) Circuits:


Explain the memory architecture and its control circuits in detail. (April 2018)
(i) Address & Block Decoders:
Row Decoder:
• Row and column address decoders are used to select a particular memory location in an array.
• A row decoder is used to drive the NOR ROM array. It selects one of 2^n word lines.
• A dynamic 2-to-4 decoder reduces the number of transistors and the propagation delay.

Figure: Symbol and truth table
Figure: Dynamic 2-to-4 NOR decoder
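Functionally, a row decoder maps an n-bit address to 2^n word lines with exactly one asserted. A small sketch of that mapping, with a hypothetical function name:

```python
def row_decode(address, n_bits):
    """Return 2**n_bits word lines with exactly one asserted."""
    if not 0 <= address < (1 << n_bits):
        raise ValueError("address out of range")
    return [1 if i == address else 0 for i in range(1 << n_bits)]
```

For the 2-to-4 case, address 2 asserts only the third word line.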

Column Decoder
• It should match the bit line pitch of the memory array.
• In the column decoder, the decoder outputs are connected to nMOS pass transistors.
• With this circuit, one out of m pass transistors can be selectively driven.
• Only one nMOS pass transistor is ON at a time.

Figure: Four-input pass-transistor-based column decoder using a NOR predecoder
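The column decoder behaves like a one-of-m selector: the decoder produces one-hot select signals, and the single enabled pass transistor connects its bit line to the data output. A behavioural sketch with a hypothetical function name:

```python
def column_select(bit_lines, column_address):
    """Pass the addressed bit line to the data output through its pass transistor."""
    if not 0 <= column_address < len(bit_lines):
        raise ValueError("column address out of range")
    # One-hot select signals: only one nMOS pass transistor is ON at a time.
    selects = [1 if i == column_address else 0 for i in range(len(bit_lines))]
    return next(bl for bl, s in zip(bit_lines, selects) if s)
```

The sketch ignores the threshold drop across an nMOS pass transistor and models logic values only.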

(ii) Sense Amplifier

• Sense amplifiers play a major role in the functionality, performance and reliability of memory circuits.
• A basic differential sense amplifier circuit is shown in the figure below.
• It performs the following functions:

Amplification:
• In memory structures such as the 1T DRAM, amplification is required for proper functionality.
Delay reduction:
• The amplifier compensates for the limited fan-out driving capability of the memory cell by detecting and amplifying small transitions on the bit line to large signal output swings.
Power reduction:
• Reducing the signal swing on the bit lines can eliminate a large part of the power dissipation related to charging and discharging the bit lines.
(iii) Drivers/Buffers
• The length of the word and bit lines increases with increasing memory size.
• A large portion of the read and write access time can be attributed to the wire delays.
• A major part of the memory-periphery area is allocated to the drivers (address buffers and I/O drivers).

******************************************************************************
4.11: Low Power Memory design:
Discuss about Low power memory design.

(i) Active Power Reduction:

• Voltage reduction must be accompanied by an increase in the size of the storage capacitor and/or a reduction in noise.
Techniques for power reduction:
• Half-VDD precharge: Precharging the bit lines to VDD/2 helps to reduce the active power dissipation in DRAM memories by a factor of about 2.
• Boosted word line: Raising the word line above VDD during a write operation eliminates the threshold drop over the access transistor, yielding a substantial increase in stored charge.
• Increased capacitor area or value: Keeping the "ground" plate of the storage capacitor at VDD/2 reduces the maximum voltage over Cs, making it possible to use thinner oxides.
• Increasing the cell size: Ultra-low-voltage DRAM operation might require a sacrifice in area efficiency.
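The factor-of-2 saving quoted for half-VDD precharge can be checked to first order. The sketch assumes that the energy drawn from the supply to raise a capacitance C by ΔV is C · VDD · ΔV, and that returning the bit lines to VDD/2 by equalising them draws no supply energy. The capacitance and supply values are assumed for illustration.

```python
def supply_energy(c_bl, vdd, swing):
    """Energy drawn from VDD to raise a capacitance c_bl by `swing` volts."""
    return c_bl * vdd * swing

C_BL, VDD = 300e-15, 2.5   # assumed bit-line capacitance (F) and supply (V)

e_full = supply_energy(C_BL, VDD, VDD)       # bit line restored from 0 V to VDD
e_half = supply_energy(C_BL, VDD, VDD / 2)   # bit line driven from VDD/2 to VDD
ratio = e_full / e_half
```

Under these assumptions `ratio` comes out as 2; losses in real circuits make the practical saving somewhat smaller.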
Retention current reduction:
• An SRAM array should ideally have no static power dissipation. In practice, transistor leakage current is the major problem and the main source of the retention current.
• The retention current can be reduced by the following techniques:
1. Turn OFF unused memory blocks.
2. Apply a negative bias voltage to the cells that are not active, which reduces the leakage current.
3. Insert a low-threshold-voltage transistor between VDD and the SRAM array, which reduces leakage.
4. Leakage is a function of VDD, so lowering the supply rail reduces the leakage current.

Figure: (a) Insertion of low threshold device (b) Reducing supply Voltage
******************************************************************************
UNIT-V EC8095-VLSI DESIGN

UNIT V - IMPLEMENTATION STRATEGIES


FPGA Building Block Architectures, FPGA Interconnect Routing Procedures. Design for Testability:
Ad Hoc Testing, Scan Design, BIST, IDDQ Testing, Design for Manufacturability, Boundary Scan.

• Explain the reprogrammable device architecture with neat diagrams.
• With a neat diagram, explain the functional blocks in PDA (Programmable Device Architecture). (AU: June 2015, June 2016)
• With a neat sketch, explain the CLB, IOB and programmable interconnects of an FPGA device. (May 2016)
• Explain the building block architecture of FPGA. (April 2017, 2018, NOV 2018)

Re-Programmable Devices Architecture (FPGA)

• FPGAs are the next generation of programmable logic devices.
• The word Field refers to the ability of the gate array to be programmed for a specific function by the user.
• The word Array indicates a series of columns and rows of gates that can be programmed by the end user.
• Compared with standard gate arrays, field programmable gate arrays are larger devices.
• The basic cell structure of an FPGA is more complicated than that of a standard gate array.
• The programmable logic blocks of an FPGA are called Configurable Logic Blocks (CLBs).
• The FPGA architecture consists of three types of configurable elements:
(i) IOBs – Input/output blocks
(ii) CLBs – Configurable logic blocks
(iii) Resources for interconnection
• The IOBs provide a programmable interface between the internal array of logic blocks (CLBs) and the device's external package pins.
• CLBs perform user-specified logic functions.
• The interconnect resources carry signals among the blocks.
• A configuration program is stored in internal static memory cells.
• The configuration program determines the logic functions and the interconnections.
• The configuration data is loaded into the device at power-up or through a reprogramming function.
• FPGA devices are customized by loading configuration data into the internal memory cells.
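To make the idea of configuration data concrete, the sketch below models a logic element as a lookup table (LUT) whose truth table is held in memory cells. The notes do not say how a CLB is built internally; using a LUT here is an assumption, although it is a common structure in SRAM-based FPGAs. The class and method names are hypothetical.

```python
class Lut:
    """k-input lookup table whose truth table is held in configuration memory cells."""

    def __init__(self, k):
        self.k = k
        self.config = [0] * (1 << k)   # one memory cell per input combination

    def load(self, bits):
        """Load configuration data, as done at power-up or on reprogramming."""
        if len(bits) != 1 << self.k:
            raise ValueError("wrong number of configuration bits")
        self.config = list(bits)

    def evaluate(self, *inputs):
        """Use the inputs as an address into the configuration memory."""
        index = 0
        for bit in inputs:             # first input is the most significant address bit
            index = (index << 1) | bit
        return self.config[index]
```

Loading `[0, 1, 1, 0]` into a 2-input LUT makes it behave as XOR; loading `[0, 0, 0, 1]` into the same LUT turns it into AND without any change to the hardware.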
