Charge Sharing Prob
College of Engineering
Department of Electrical Engineering and Computer Sciences
a) Implement the logic shown below as a single complex, dynamic gate (with four
inputs) with a static inverter at the output. You should arrange the dynamic gate
such that the worst case drop in its output voltage due to charge sharing is
minimized.
Solution:
We place the transistors connected to inputs B and C at the bottom of the stack, so
that their source diffusion and gate capacitances never participate in charge
sharing with the output node: these nodes are always connected to ground during
the evaluation phase.
b) What pattern of the inputs A, B, C and D results in the worst-case drop in the
dynamic gate’s output voltage due to charge sharing? Assuming that VDD = 1.2V,
CG = 2fF/μm, CD = 1.5fF/μm, that the NMOS pull-down network in the dynamic
gate is sized to have the same worst-case resistance (with long-channel transistors)
as the pre-charge PMOS transistor, and that the input capacitance of the inverter is
4 times that of the dynamic gate, what is this worst-case dynamic gate output
voltage?
Solution:
In the evaluation phase of the previous cycle, B (or C) and D were on and
discharged nodes N and P to 0. In the precharge phase of the current cycle, A is off,
which keeps nodes N and P at 0 while node M is charged to Vdd. Then, in the
evaluation phase of the current cycle, if A = D = Vdd and B = C = 0, we get
the worst-case charge sharing between node M and nodes N and P.
The sizing of the NMOS transistors in the gate to satisfy the same worst-case
resistance (with long-channel transistors) as the pre-charge PMOS transistor is
shown in the above schematic. The loading capacitance from the inverter to the
first stage is $C_G \cdot 4W_N$.
To calculate the voltage drop on M, we first guess that transistors A and D are
still on after charge sharing. Then,
$$V_M = V_N = V_P = \frac{C_M V_{dd}}{C_M + C_N + C_P} = \frac{1.2 \times 10.25}{10.25 + 5 + 6.5} = 0.565\,\mathrm{V}$$
where,
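$$\begin{aligned}
C_M &= C_D (W_N + 0.5 W_N) + C_G \cdot 4W_N \\
    &= 1.5\,\mathrm{fF/\mu m} \cdot 1.5 W_N + 2\,\mathrm{fF/\mu m} \cdot 4W_N \\
    &= 10.25\,\mathrm{fF/\mu m} \cdot W_N \\
C_N &= C_D \cdot 2W_N + C_G \cdot W_N = 1.5\,\mathrm{fF/\mu m} \cdot 2W_N + 2\,\mathrm{fF/\mu m} \cdot W_N \\
    &= 5\,\mathrm{fF/\mu m} \cdot W_N \\
C_P &= C_D \cdot 3W_N + C_G \cdot W_N = 1.5\,\mathrm{fF/\mu m} \cdot 3W_N + 2\,\mathrm{fF/\mu m} \cdot W_N \\
    &= 6.5\,\mathrm{fF/\mu m} \cdot W_N
\end{aligned}$$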
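The charge-sharing arithmetic above can be checked numerically. The following is a minimal sketch, not part of the original solution; the function name `charge_share_voltage` is a hypothetical helper, and it assumes (as the solution does) that transistors A and D stay on so that M, N and P settle to the same voltage.

```python
# Charge-sharing check for node M, using the values given in the problem.
VDD = 1.2   # supply voltage, V
CG = 2.0    # gate capacitance, fF/um
CD = 1.5    # diffusion capacitance, fF/um


def charge_share_voltage(c_out, c_internal, vdd=VDD):
    """Final voltage when c_out (precharged to vdd) shares its charge with
    internal nodes that start at 0 V. Assumes every node settles to the
    same voltage, i.e. the connecting transistors stay on."""
    return vdd * c_out / (c_out + sum(c_internal))


# Node capacitances per unit of W_N (fF/um), as tallied in the solution.
c_m = CD * (1 + 0.5) + CG * 4   # output node M, including the inverter load
c_n = CD * 2 + CG * 1           # internal node N
c_p = CD * 3 + CG * 1           # internal node P

v_m = charge_share_voltage(c_m, [c_n, c_p])
print(f"C_M = {c_m}, C_N = {c_n}, C_P = {c_p} (fF/um, times W_N)")
print(f"V_M after charge sharing = {v_m:.4f} V")
```

Because every capacitance scales with W_N, the width cancels in the ratio and the result does not depend on the absolute transistor size.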
Consider the domino circuit above. Assume long-channel transistors, CL = 500fF, Cin =
4fF, CG = 2fF/μm, CD = 1.5fF/μm, and that input signal A is the last one to arrive.
a) Find the logical effort of each stage in the critical path for the evaluation edge
(rising edge of Out).
Solution:
We can get the LE of A by using LE=Cin,c/Cin,inv after sizing the reference
inverter to have the same resistance on the appropriate edge (note that two
reference inverters need to be used, one for each stage):
LE1 = 3/3=1
LE2 = (4+1)/(4+2)=5/6
Solution:
PE = (500/4) ⋅ (1 ⋅ 5/6) = 104.2
EF = (104.2)^(1/2) = 10.2
Now we can size the gates:
Cin,2 = (500fF / EF) ⋅ (5/6) = 40.84fF, so WN2 = 4.1μm
Cin,1 = 4fF, so WN1 = 2μm
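The sizing steps can be reproduced with a short script. This is only a sketch: `size_path` is a hypothetical helper, and the width conversion assumes that Cin,1 is the gate capacitance of the transistor driven by A and that the static inverter's PMOS is 4 times its NMOS (as suggested by the LE2 expression), so that Cin,2 = CG ⋅ 5 ⋅ WN2.

```python
import math

C_LOAD = 500.0       # load capacitance, fF
C_IN = 4.0           # input capacitance of the first stage, fF
CG = 2.0             # gate capacitance, fF/um
LE = [1.0, 5 / 6]    # logical efforts: dynamic gate (input A), static inverter


def size_path(c_load, c_in, le):
    """Return (path effort, effective fanout per stage, input capacitances)."""
    path_effort = (c_load / c_in) * math.prod(le)
    ef = path_effort ** (1 / len(le))
    # Work backwards from the load: Cin_i = LE_i * Cout_i / EF.
    cins = []
    c_out = c_load
    for g in reversed(le):
        c_out = g * c_out / ef
        cins.append(c_out)
    return path_effort, ef, cins[::-1]


pe, ef, (cin1, cin2) = size_path(C_LOAD, C_IN, LE)

# Width conversion under the assumptions stated above.
wn1 = cin1 / CG          # single NMOS gate driven by A
wn2 = cin2 / (CG * 5)    # inverter with Wp = 4 * Wn

print(f"PE = {pe:.1f}, EF = {ef:.1f}")
print(f"Cin,1 = {cin1:.2f} fF, WN1 = {wn1:.2f} um")
print(f"Cin,2 = {cin2:.2f} fF, WN2 = {wn2:.2f} um")
```

The recovered Cin,1 equals the given 4fF, which serves as a consistency check on the backward sizing pass.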
c) Estimate the delay of the critical path in FO4. Include the worst-case parasitic
delay terms. Recall that 1FO4 is equal to (4+γ)tinv.
Solution:
We remember that:
1FO4 = (4 + (1.5fF/um) / (2fF/um)) tinv= 4.75tinv
We can write the total delay of the chain as:
Delay = N⋅EF⋅tinv + Σ (pi)⋅tinv
where:
pi = LEi (Cint,i/Cin,i)
We already found the EF in part b), so now we only need to find the pi of each
stage.
Since A is the last input to arrive at the gate, the worst-case parasitic delay (largest
pi) at stage 1 occurs when B is always 0.
Therefore, p1 can be expressed as:
p1 = (3/3)* ( 2.5WN1CD)/(WN1CG) = 1.875
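Applying the same formula to the static inverter, and assuming its intrinsic
capacitance is the drain capacitance of both of its transistors, gives:
p2 = (5/6)* ( 5WN2CD)/(5WN2CG) = 0.625
which is consistent with the total:
tp = (2⋅10.2 + 1.875 + 0.625) tinv = 22.9 tinv = 4.82 tFO4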
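The delay estimate can be cross-checked with a few lines of code. This is a sketch under stated assumptions: `path_delay_tinv` is a hypothetical helper, and the inverter parasitic p2 is an assumed value obtained by applying pi = LEi (Cint,i/Cin,i) to an inverter whose intrinsic capacitance is the drain capacitance of both transistors. With that assumption the script reproduces the 4.82 FO4 figure.

```python
import math

CG = 2.0   # gate capacitance, fF/um
CD = 1.5   # diffusion capacitance, fF/um

gamma = CD / CG            # ratio of intrinsic to input capacitance
fo4_in_tinv = 4 + gamma    # 1 FO4 expressed in units of tinv


def path_delay_tinv(n_stages, ef, parasitics):
    """Path delay in units of tinv: N * EF plus the sum of parasitic terms."""
    return n_stages * ef + sum(parasitics)


# Effective fanout per stage for the two-stage path (PE = 125 * 5/6).
ef = math.sqrt((500 / 4) * (5 / 6))

# Stage 1: worst case with B = 0, 2.5 * WN1 of diffusion on the internal path.
p1 = (3 / 3) * (2.5 * CD) / (1.0 * CG)
# Stage 2 (assumption): inverter with drain width equal to its gate width.
p2 = (5 / 6) * (5 * CD) / (5 * CG)

delay_tinv = path_delay_tinv(2, ef, [p1, p2])
delay_fo4 = delay_tinv / fo4_in_tinv

print(f"1 FO4 = {fo4_in_tinv} tinv")
print(f"p1 = {p1}, p2 = {p2}")
print(f"delay = {delay_tinv:.2f} tinv = {delay_fo4:.2f} FO4")
```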
d) From the standpoint of minimum delay, is this the optimum number of stages? If
not, how many stages would you use to minimize the delay?
Solution:
We know from lecture that we’d like the electrical effort (i.e., capacitive
fanout) of our fastest gates to be 4. For the case of domino gates, we can
approximate the LE per stage as roughly √(2/3×5/6)=0.75. So an electrical
fanout of 4 translates into an Effective Fanout of 0.75*4 = 3.
The PE for the given chain of gates was computed as 104.2. Hence, we can
approximate the optimal number of stages for minimum delay as log3(104.2) ≈
4.23. Rounding to 4 stages gives an even number of domino “gates”: in other
words, we’ll have two dynamic stages and two static stages (ordered as
dynamic, static, dynamic, static).
Note that these calculations represent just an approximation, since the actual PE
of the chain will be different after adding the additional domino buffer, whose
logical effort is LE=2/3*5/6=0.55 (assuming a footed implementation).
Nevertheless, if we redo the calculations, we get: Nopt=log3(104.2*0.55)=3.68,
which we would still round up to 4 gates in order to have an even number of
stages, as is required in a domino chain. In conclusion, the approximate
approach still gives fairly accurate results, because Nopt depends only
logarithmically on PE.
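The stage-count estimate is easy to verify numerically. The sketch below uses two hypothetical helpers, `optimal_stages` and `round_to_even`, and takes the target effective fanout of 3 and the added buffer's logical effort of 0.55 directly from the discussion above.

```python
import math


def optimal_stages(path_effort, ef_target):
    """Estimated number of stages that gives each stage an effort of ef_target."""
    return math.log(path_effort) / math.log(ef_target)


def round_to_even(n):
    """Nearest even integer, at least 2 (domino stages come in
    dynamic/static pairs)."""
    return max(2, 2 * round(n / 2))


PE = 104.2        # path effort of the original two-stage chain
EF_TARGET = 3.0   # electrical fanout of 4 times the ~0.75 LE per stage
LE_BUFFER = 0.55  # logical effort of one extra domino buffer (footed)

n_simple = optimal_stages(PE, EF_TARGET)
n_refined = optimal_stages(PE * LE_BUFFER, EF_TARGET)

print(f"N_opt (original PE)    = {n_simple:.2f} -> {round_to_even(n_simple)} stages")
print(f"N_opt (with buffer LE) = {n_refined:.2f} -> {round_to_even(n_refined)} stages")
```

Both estimates round to the same even stage count, which illustrates why the logarithmic dependence on PE makes the approximation robust.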