
Print Pages Deleted

This is my research
Contents

Executive Summary
Introduction
Assumptions
Research Phase
Adder Architectures Comparison
Gate Level Implementation of the Full Adder
Logic Families Comparison for XOR and NAND of Full Adder
Gates Implementation of Half Adder
Simulation Phase
Optimization of the Logic Gates
DPL XOR Gate
CPL NAND Gate
CMOS NAND Gate
TG AND
Inverter Gate
Testing Circuit Logic Output
Evaluation Phase
Measuring worst case delay
Calculating number of transistors
Measuring power consumption
Re-evaluation Phase

Executive Summary

The project is to design a 4-bit digital adder while taking care of three performance parameters: area, speed, and power consumption. The team chose to design according to the cost function Area*Delay*Power.

The project is implemented in three phases: research phase, simulation phase, and evaluation/re-evaluation phase.

The adder circuit is implemented as a Ripple-Carry Adder (RCA). The team added improvements to overcome the disadvantages of the RCA architecture; for instance, the first 1-bit adder is a Half Adder, which is faster and more power-efficient. The team also chose the gates carefully to match the stated cost function.

Gates are implemented using different logic families, according to each gate's usage and functionality in the circuit, in order to achieve the desired performance.

Transistor sizes are also selected based on simulation and optimization, to reach the needed performance according to the specified cost function.

The team was able to reach a 4-bit ripple carry adder that has a delay of 1.22 ns with 0.6 uW power consumption (measured at 10 MHz), with 109 transistors.
In the re-evaluation phase, the team was able to further improve this to reach 0.99 ns delay with 0.25 uW power consumption (at 10 MHz) with only 97 transistors.

Introduction

The topic of the course project is to design a 4-bit adder in the standard 0.25 um CMOS technology. The main objectives of the project are to minimize the total delay of the adder (i.e. the worst case delay of the circuit), the area used to implement the adder, and its average power consumption.

With that in mind, the team split the project into 2 phases: the research phase and the simulation phase. In the research phase, the team had to compare different adder architectures, clearly defining the advantages and disadvantages of each one in terms of area and delay, to be able to choose the most efficient adder architecture for the design of a 4-bit adder. Another essential task in the research phase was to decide on the gate level implementation of the circuit, compare the different logic families' implementations for each gate, and finally decide on the proper logic family implementation for each gate in light of the project objectives stated above.

Once the research phase was accomplished, the team moved on to the simulation phase. In the simulation phase, the team had to design each gate separately and optimize it to achieve the optimum delay and power consumption, then simulate a 1-bit full adder, and finally simulate the whole 4-bit adder. The simulation phase concludes the project by estimating the worst case delay of the 4-bit adder design and the average power consumption of the circuit.
Assumptions

Design Criteria

The group members are not designing this adder for a specific application that dictates certain design criteria or puts different weights on time delay or circuit area. For that reason, the group members assumed it is better to implement a design that balances time delay, power consumption, and area used in the implementation of the 4-bit adder, without giving different weights to any of the design criteria. Therefore, the design criterion will be [A*(T^2)*(P^2)] (T: time delay, A: area, P: power), not T^2*A or T*A^2.

Half Adder

As the project description is to design a 4-bit adder, the group members assumed they have 8 inputs, which are the 2 sets of 4 bits to be added. It is therefore more efficient in terms of delay, area, and power to use a half adder for the first bit, as there is no carry-in bit for the first adder. This gives a clear performance improvement because the carry bit of the first stage will be the result of 2 gate delays instead of 3.

Research Phase

"Research is formalized curiosity."

In this section, the team presents the results of the research phase, which was an integral part of the project. The research phase was divided into 3 sub-phases: adder architectures comparison, gate level implementation of the chosen architecture, and logic families' comparison for the gates of the chosen gate level implementation. The results the team came up with from each sub-phase are of paramount importance for the 4-bit adder design.

Adder Architectures Comparison

In this section, a short description of each adder architecture and its exact time delay (T) and area (A) complexity based on the unit gate model is presented. In the unit gate model, each gate has a gate-count of one and a gate-delay of one, excluding XOR and XNOR gates, which have gate-counts and gate-delays of two. For gates with more than 2 inputs, the gate-counts and gate-delays can be computed in terms of the ones given for the gates with two inputs. Inverters and buffers are ignored.
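To make the unit gate model concrete, the following is a minimal sketch that evaluates the closed-form delay and area expressions this comparison relies on for the RCA, CSLA, and CLA architectures. The function names are hypothetical, the logarithm is assumed to be base 2, and the carry-skip adder is left out because its expressions are not legible in this copy of the report.

```python
import math


def rca(n):
    """Ripple-carry adder under the unit gate model: T = 2n, A = 7n + 2."""
    return 2 * n, 7 * n + 2


def csla(n):
    """Carry-select adder: k = 0.5*sqrt(8n - 7) - 0.5, T = 2k + 2, A = 14n - 5k - 5."""
    k = 0.5 * math.sqrt(8 * n - 7) - 0.5
    return 2 * k + 2, 14 * n - 5 * k - 5


def cla(n):
    """Carry-lookahead adder (log assumed base 2): T = 2*log2(n) + 4, A = 1.5*n*log2(n) + 4n + 5."""
    return 2 * math.log2(n) + 4, 1.5 * n * math.log2(n) + 4 * n + 5


def criteria(t, a):
    """The three delay/area design criteria compared in this section."""
    return {"T*A": t * a, "T^2*A": t * t * a, "T*A^2": t * a * a}


if __name__ == "__main__":
    for name, model in (("RCA", rca), ("CSLA", csla), ("CLA", cla)):
        t, a = model(4)
        print(name, "T =", t, "A =", a, criteria(t, a))
```

For n = 4 this gives T = 8, A = 30 for the RCA, which has the lowest T*A and T*A^2 of the three architectures modelled here, while the CSLA has the lowest T^2*A.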
Ripple Carry Adder (RCA) is the simplest carry-propagate adder. Its time delay and area complexity are as follows for an n-bit RCA adder:

T = 2n
A = 7n + 2

Carry Skip Adder (CSKA) is the concatenation scheme with a carry-skip scheme. Its time delay and area complexity for an n-bit CSKA adder are not legible in this copy.

Carry Select Adder (CSLA) is the concatenation scheme with a selection scheme. Its time delay and area complexity are as follows for an n-bit CSLA adder:

k = 1/2*(8n-7)^(1/2) - 1/2
T = 2k + 2
A = 14n - 5k - 5

Carry Look Ahead Adder (CLA) uses a direct parallel-prefix scheme for carry computation. Its time delay and area complexity are as follows for an n-bit CLA adder:

T = 2*log2(n) + 4
A = 3/2*n*log2(n) + 4n + 5

Results of the comparison can be clearly summarized in the following tables.

Architecture   k                            T               A
RCA            -                            2n              7n + 2
CSKA           (illegible)                  (illegible)     (illegible)
CSLA           1/2*(8n-7)^(1/2) - 1/2       2k + 2          14n - 5k - 5
CLA            -                            2*log2(n) + 4   3/2*n*log2(n) + 4n + 5

Equations for time delay (T) and area (A) complexity of each architecture.

[Table: delay (T) and area (A) of the different architectures for different numbers of bits (n). The values are not legible in this copy.]

[Figure: graphic representation of the results in the previous table.]

Architecture   k             T    A    T*A    T^2*A   T*A^2
RCA            -             8    30   240    1920    7200
CSKA           (illegible)   8    46   368    2944    16928
CSLA           2             6    41   246    1476    10086
CLA            -             8    33   264    2112    8712

Equations of time delay (T) and area (A) complexity for each architecture when n = 4, and rating of their performance for different design criteria (T*A, T^2*A and T*A^2).

[Figure: T*A^2 for different adder architectures (4-bit).]

A clear conclusion is that for adders with a small number of bits, and design criteria that balance area and time delay or give more weight to area, the ripple carry is the better architecture, while for adders with a higher number of bits, carry skip, carry select, or carry look ahead might be a better choice for the designer. Since in this project the team is designing a 4-bit adder and assuming the same weights for area and delay, the team concluded that the ripple carry could be the most efficient implementation for the 4-bit adder design.

Schematic 1.1: Ripple carry adder schematic, adders level

The ripple carry is probably the simplest architecture for an adder. In this architecture the carry simply propagates from one full adder to the next one; therefore the implementation of the full adder is all that matters in its design.

Gate Level Implementation of the Full Adder

In this section, a description of the gate level implementation of the 4-bit ripple carry adder is presented. After the group agreed on implementing a ripple carry adder, it was crucial to research what gate level implementations are available for the full adder. Mainly 3 implementations were compared:

Implementation 1 uses only NAND gates to implement the logic of the full adder.
Implementation 2 uses 2 XOR gates and 3 NAND gates to implement the logic.
Implementation 3 uses 2 XOR, 2 AND, and 1 OR to implement the logic.
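As a quick sanity check on the gate level options above, the sketch below models implementations 2 and 3 as Boolean functions and confirms that both reproduce the full-adder truth table. The wiring assumed for implementation 3 (Cout = (A AND B) OR ((A XOR B) AND Cin)) is the standard textbook form and is an assumption here, since the schematic itself is not reproduced; the function names are hypothetical.

```python
from itertools import product


def nand(x, y):
    """2-input NAND on 0/1 values."""
    return 1 - (x & y)


def full_adder_impl2(a, b, cin):
    """2 XOR + 3 NAND: carry = (A NAND B) NAND ((A XOR B) NAND Cin)."""
    p = a ^ b
    return p ^ cin, nand(nand(a, b), nand(p, cin))


def full_adder_impl3(a, b, cin):
    """2 XOR + 2 AND + 1 OR (assumed wiring): carry = (A AND B) OR ((A XOR B) AND Cin)."""
    p = a ^ b
    return p ^ cin, (a & b) | (p & cin)


# Exhaustive check over all 8 input combinations.
for a, b, cin in product((0, 1), repeat=3):
    total = a + b + cin
    expected = (total & 1, total >> 1)  # (sum bit, carry-out bit)
    assert full_adder_impl2(a, b, cin) == expected
    assert full_adder_impl3(a, b, cin) == expected

print("implementations 2 and 3 match the full-adder truth table")
```

Because the two implementations are logically equivalent, the choice between them can rest purely on area, delay, and which logic families suit the gates involved.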
Schematic 1.2.1: Gate level implementation 1 of the full adder

Schematic 1.2.2: Gate level implementation 2 of the full adder

Schematic 1.2.3: Gate level implementation 3 of the full adder

Comparing these different implementations in terms of area and delay, it was clear that implementation 1 would be too slow and take too much area, while the other 2 implementations do not differ much. However, as NAND gates can be implemented using the CMOS logic family without the need for an inverter at the output, while AND and OR cannot, the team decided to choose implementation 2 to have the option of using CMOS logic whenever it is needed without the need to use inverters.

Schematic 1.2.4: Comparing implementations 2 and 3 of the full adder

Therefore, the final implementation of the full adder in this project is as follows:

Sum = A XOR B XOR Cin
Carry out = (A NAND B) NAND [(A XOR B) NAND Cin]

The next step is to decide the logic family implementation for each gate.

Logic Families Comparison for XOR and NAND of Full Adder

This section describes the different logic families that can implement the XOR and NAND gates of the full adder gate level implementation agreed upon in the previous section.

The XOR gate has mainly 3 implementations:

Complementary Pass-transistor Logic XOR (CPL)

Schematic: XOR CPL implementation

The main advantage of the CPL logic family is that it uses a small number of transistors, so in terms of area it has an edge over other implementations.
However, CPL has a reduced swing, so it cannot be used at the output of any adder since, according to the project description, reduced swing at the output is unacceptable.

Double Pass-transistor Logic XOR (DPL)

Schematic: XOR DPL implementation

The main advantage of the DPL logic family is that its delay is low, since 2 transistors are always ON in any charging or discharging input combination. It also has an advantage over CPL in that it has a full swing at the output and uses a reasonable number of gates.

Transmission gate XOR (TG)

Schematic: XOR transmission gate implementation

The transmission gate is another implementation of the XOR function. However, its worst case delay is probably higher than that of the DPL, since when A is HIGH only 1 transistor is charging or discharging the output, compared to two in the DPL implementation. So in terms of delay, DPL has an edge over the transmission gate. However, the transmission gate uses fewer transistors than DPL.

The NAND gate has mainly 3 implementations:

CPL NAND

Schematic: NAND CPL implementation

DPL NAND

Schematic: NAND DPL implementation

CMOS NAND

Schematic: NAND CMOS implementation

The CMOS logic family has an advantage over DPL in that it uses fewer transistors (no need for inverters), and has an edge over CPL in that its output is full swing.

After comparing the different logic families in terms of swing, delay, and area, the team made some educated assumptions. First of all, for the XOR, the CPL was excluded since the project description requires a full swing output.
So, comparing DPL and transmission gate, the group assumed it is more efficient to use the DPL XOR, as the XOR gates must be very fast since they are on the path along which the delay propagates.

As for the NAND gate, it is also important for it to be fast, but the output still needs to be full swing. The team therefore assumed it is more efficient to implement 2 CPL NAND gates whose outputs are the inputs to a CMOS NAND, which ensures full swing at the output (thanks to its pull-up network). Another reason for choosing the CMOS NAND to calculate the carry-out is that it uses only 4 transistors, compared to the DPL NAND, which needs 8 because it requires inverters at the inputs.

The following schematic shows the logic family of each gate in the project.

Schematic: Logic family for each gate

This schematic also ensures there will be no more than 2 transistors in series, as CPL requires inverters, so its input is buffered, and the CMOS NAND acts as a buffer since its inputs go to the gates of the transistors.

We took advantage of the CMOS NAND following the reduced swing CPL NAND to return the output to full swing. But we had to take care of the short circuit power dissipation!

A threat was that the CPL could cause static power dissipation due to its reduced swing, since it is driving a CMOS NAND gate. The group therefore needed to prove during simulation that this reduced swing does not cause static power dissipation. The concern is that the reduced swing may leave the PMOS devices in the CMOS NAND ON while the NMOS devices are ON as well, so that a short circuit current can find a path from Vdd to ground, causing power dissipation.

Testing showed that the output swing of the CPL ranges from 0 to 2.1 V (i.e. Vthn is around 0.4 V), and Vthp was found to be around 0.6 V. Therefore no static power dissipation will take place, because the PMOS device will not be turned on by the reduced swing output (Vdd - Vthn) of the CPL.
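The argument above can be restated as a one-line threshold check. The sketch below is a simplified first-order model, not a simulation: it assumes Vdd = 2.5 V (consistent with a CPL high level of Vdd - Vthn = 2.1 V and Vthn of about 0.4 V), uses the measured |Vthp| of about 0.6 V, and ignores subthreshold leakage. The function name `pmos_conducts` is hypothetical.

```python
VDD = 2.5        # assumed supply voltage in volts (2.1 V + Vthn of about 0.4 V)
VTHP_ABS = 0.6   # measured PMOS threshold magnitude in volts
CPL_HIGH = 2.1   # reduced-swing logic high at the CPL NAND output in volts


def pmos_conducts(v_gate, vdd=VDD, vthp_abs=VTHP_ABS):
    """First-order check: a PMOS with its source at Vdd conducts when Vsg exceeds |Vthp|."""
    return (vdd - v_gate) > vthp_abs


# With the reduced-swing high level on its gate, the pull-up PMOS stays off,
# so there is no static path from Vdd to ground through the CMOS NAND.
print(pmos_conducts(CPL_HIGH))  # False
# A logic low on the gate turns the PMOS on, as expected.
print(pmos_conducts(0.0))       # True
# Remaining margin before the PMOS would start to turn on, in volts.
print(round(VTHP_ABS - (VDD - CPL_HIGH), 3))  # 0.2
```

Under these assumptions the source-to-gate voltage is about 0.4 V, roughly 0.2 V short of the PMOS threshold, which matches the conclusion drawn from the measurement.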
Schematic: PMOS device tested to measure Vthp

Gates Implementation of Half Adder

As previously stated, we assume the inputs are 2 sets of 4 bits, so it is more efficient to implement a half adder for the first bit, as there is no input carry.

Schematic of Half Adder

The half adder consists of an XOR and an AND gate. Based on the previous analysis, the team agreed to use the DPL family for the XOR, and to use a transmission gate (TG) for the AND gate, since the AND is on the carry propagation path and the transmission gate is probably the fastest logic implementation for the AND gate. It uses only 3 transistors and 1 inverter, and outputs a full swing.

Schematic: TG-AND implementation

Multisim schematic of the circuit before substituting the first full adder with a half adder

Simulation Phase

"Simplicity is the ultimate sophistication"

In this section, the team presents the results of the simulation of the 4-bit adder. Optimization is a very complex task, as delay, area, and power are all affected whenever the sizes of transistors are changed. The team therefore decided to design each gate separately first to ensure the logic is correct, then optimize it to find the size that gives the lowest worst case delay and lowest power consumption for each gate. The gates are then concatenated to form the 1-bit full adder and the 1-bit half adder, before implementing the whole 4-bit adder and estimating the worst case delay and the average power consumption of the adder.
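The bottom-up strategy described above (gates first, then 1-bit adders, then the 4-bit adder) can be mirrored in a small behavioural model. The sketch below only checks logic, not delay or power; it assumes the half adder is an XOR plus an AND, and the full adder uses the XOR/NAND carry equation chosen in the research phase. All function names are hypothetical.

```python
def nand(x, y):
    """2-input NAND on 0/1 values."""
    return 1 - (x & y)


def half_adder(a, b):
    """First stage: XOR for the sum bit, AND for the carry bit."""
    return a ^ b, a & b


def full_adder(a, b, cin):
    """Later stages: Sum = A XOR B XOR Cin, Cout = (A NAND B) NAND ((A XOR B) NAND Cin)."""
    p = a ^ b
    return p ^ cin, nand(nand(a, b), nand(p, cin))


def add4(a, b):
    """Ripple-carry addition of two 4-bit integers; returns (4-bit sum, carry-out)."""
    a_bits = [(a >> i) & 1 for i in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]
    s0, carry = half_adder(a_bits[0], b_bits[0])
    sum_bits = [s0]
    for i in range(1, 4):
        s, carry = full_adder(a_bits[i], b_bits[i], carry)
        sum_bits.append(s)
    total = sum(bit << i for i, bit in enumerate(sum_bits))
    return total, carry


# Exhaustive check of all 256 input combinations against integer addition.
for a in range(16):
    for b in range(16):
        s, c = add4(a, b)
        assert (c << 4) | s == a + b

print(add4(0b0101, 0b0101))  # (10, 0), i.e. sum bits 1010 with no carry-out
```

A model like this is useful as a reference when reading simulated waveforms, because any mismatch then points to a circuit-level problem rather than a logic error.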
As the project objective is to balance area, power, and delay, and since the group has chosen the Ripple Carry Adder architecture, which has an edge over other architectures in that it requires less area, the group decided that during optimization they would give higher weight to time delay and power over area to ensure this balance (because it is known that RCA is disadvantageous when it comes to speed).

Optimization of the logic gates

Optimization is finding the optimum values of transistor W/L that achieve the desired performance balance between area, power, and delay.

Note: during optimization and simulation of individual gates, a 40 fF capacitor was put at the output terminal (C) and the pulse frequency used was 10 MHz. Also, higher weight was given to power and delay, as the team's decision to use a ripple carry adder gives the adder an edge in terms of area, so it can give away part of this advantage to ensure low delay and power consumption. However, overall balance is the ultimate goal of the design.

DPL XOR gate

The following table and graphs show the design and the optimization of the DPL XOR gate. In DPL, there is no worst-case delay, as an NMOS and a PMOS are always ON.

#   A*(T^2)*(P^2)
1   132259.5056
2   76596.98
3   78004.6875
4   93782.9376
5   105386.162
6   123312.5376
7   141764.9527
8   169504.88

(The W/L, delay, and power columns of this table are not legible in this copy.)

[Graphs: TpHL vs W/L; P vs W/L]

CPL NAND Gate

The CPL NAND gate worst case delay is TpLH since it has a reduced swing. The following table and graphs present the design and results of the optimization.

#   (W/L)n      TpLH (ps)   P (uW)   A*(T^2)*(P^2)
1   0.25/0.25   -           -        -
2   0.5/0.25    -           -        5907446.899
3   0.75/0.25   327         2.03     5287724.593
4   1/0.25      268         2.08     4971829.658
5   1.25/0.25   231         2.12     4796513.568
6   1.5/0.25    214         2.17     5175570.826
7   1.75/0.25   202         2.22     5630756.141
8   2/0.25      192         2.26     6025170.125

(- : value not legible in this copy.)

[Graphs: TpLH vs W/L; P (uW) vs W/L]

CMOS NAND Gate

The worst case delay of the CMOS NAND occurs either when only one input is low or when both inputs are high.
The following table and graphs present the design and results of the optimization.

#   (W/L)n      TpHL (ps)   P (uW)   A*(T^2)*(P^2)
1   0.25/0.25   565         2.8      2502724
2   0.5/0.25    300         2.825    1436512.5
3   0.75/0.25   205         2.875    1042088.672
4   1/0.25      162         2.91     888947.2656
5   1.25/0.25   128         2.925    700876.8
6   1.5/0.25    110         2.975    642555.375
7   1.75/0.25   97          3.0      592767
8   2/0.25      88          3.025    566899.52

[Graphs: TpHL vs W/L; P (uW) vs W/L]

TG AND

The following table and graphs present the design and results of the optimization.

#   (W/L)n      TpHL (ps)   P (uW)   A*(T^2)*(P^2)
1   0.25/0.25   368         0.117    1853.819136
2   0.5/0.25    201         0.257    5336.891298
3   0.75/0.25   143         0.4      9815.52
4   1/0.25      116         0.54     15695.0784
5   1.25/0.25   97          0.675    21434.87813
6   1.5/0.25    86          0.825    30203.415
7   1.75/0.25   79          0.97     41105.0983
8   2/0.25      73          1.11     52526.8872

[Graphs: TpHL vs W/L; P vs W/L]

Inverter Gate

Note: (PUN W/L)p = 2 (PDN W/L)n to ensure VM sets to Vdd/2.

The following table and graphs present the design and results of the optimization.

#   (W/L)n      TpHL (ps)   P (uW)   A*(T^2)*(P^2)
1   0.25/0.25   367         2.725    1000150.006
2   0.5/0.25    193         2.75     563391.125
3   0.75/0.25   140         2.78     454429.92
4   1/0.25      103         2.81     335078.8996
5   1.25/0.25   85          2.845    292396.6531
6   1.5/0.25    72          2.87     256200.5376
7   1.75/0.25   62          2.89     224738.3068
8   2/0.25      55          2.925    207046.125

[Graphs: TpHL vs W/L; P vs W/L]

Testing Circuit Logic Output

To prove that the adder is working and producing correct logic, we did some waveform tests. We applied the following input combinations and monitored the output: 0101 + xy01, where x is a pulse from 0 to 1.

For instance:
When x=0 and y=1, the output should be 1010.
When x=0 and y=0, the output should be 0110.
When x=1 and y=0, the output should be 1110.

The output waveform figure below proves that the adder produces the expected output.

Note: x represents B3, y represents B2, the MSB of the sum is S3, then S2.

The following graph shows the output of the adder.

[Figure: output waveforms of the adder]

In the 256-output logic diagram of the 4-bit adder, some spikes appear due to errors in synchronization between the inputs of some gates.
To get rid of them, a simple solution is to add buffers to the faster signal; however, these spikes do not affect the logic of the design.

Evaluation Phase

This section presents the methodology used to test and measure the performance parameters.

Measuring worst case delay

Based on critical path analysis, we applied an input such that the LSB of one of the inputs triggers the MSB of the sum (the last bit in the sum), so that the signal ripples through the critical path of the Ripple-Carry Adder, going from the Cout of one 1-bit adder to the Cin of the next.

The inputs are illustrated in this diagram:

Initial condition    Final condition
   A: 0111              A: 0111
 + B: 0000            + B: 0001
                        S: 1000

The MSB of the sum changes from low to high in response to the LSB of the B signal transitioning from low to high.

The following schematic shows the critical path for this adder:

[Schematic: critical path of the adder]

Measured quantities:
TpLH: 0.92 ns
Rise time: 557 ps

[Figure: output voltage waveform]

Calculating number of transistors

Gate/Unit     Composed of                              # transistors
DPL-XOR       4 + 2 inv                                8
TG-AND        3 + 1 inv                                5
CMOS-NAND     4                                        4
CPL-NAND      2 + 2 inv                                6
Inverter      2                                        2
Half Adder    1 DPL-XOR + 1 TG-AND                     8 + 5 = 13
Full Adder    2 DPL-XOR + 2 CPL-NAND + 1 CMOS-NAND     16 + 12 + 4 = 32

Four-bit adder = 1 HA + 3 FA = 13 + (3 x 32) = 109 transistors

Measuring power consumption

Power consumption was measured by measuring the average current supplied by Vdd and multiplying it by Vdd.

Frequency used: 10 MHz
I(Vdd) = 0.28 uA

Re-evaluation Phase

We decided to do some further testing, and realized that we can replace the two CPL NANDs of the Full Adder with CMOS NANDs. Based on the simulation results of the individual gates, the CPL is better in terms of power, while the CMOS is better in terms of delay.

We then simulated the 4-bit adder twice, once using the CPL NANDs and once replacing the two CPL NANDs with CMOS NANDs in the Full Adder circuit above, and got these results:

                 4-bit RCA using CPL NAND in FA    4-bit RCA using CMOS NAND in FA
# transistors    109                               97
Power            0.7 uW                            0.25 uW
Area             815                               707
Delay (tp)       0.92 ns                           0.99 ns

Operating frequency = 10 MHz

[Figure: TpLH for the improved re-evaluated adder]

This shows that using the CMOS NAND instead of the CPL NAND is actually the better option. Measuring our cost function A*T*P:

4-bit RCA using CPL NAND: cost function = 524.86
4-bit RCA using CMOS NAND: cost function = 174.98

Using the CMOS NAND is clearly a major improvement in the performance of the adder, and it is therefore preferred over the CPL NAND.
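The bookkeeping in the evaluation and re-evaluation phases can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes Vdd = 2.5 V for the power figure, and area values of 815 and 707, which are the values consistent with the reported cost figures. The names `GATES`, `adder4_count`, and `cost` are hypothetical.

```python
INV = 2  # transistors per inverter

# Transistors per gate, following the "composed of" breakdown in the report.
GATES = {
    "DPL_XOR": 4 + 2 * INV,   # 8
    "TG_AND": 3 + 1 * INV,    # 5
    "CMOS_NAND": 4,           # 4
    "CPL_NAND": 2 + 2 * INV,  # 6
}


def half_adder_count():
    """Half adder = 1 DPL XOR + 1 TG AND."""
    return GATES["DPL_XOR"] + GATES["TG_AND"]


def full_adder_count(inner_nand):
    """Full adder = 2 DPL XOR + 2 inner NANDs (CPL or CMOS) + 1 CMOS NAND."""
    return 2 * GATES["DPL_XOR"] + 2 * GATES[inner_nand] + GATES["CMOS_NAND"]


def adder4_count(inner_nand):
    """4-bit ripple-carry adder = 1 half adder + 3 full adders."""
    return half_adder_count() + 3 * full_adder_count(inner_nand)


def cost(area, delay_ns, power_uw):
    """Cost function A*T*P as used in the re-evaluation phase."""
    return area * delay_ns * power_uw


print(adder4_count("CPL_NAND"))           # 109
print(adder4_count("CMOS_NAND"))          # 97
print(f"{2.5 * 0.28:.2f} uW")             # 0.70 uW, from Vdd * I(Vdd) with assumed Vdd = 2.5 V
print(f"{cost(815, 0.92, 0.7):.2f}")      # 524.86
print(f"{cost(707, 0.99, 0.25):.2f}")     # 174.98
```

Replacing the two 6-transistor CPL NANDs in each full adder with 4-transistor CMOS NANDs saves 4 transistors per full adder, which accounts for the drop from 109 to 97 transistors.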
