Bang Bang PD
SJSU ScholarWorks
Master's Theses Master's Theses and Graduate Research
2014
Recommended Citation
Waghela, Sagar, "Phase Locked Loop (PLL) based Clock and Data Recovery Circuits (CDR) using Calibrated Delay Flip Flop"
(2014). Master's Theses. Paper 4485.
This Thesis is brought to you for free and open access by the Master's Theses and Graduate Research at SJSU ScholarWorks. It has been accepted for
inclusion in Master's Theses by an authorized administrator of SJSU ScholarWorks. For more information, please contact scholarworks@sjsu.edu.
PHASE LOCKED LOOP (PLL) - BASED CLOCK AND DATA RECOVERY
A Thesis
Presented to
In Partial Fulfillment
Master of Science
by
Sagar Waghela
August 2014
© 2014
Sagar Waghela
A Delay Flip Flop (DFF) is used in the phase detector circuit of the clock and data
recovery circuit. A DFF has three important timing parameters: setup time,
hold time, and clock-to-output delay. These timing parameters play a vital role in
designing a system at the transistor level. This thesis explains the impact of
metastability on the clock and data recovery (CDR) system and the importance of
calibrating the DFF using a metastable circuit to improve a system’s lock time and
peak-to-peak jitter performance. The DFF was modeled in MATLAB Simulink software and
calibrated by adjusting timing parameters. The CDR system was simulated in Simulink
for three different cases: 1) equal setup and hold times, 2) setup time greater than the hold
time, and 3) hold time greater than the setup time. The Simulink results were then
compared with the Cadence simulation results, and it was observed that the calibration of
DFF using a metastable circuit improved the CDR system’s lock time and jitter tolerance
performance. The overall power dissipation of the designed CDR system was 2.4 mW
I would like to express my gratitude to my advisor Dr. Shahab Ardalan for guiding
me throughout this research with his invaluable knowledge. He was always supportive of
my work since I began studying analog mixed signal courses at San Jose State
University. I would also like to thank him for providing access to analog mixed signal
(AMS) lab with the most recent simulation software tools that are used in the industry.
I would also like to thank Dr. Sotoudeh Hamedi-Hagh and
Dr. M. J. Zoroofchi for serving as my committee members. I thank them for their
immense support and their helpful ideas throughout my education and research at San
parents and siblings for always being supportive of my work, studies, and research. I am
quite grateful to have all of you in my life. Without their encouragement, I would not
TABLE OF CONTENTS
References
Appendix
A.1 Verilog A Codes
A.1.1 PRBS-7 Data Generator
A.1.2 Multiplexer
A.1.3 Charge Pump
A.1.4 Voltage Controlled Oscillator
A.1.5 Slicer
A.2 Transistor Sizes
A.2.1 Semi dynamic DFF (SDFF)
A.2.2 Exclusive OR gate (XOR)
A.2.3 Inverter
A.2.4 Data delay cell
A.2.5 Clock delay cell
LIST OF FIGURES
Figure 1.1: NRZ data rate for high performance differential pair point-to-point
Figure 1.3: Basic block diagram of the clock and data recovery system.
Figure 2.3: Basic linear phase detector and output waveforms.
Figure 2.4: Hogge phase detector and its output waveform.
Figure 2.6: Circuit diagram of the Type-II Low Pass Filter.
Figure 2.16: Schematic of Semi-dynamic DFF (SDFF).
Figure 2.19: Glitch data generation from the input data.
Figure 2.20: Variation in metastable window of glitch data using six digital bits.
Figure 2.21: Variation of the clock using five digital bits.
Figure 4.5: Output waveform of the designed gate.
Figure 4.10: Cadence schematic of the data delay cell [15].
Figure 4.11: Variation in the left leg of the metastable window.
Figure 4.12: Variation in the right leg of the metastable window.
LIST OF TABLES
Table 4.2: Binary vectors to vary the left leg of the metastable window.
Table 4.3: Binary vectors to vary the right leg of the metastable window.
Table 4.5: Initialization of the variables of the designed CDR system.
Table 7.4: Transistor sizes of the designed data delay cell.
ABBREVIATIONS
s Second
ns Nanosecond
ps Picosecond
V Volt
mV Millivolt
PP Peak-to-peak
mW Milliwatt
UI Unit interval
µA Microampere
kHz Kilohertz
MHz Megahertz
GHz Gigahertz
kΩ Kiloohm
nF Nanofarad
pF Picofarad
DC Direct current
deg Degree
PD Phase detector
CP Charge pump
LC Inductor-capacitor oscillator
Chapter 1. Introduction
In wire-linked communication systems, data flow over a single wire without
any accompanying clock, yet the receiver of the system is required to process these data
synchronously. Therefore, CDR circuits are used in the receiver of the system to
recover the clock or timing information from these data. Data bandwidth for wire-linked
(NRZ) data rate for high-performance differential pair point-to-point nets on the package
would reach 100 Gbps by the year 2019 as shown in Figure 1.1 [1]. In such high-speed
wire-linked communication systems, the data are corrupted by both internal and
external noise during their passage from transmitter to receiver, resulting in jitter and skew
in the data received at the receiver. Here, the clock and data recovery circuit is necessary
to extract the data transmitted by the transmitter from the corrupted received signal and
also to recover the accompanying clock timing information at the receiver side of the
communication system.
Figure 1.1: NRZ data rate for high performance differential pair point-to-point nets on
a package, based on the ITRS 2007 roadmap prediction [1].
system is shown in Figure 1.2. In a source asynchronous system, the transmitter and
receiver use different clock sources of nominally the same frequency. As seen from Figure 1.2, the
received data are first equalized in the receiver input buffer and then fed to the CDR
circuit for retiming before proceeding into the deserializer module. Because of natural
device mismatches, a frequency offset exists between the transmitted data and the local
clock on the receiver side, which creates challenges for CDR circuit designers.
[Figure 1.2: Block diagram of a source asynchronous system. Each transceiver contains a serializer (parallel data input), a clock synthesizer (reference clock input), a transmitter output buffer with pre-emphasis, a receiver input buffer with equalizer, a clock and data recovery block, and a deserializer (parallel data output). The two transceivers are connected through transmission channels.]
1.1 CDR
The basic building blocks of a CDR circuit are the clock recovery and data retiming
blocks, as shown in Figure 1.3. The function of the clock recovery circuit is to detect the
transitions in the received data and generate a periodic recovered clock. This recovered
• The recovered clock’s frequency must be equal to the input data rate.
• The recovered clock should have reasonable timing with respect to the input data
(i.e., the rising edge of the recovered clock should sample at the center of the data
bit, to provide maximum margin for jitter and other time uncertainties).
• The recovered clock should exhibit minimum jitter because the jitter of the
The data retiming circuit uses a Delay Flip Flop (DFF), which is triggered by the
recovered clock to retime the received data. The DFF samples the corrupted received
data and regenerates the data with less jitter and skew [2].
Figure 1.3: Basic block diagram of the clock and data recovery system.
1.2 Motivation and Agenda
This thesis presents the effect of metastability on a CDR system and the importance
of calibrating a DFF using a metastable circuit to improve the CDR system’s lock time
and jitter tolerance performance. Here, the metastability effect refers to a violation of the setup
and hold time requirements of a DFF. A DFF samples input data on an active edge of a
clock; when this edge occurs at a data transition point, the DFF output may be incorrect.
Thus, there is a need to calibrate (i.e., delay or advance) the active edge
of the clock using a metastable circuit to satisfy the setup and hold time requirements of
the DFF. Once calibrated, the input data are sampled at the center of the data bit interval
Chapter 2 provides a general background on a CDR system and explains each block
of the system in detail. Chapter 3 explains the design of the CDR system using Simulink
software. The CDR system with and without the metastable DFF was simulated for four
different cases. Chapter 4 explains the CDR design at the transistor level using 45 nm
technology and modeling using the Verilog-A language in Cadence Virtuoso 6.1.5. This
chapter also explains the design of a DFF, an Alexander phase detector, and a metastable
circuit at the transistor level. The designed CDR system was simulated for three different
cases and the results were compared with those obtained using Simulink.
Chapter 2. Background
In source asynchronous communication systems, data are transmitted by the
transmitter without an accompanying clock, and the receiver has to process these data
synchronously at its end by recovering the clock from the data. A phase-locked loop (PLL) is
2.1 PLL
A PLL is a negative feedback system in which a clock generated by the voltage-controlled
oscillator (VCO) is phase and frequency locked to the input data. The basic topology of PLL
The function of the Pre-Amplifier and Limiter is to generate, from the input data, the
full voltage swing required by the phase detector. The functions of the PD, CP and LPF, and
Figure 2.1: PLL based CDR system.
The function of the phase detector is to measure the phase difference between two
incoming signals. Examples are clock and data signals, data and data signals, and clock
and pseudo-random bit sequence (PRBS) signals. Various topologies and designs for phase
detectors already exist, such as the Alexander Phase Detector, the Hogge Phase Detector,
Phase detectors are broadly classified into two classes: linear and binary phase
detectors. Linear phase detectors (PD) are used in low to medium speed CDR
(GHz) speed.
In the case of linear phase detectors, the output of the phase detector is linearly
proportional to the phase difference between two input signals as shown in the
The slope of the line is called the phase detector gain and is calculated by the following
equation.
$$K_{pd} = \frac{V_{pd}}{\Delta\phi} \tag{2.1}$$
In the above equation, Kpd is the phase detector gain, Vpd is the average phase
detector output, and Δφ is the phase difference between the two input signals.
The average phase detector output increases in proportion to the phase difference
between the two input signals, so the phase detector gain remains constant [2]. One
DFF with an XOR gate is enough to satisfy the requirement of a linear phase detector.
However, because the average value of the phase detector output is also a function of
the data transition density of the input, this design fails to uniquely represent the
phase difference for various data patterns, as shown in Figure 2.3. This design is
therefore data pattern dependent [2].
Figure 2.3: Basic linear phase detector and output waveforms.
One example of a linear phase detector is the Hogge Phase Detector. The circuit
implementation and output waveform are shown in Figure 2.4. The Hogge Phase Detector
consists of two DFFs and two Exclusive OR (XOR) gates. The function of a DFF is to
produce a delayed replica of the input signal at its output. The first DFF, named FF1,
produces a delayed replica of the input data at the rising edge of the clock, which is then
XORed with the input data. The output of this XOR gate, at node X, gives the phase
difference between the two input signals. To avoid the problem of data pattern dependency,
the proportional pulses obtained at node X are accompanied by reference pulses at node
Y, which are generated by using an additional DFF (FF2) and XOR gate. The reference
pulses appear on the data edge and have constant pulse width, thus avoiding the pattern
Figure 2.4: Hogge phase detector and its output waveform.
In binary phase detectors, the output is either logic one or zero. One example of a
binary phase detector is the Alexander Phase Detector. The Alexander Phase Detector
accepts two input signals (e.g., clock and data) and determines whether the clock is earlier
or later than the data. If the clock is earlier than the data, the early node goes to logic one
and the late node goes to logic zero. Otherwise, when the clock is later than the data, the
late node goes to logic one and the early node goes to logic zero. A more detailed
The function of the charge pump is to convert the output voltage of the phase detector
to current. This current is then fed to a low pass filter, where the capacitor is either
charged or discharged depending on the phase detector output. The circuit diagram of the
charge pump with a Type-I LPF (capacitor) is shown in Figure 2.5 and a Type-II LPF is
In this research, the Alexander Phase Detector is used, where the output is either
early or late. The early and late nodes are connected to respective switches of the charge
pump circuit, as shown in Figure 2.5. When the early node is high, closing the early
switch, the capacitor starts charging and continues to charge until the early node goes
low, opening the early switch. Similarly, when the late node goes high, the capacitor
starts to discharge and will continue to discharge until the late node goes low.
Designing a charge pump is not an easy task because, to achieve zero net voltage on
the capacitor, the charging current should be equal to the discharging current. Even if the
charging and discharging currents are designed to be close to equal, there will still be
leakage current through the charge pump circuit, resulting in an offset voltage on the
capacitor. One way to minimize this offset voltage is to calibrate the charge pump circuit
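[Figure 2.5: Charge pump with a Type-I LPF: an Early switch and a Late switch, each in series with a current source Icp, connected to the capacitor C.]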
$$I_c = C \frac{dV}{dt} \tag{2.2}$$

$$V_c = \int \frac{I_c}{C}\, dt \tag{2.3}$$
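As a rough numerical illustration of equations (2.2) and (2.3), the sketch below integrates a constant charge pump current onto the capacitor while the early or late switch is closed. The function name, the 10 pF capacitance, and the 1 ns pulse width are illustrative assumptions and are not values from this design; only the 800 µA current matches the value quoted later for the Simulink model.

```python
def capacitor_voltage_step(i_cp, c, t_on, early=True):
    """Voltage change on C for a constant current i_cp flowing for t_on seconds.

    From Vc = (1/C) * integral(Ic dt), a constant current gives dV = Ic * t / C.
    The early switch charges the capacitor (+); the late switch discharges it (-).
    """
    sign = 1.0 if early else -1.0
    return sign * i_cp * t_on / c


# Hypothetical values: 800 uA pump current, 10 pF capacitor, 1 ns pulse.
dv_up = capacitor_voltage_step(800e-6, 10e-12, 1e-9, early=True)
dv_down = capacitor_voltage_step(800e-6, 10e-12, 1e-9, early=False)
print(dv_up, dv_down, dv_up + dv_down)
```

With equal charging and discharging currents and equal pulse widths, the two steps cancel, which is the ideal condition described above; any mismatch or leakage would leave a residual offset.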
The function of the low pass filter (LPF) is to convert the charge pump current into the
control voltage. The Type-I LPF is replaced by a Type-II LPF because of the trade-offs among
settling time, ripple on the control voltage, phase error, and stability. To
minimize the ripple on the control voltage, the capacitor in Figure 2.5 is replaced by
a resistor (R) in series with a capacitor (C1), both in parallel with a capacitor (C2),
as shown in Figure 2.6. If the capacitor (C2) is five to ten times smaller than the capacitor (C1),
then the Type-II LPF will still behave approximately as a Type-I LPF [3].
Figure 2.6: Circuit diagram of the Type-II Low Pass Filter.
The function of the voltage-controlled oscillator (VCO) is to generate a clock signal at its
output, the frequency of which can be changed by varying the input control voltage [4].
For the circuit to oscillate at ω0, it must satisfy two conditions, as shown by equation
Two types of CMOS oscillators widely used in today’s technology are ring oscillators
and inductor-capacitor (LC) oscillators. A ring oscillator consists of an odd number of
gain stages in a loop, as shown in Figure 2.8, and the Bode plot of a three-stage ring
[Figure: Bode plot of a three-stage ring oscillator. Magnitude 20log|H(ω)| rolls off at -60 dB/dec beyond ωp; phase ∠H(ω) passes through -90°, -135°, and -180°; ω is on a log scale.]
Figure 2.8: Three-stage ring voltage controlled oscillator.
The LC oscillator consists of the cross-coupled common source stages loaded by the
inductor (L) placed in parallel with the capacitor (C) as shown in Figure 2.9.
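[Figure 2.9: LC oscillator with cross-coupled transistors M1 and M2, each loaded by a parallel Lp, Rp, Cp tank connected to VDD, with output Vout.]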
The cross-coupled LC voltage-controlled oscillator is shown in Figure 2.10 [5].
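[Figure 2.10: Cross-coupled LC voltage-controlled oscillator with transistors M1 and M2, parallel Lp, Rp, Cp tanks, varactors Mv1 and Mv2 driven by Vcontrol, tail current source Iss, and output Vout.]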
There are many oscillator specifications that one must be aware of. Ring
oscillators have a wide frequency tuning range and small area consumption, whereas
LC oscillators have a limited frequency tuning range and large area consumption
more detail in the next section. Ring oscillators are preferred over LC oscillators in
implementing the VCO because of their wide frequency tuning range and small area
consumption. However, the ring VCO provides a low quality factor [6]. Thus, the choice of
2.2 Jitter in CDR Circuits
Jitter is defined as the deviation of a waveform’s zero crossings from their ideal positions
on the time axis. The optical communication (OC) standards for
CDR circuits are stringent in terms of jitter and difficult to meet. The OC standards
express jitter in terms of the bit period, or unit interval (UI). For example, a
jitter of 0.001 UI refers to 0.1% of the bit period. Jitter in the CDR circuit is
• Jitter transfer
• Jitter generation
• Jitter tolerance
The jitter transfer function of a CDR circuit describes how the output jitter responds as
the input jitter varies at different rates. If the input jitter varies slowly, then the output of the CDR
circuit tracks the input to ensure phase locking; however, if the input jitter varies at a
fast rate, then the output tracks the input to a lesser extent (i.e., the CDR circuit must
filter the input jitter). Thus, the jitter transfer function has the same characteristic as that
of a low pass filter, as shown in Figure 2.11. The OC standards have two specifications
(i.e., CDR must suppress the jitter components above 120 kHz).
2. The jitter peaking shown in Figure 2.11 must be less than 0.1 dB.
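[Figure 2.11: Jitter transfer function (output jitter over input jitter versus jitter frequency), showing a low-pass characteristic with unity gain at low frequency and 0.1 dB peaking.]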
Multiple data regenerators are placed along a signal’s path to minimize the non-ideal
regenerator should have small jitter bandwidth, in order to minimize the accumulated
jitter through the chain of data regenerators. Furthermore, as the total jitter transfer
function of the data regenerators connected in series is given by the product of the
individual data regenerator’s jitter transfer function, it is important to have small jitter
Jitter generation is defined as the jitter produced by CDR circuit elements when the
input data signal itself does not have jitter. The major sources of jitter generation are:
• Phase noise error in the VCO caused by the noise generated by VCO.
• The ripples on the control voltage due to leakage current from the charge
pump circuit.
• The coupling of the input data transition to the VCO through a retiming circuit
The jitter in the output of the VCO caused by the noise generated is shown in Figure
2.12.
[Figure 2.12: VCO output versus time t, showing jitter on the output transitions.]
Jitter tolerance is defined as the amount of jitter that the CDR circuit must tolerate on
the input data without increasing the bit error rate (BER). If the jitter on the input data
varies slowly, the recovered clock will track the transitions in the data and always sample
the data in the middle of the bit period, as shown in Figure 2.13. This will guarantee a
low BER. If, however, the jitter on the input data varies quickly, the recovered clock will not
be able to track the transitions in the data and will fail to sample the data in the middle of
the bit period, as shown in Figure 2.14. This will result in a greater BER.
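[Figure 2.13: Data and clock waveforms when the input jitter varies slowly.]
[Figure 2.14: Data and clock waveforms when the input jitter varies quickly.]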
The specification of the jitter tolerance is described by a mask, which is a function
of the input jitter frequency, as shown in Figure 2.15. Jitter tolerances for the various
OC/SONET standards are shown in Table 2.1 [8]. For example, the CDR circuit must
tolerate a peak-to-peak jitter of 1.5 UI if the jitter on the input signal varies at a rate
below 6 kHz.
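[Figure 2.15: Jitter tolerance mask. Jitter tolerance (UI) versus jitter frequency (log scale), with plateaus at 15 UI, 1.5 UI, and 0.15 UI joined by -20 dB/dec slopes between the corner frequencies f0 to f4.]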
delay flip flop (DFF) and its timing parameters. This research has used the Alexander
phase detector in the designed CDR circuit, which consists of four DFFs and two XOR
gates. The phase detector is considered the heart of the CDR circuit. The function of
the DFF is to produce a delayed replica of the input signal. The DFF can be
implemented using only static circuits, only dynamic circuits, or a combination of static
The static implementation stores the data in the form of a charge on parasitic
capacitors associated with the MOSFET for an extended period of time, thus causing an
increase in the leakage current. In contrast, the dynamic implementation stores the
data in a similar manner but only for a short period of time, in the range of milliseconds,
resulting in a reduction of leakage current [10]. Due to this advantage, dynamic
higher performance and lower power dissipation, making them useful in designing high
The type of DFF used in this research is the semi-dynamic DFF (SDFF), shown in
Figure 2.16, which was first introduced by Klass [9]. The SDFF consists of both static
and dynamic circuits. The main reasons for choosing this type of DFF over the others are
short latency, small clock load, small area, and a single-phase clock scheme [9]. An
additional feature of the SDFF is that it can incorporate various logic functions with small
Figure 2.16: Schematic of Semi-dynamic DFF (SDFF).
• The SDFF has short latency because the SDFF is refreshed periodically by the
clock signal, thus, keeping the SDFF in an idle state for very short period of
time.
• The SDFF implemented in this research has a small clock load of five
count, but the drawback with such implementation is a high clock load of
overlapping between clock and delayed clock, thus, providing a direct path
between the input and output signal, destroying the state of the circuit [10].
To avoid this, a TSPC scheme is used, which uses a single clock to drive the
entire circuit.
Thus, taking into account the above features, the SDFF is implemented using a
There are three timing parameters associated with DFF as shown in Figure 2.17 and
• Setup Time (Ts): Defined as the minimum time interval by which a transition of
the input data signal must precede the rising edge of the clock, such that the input is
reliably sampled.
• Hold Time (Th): Defined as the minimum time interval for which the input data
signal must remain stable after the rising edge of the clock, such that the
• Clock to output delay (Tc-q): Defined as the time interval between the rising
edge of the clock and the resulting transition of the output signal when the input data
is reliably sampled.
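[Figure 2.17: DFF symbol (D, Q, Clock) and timing diagram of Clock, Data, and Q, showing the setup time Ts, hold time Th, and clock-to-output delay Tc-q.]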
If a violation of the setup or hold time takes place, then the DFF output is not
guaranteed to be a valid sample of the input data, possibly leading to the wrong logic level. This
happens because the input data does not have enough time to toggle between the high and low
logic levels. Thus, the DFF output goes into an idle state due to the setup and hold time
violations and holds the value obtained from the previous successful sampling event.
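The setup and hold requirement can be pictured as a forbidden window around each sampling clock edge. The following is a minimal sketch of that check, not the calibration circuit of this thesis. The function name and the 20 ps setup and 15 ps hold times are hypothetical, and the 333 ps bit period is only an approximation of a 3 Gbps data rate.

```python
def violates_timing(t_clk, data_edges, t_setup, t_hold):
    """Return True if any data transition lies inside the window
    (t_clk - t_setup, t_clk + t_hold) around the sampling clock edge."""
    return any(t_clk - t_setup < t < t_clk + t_hold for t in data_edges)


# All times are in picoseconds. Setup and hold values are hypothetical.
data_edges = [0, 333, 666]
mid_bit = violates_timing(166, data_edges, t_setup=20, t_hold=15)
near_edge = violates_timing(330, data_edges, t_setup=20, t_hold=15)
print(mid_bit, near_edge)
```

A clock edge near the middle of the bit leaves the largest margin on both sides, whereas a clock edge close to a data transition places that transition inside the window.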
Figure 2.18 shows the sampling of the input data by the rising edge of the clock in
two scenarios. In the first scenario, DFF1 samples the input data on the rising edge of the
clock during the data transition. In this case, the setup and hold time requirements are
violated; thus, the result at DFF1’s output may be an incorrect logic level. In the
second scenario, DFF2 samples the input data on the rising edge of the clock at the
middle of the data bit interval. In this case, there is no violation of timing parameters;
thus, the result at DFF2’s output will be a correct logic level. For the system to
perform efficiently, the active edge of the clock should sample the input data at the
middle of the data bit interval to allow maximum margin for the setup and hold times. This
idea is referred to as the metastable concept in this research and results in increasing the
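[Figure 2.18: Sampling of the input data by the rising edge of the clock in two scenarios: DFF1 sampling during the data transition and DFF2 sampling at the middle of the data bit interval.]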
In this research, the metastable circuit was designed and the clock delay cell from the
metastable circuit was placed in the CDR system, such that the clock always samples the
data at the middle of the input data bit interval. The metastable circuit is explained in
detail in section 4.5. This circuit first converts the input data into glitch data, whose
width or metastable window is equal to the sum of setup (Ts) and hold (Th) times, shown
in Figure 2.19.
Figure 2.19: Glitch data generation from the input data.
The metastable window’s rising edge (left leg) and falling edge (right leg) each can
be varied or aligned using three digital bits, with a total of six digital bits to vary the
Figure 2.20: Variation in metastable window of glitch data using six digital bits.
The rising edge of the clock is aligned at the center of the metastable window by
using a five-bit variable clock delay cell as shown in Figure 2.21. Thus, the clock delay
cell from the metastable circuit is placed in the designed CDR system.
Figure 2.21: Variation of the clock using five digital bits.
Chapter 3. CDR MATLAB and Simulink Models
The CDR system is modeled using MATLAB and Simulink software as shown in
Figure 3.1. The modeled CDR system has an input data rate of 3 Gbps and the voltage
• Charge pump
• Low-pass filter
[Figure 3.1: Simulink model of the CDR system: PRBS7 source, Alexander phase detector (PD), charge pump (CP), low pass filter transfer function block, and voltage controlled oscillator (VCO).]
systems. A PRBS is a deterministic bit sequence that appears random and repeats itself. PRBSs are used in testing
hardware circuits that are used in communication systems. PRBSs are generated by
shifting bits through a number (n) of cascaded shift registers. Some of the shift
registers’ outputs are added with a modulo-2 function and fed to the input of the first shift
register [12]. The PRBS-7 consists of seven shift registers as shown in Figure 3.2 and
There are various types of PRBS data generators, such as PRBS-7, PRBS-9, PRBS-15,
and PRBS-31, which are used depending on the application. Table 3.1 presents the properties
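[Figure 3.2: PRBS-7 generator built from seven cascaded unit-delay (1/Z) shift registers with modulo-2 feedback and a single output.]
Table 3.1 (PRBS-7 row): order 7, polynomial X^7 + X^6 + 1, sequence length 127, 64 ones, 63 zeros.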
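The shift-register description above can be sketched in a few lines of Python. This is an illustrative model only, not the Verilog-A generator listed in the appendix; the function name, the all-ones seed, and the choice of taking the output from the last register are assumptions.

```python
def prbs7(n_bits, seed=0x7F):
    """Generate n_bits of a PRBS-7 sequence from a 7-bit shift register.

    The feedback bit is the modulo-2 sum (XOR) of the outputs of the
    seventh and sixth registers, corresponding to X^7 + X^6 + 1.
    """
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero, or the register stays at zero")
    bits = []
    for _ in range(n_bits):
        feedback = ((state >> 6) ^ (state >> 5)) & 1
        bits.append((state >> 6) & 1)            # output of the last register
        state = ((state << 1) | feedback) & 0x7F  # shift and insert feedback
    return bits


sequence = prbs7(254)
first_period = sequence[:127]
second_period = sequence[127:]
print(len(first_period), sum(first_period), first_period == second_period)
```

The printed values report the period length, the number of ones in one period, and whether the sequence repeats after 127 bits.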
3.1.1 Alexander phase detector (PD)
The Alexander PD is a binary phase detector and provides inherent data retiming
for the CDR system [2]. The Alexander PD is used in high-speed CDR circuits that
operate at GHz speed. It consists of four DFFs and two XOR gates, as
shown in Figure 3.3, and its characteristic is shown in Figure 3.4. The Alexander PD uses
three data samples, S1 to S3, that are taken on three consecutive clock edges. The
the input data, and 2) Whether the clock is earlier or later than the input data.
When there is no transition in the input data, all three samples will have equal
values and no action is taken by the Alexander PD. If the falling edge of the clock leads
(is “early”), then the first two samples S1 and S2 will have equal values and the last
sample S3 will have a value unequal to that of the first two samples. Conversely, if the
falling edge of the clock lags (is “late”), then the last two samples S2 and S3 will have
equal values and the first sample S1 will have a value unequal to that of the last two samples.
The decisions of the Alexander PD depend on the values of the three samples (S1, S2, and
In Figure 3.3, the first flip flop (FF1) samples the input data at S1 and S3 on the rising
edge of the clock and the second flip flop (FF2) delays the output of the first flip flop
(FF1) by one clock cycle. The third flip flop (FF3) samples the input data at S2 on the
falling edge of the clock and the fourth flip flop (FF4) delays the output of the third flip
As seen from the waveform of Figure 3.3, for the early case, the FF1 samples the high
data level (logic one) at the first rising edge of the clock. At the second rising edge of the
clock, the FF2 performs two functions: 1) Produces the replica of the first sample (S1)
delayed by one clock cycle, at the output of the FF2, and 2) Samples the low data level
(logic zero).
The FF3 samples the high data level (logic one) at the first falling edge of the clock.
At the next rising edge of the clock, the FF4 produces the replica of the second sample
(S2) delayed by half a clock cycle, at its output. The clock phases of all the four DFFs
should be such that the three samples S1, S2, and S3 reach a valid logic level for
comparison at t = T1 and remain constant for one clock period. Once the three samples
S1, S2, and S3 reach a valid logic level and remain constant for one clock period, the
XOR gates produce valid logic levels at their outputs. The same process applies in reverse for
[Figure 3.3: Alexander phase detector (FF1 to FF4 with outputs Q1 to Q4 and XOR gates driving the Early and Late nodes) and its waveforms for the early and late cases, showing Data, Clock, Q1 to Q4, the samples S1 to S3, and the Early and Late outputs up to t = T1.]
Figure 3.4: Ideal characteristic of Alexander PD.
S1 S2 S3 Decision
0 0 0 Cannot determine whether the clock is earlier or later than the data.
1 1 1 Cannot determine whether the clock is earlier or later than the data.
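The early and late rules described above can be written as a small decision function. This is a behavioral sketch only and not the transistor-level detector of this thesis; the function name and the choice to return "none" when both XOR outputs are high (the patterns 0 1 0 and 1 0 1) are assumptions made for illustration.

```python
def alexander_decision(s1, s2, s3):
    """Classify three consecutive samples as 'early', 'late', or 'none'.

    early: S1 == S2 and S3 differs (edge sample taken before the transition)
    late:  S2 == S3 and S1 differs (edge sample taken after the transition)
    none:  no transition, or a pattern that gives no clear decision
    """
    s1_differs = s1 ^ s2   # XOR of the first and second samples
    s3_differs = s2 ^ s3   # XOR of the second and third samples
    if s3_differs and not s1_differs:
        return "early"
    if s1_differs and not s3_differs:
        return "late"
    return "none"


for samples in [(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0), (1, 1, 1)]:
    print(samples, alexander_decision(*samples))
```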
3.1.2 Charge pump (CP)
The function of the charge pump is to convert the phase difference between the two
input signals into an electrical parameter, such as a voltage, which controls the oscillating
frequency of the VCO [13]. The charge pump circuit is modeled in Simulink by
using a gain block and an adder block, as shown in Figure 3.5. A gain block holds the
value of the charge pump current Icp that charges or discharges the capacitor, when early
The Icp value is set to 800 µA and is divided by 2π to cancel the radians unit of the
Kvco (gain) value of the VCO. When the early signal is at logic level one, the capacitor is
charged to 127.32 µV (800 µA / 2π), and when the late signal is at logic level one, the
capacitor is discharged to -127.32 µV, although in reality the capacitor discharges to zero volts.
The charging and discharging currents of the charge pump circuit are easily cancelled in the
Simulink model, but in reality leakage current flows through the circuit, thereby creating
[Figure 3.5: Simulink model of the charge pump, with Early input and data type conversion blocks.]
The low pass filter is modeled in Simulink as per Figure 2.6 (the resistor (R)
is in series with capacitor (C1), and both are in parallel with capacitor (C2)) by using a
transfer function block. The transfer function of the LPF circuit is derived as follows and is
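$$H(s) = \left(R + \frac{1}{C_1 s}\right) \parallel \frac{1}{C_2 s} \tag{3.1}$$

$$H(s) = \frac{\left(\dfrac{1 + R C_1 s}{C_1 s}\right)\left(\dfrac{1}{C_2 s}\right)}{\left(\dfrac{1 + R C_1 s}{C_1 s}\right) + \left(\dfrac{1}{C_2 s}\right)} \tag{3.2}$$

$$H(s) = \frac{\dfrac{1 + R C_1 s}{s^2 C_1 C_2}}{\dfrac{C_1 s + R C_1 C_2 s^2 + C_2 s}{s^2 C_1 C_2}} \tag{3.3}$$

$$H(s) = \frac{1 + R C_1 s}{R C_1 C_2 s^2 + (C_1 + C_2)s} \tag{3.4}$$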
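The derivation can be checked numerically by evaluating the parallel combination of equation (3.1) and the closed form of equation (3.4) at a test frequency. The sketch below does this in Python; the function names and the component values (1 kΩ, 1 nF, 100 pF) are hypothetical and are not the values used in the designed CDR model.

```python
import math


def lpf_parallel(s, r, c1, c2):
    """Impedance of (R in series with C1) in parallel with C2, as in (3.1)."""
    z_series = r + 1.0 / (c1 * s)
    z_c2 = 1.0 / (c2 * s)
    return z_series * z_c2 / (z_series + z_c2)


def lpf_closed_form(s, r, c1, c2):
    """Closed-form transfer function of equation (3.4)."""
    return (1.0 + r * c1 * s) / (r * c1 * c2 * s**2 + (c1 + c2) * s)


# Hypothetical component values, evaluated at s = j*2*pi*1 MHz.
s_test = 1j * 2.0 * math.pi * 1e6
h_parallel = lpf_parallel(s_test, r=1e3, c1=1e-9, c2=1e-10)
h_closed = lpf_closed_form(s_test, r=1e3, c1=1e-9, c2=1e-10)
print(abs(h_parallel), abs(h_closed))
```

The two magnitudes agree to within floating-point rounding, which is consistent with the algebra in equations (3.2) and (3.3).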
The VCO is modeled in Simulink by using adder, constant, and math
function blocks, as shown in Figure 3.6. The VCO model has two variables,
named Kvco and fo. Kvco is the gain of the VCO in rad/s per volt, and fo is the free-running
oscillation frequency of the VCO in Hz. The frequency of the clock generated by the VCO varies
linearly with the voltage at the input terminal Vcontrol. When Vcontrol is zero, the VCO
produces a waveform that oscillates at frequency fo; as Vcontrol increases, the
oscillation frequency of the waveform generated by the VCO increases
linearly.
[Figure 3.6: VCO model in Simulink. Blocks: Constant (2*π*frequency*Ts), Gain (2*π*Kvco*Ts) applied to Vcontrol, Adder, Unit Delay, Math Function (mod 2*π), Trigonometric Function (sin), and Relay driving the VCO output.]
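The block chain of the VCO model (constant, gain, adder, unit delay, mod, sin, relay) amounts to a discrete-time phase accumulator. The Python sketch below is an illustrative approximation of that chain, not the Simulink model itself. It assumes Kvco is expressed in Hz/V, borrows fo = 2.75 GHz and Kvco = 500 MHz/V from equation (4.2) in Chapter 4, and uses a hypothetical sample time Ts of 1 ps.

```python
import math

def vco_relay_samples(vcontrol, f0_hz, kvco_hz_per_v, ts, n):
    """Return n relay (0/1) output samples of a phase-accumulator VCO."""
    phase = 0.0
    samples = []
    for _ in range(n):
        # Adder: constant increment (2*pi*f0*Ts) plus control increment (2*pi*Kvco*Ts*Vcontrol)
        step = 2 * math.pi * f0_hz * ts + 2 * math.pi * kvco_hz_per_v * ts * vcontrol
        # Unit delay feedback and mod block: accumulate and wrap the phase to [0, 2*pi)
        phase = (phase + step) % (2 * math.pi)
        # sin block followed by a relay that squares up the sinusoid
        samples.append(1 if math.sin(phase) >= 0.0 else 0)
    return samples

def count_rising_edges(samples):
    """Count 0 -> 1 transitions between consecutive samples."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a == 0 and b == 1)

# 10 ns of output at Vcontrol = 0 V (2.75 GHz) and at Vcontrol = 0.5 V (3 GHz)
edges_free_running = count_rising_edges(vco_relay_samples(0.0, 2.75e9, 500e6, 1e-12, 10000))
edges_controlled = count_rising_edges(vco_relay_samples(0.5, 2.75e9, 500e6, 1e-12, 10000))
```

Raising Vcontrol increases the per-sample phase step, so more clock edges fit in the same time window, which is the linear frequency dependence described above.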
3.2 Phase lock loop (PLL) dynamics
The PLL is known as a second order system because it consists of the two dominant
poles. The first pole is contributed by a combination of the charge pump and a low pass
filter and the second pole is contributed by the VCO. The PLL in terms of the control
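[Figure: PLL control-loop block diagram with signals Øi, Øo, Øn, and Vc, and blocks KØ, KI/s, and KV/s.]

$$H_{LPF}(s) = K_P + \frac{K_I}{s} \tag{3.5}$$

$$K_I = \frac{1}{C}, \quad K_P = R \tag{3.6}$$

$$K_\phi = \frac{I_{CP}}{2\pi} \tag{3.7}$$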
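The open and closed loop transfer functions of the PLL are given by equations (3.8)
and (3.9):

$$G(s) = \frac{K_{VCO} K_\phi (K_I + K_P s)}{s^2} \tag{3.8}$$

$$H(s) = \frac{K\left(1 + \dfrac{s}{z}\right)}{s^2 + \dfrac{K}{z} s + K} \tag{3.9}$$

$$K = K_{VCO} K_\phi K_I \quad \text{and} \quad z = \frac{K_I}{K_P} \tag{3.10}$$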
The open and closed loop functions of the PLL in terms of the cut-off frequency of
the PLL loop ωn and the damping factor ξ are given by equations (3.11) and (3.12).
$$G(s) = \frac{2 \xi \omega_n s + \omega_n^2}{s^2} \tag{3.11}$$

$$H(s) = \frac{2 \xi \omega_n s + \omega_n^2}{s^2 + 2 \xi \omega_n s + \omega_n^2} \tag{3.12}$$
The cut-off frequency of the PLL loop ωn and the damping factor ξ can be formulated
as follows:
$$\omega_n = \sqrt{K_V K_\phi K_I} = \sqrt{\frac{I_{CP} K_V}{2\pi C}} \tag{3.13}$$

$$\xi = \frac{\omega_n}{2} \frac{K_P}{K_I} = \frac{R}{2} \sqrt{\frac{I_{CP} K_V C}{2\pi}} \tag{3.14}$$
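Equations (3.13) and (3.14) can be evaluated directly. The Python sketch below does so under stated assumptions: the function name `loop_dynamics` is hypothetical, the VCO gain is assumed to be given in Hz/V and is converted to rad/s/V, and the single capacitor C of equation (3.6) is taken to be C1. The example values (Icp = 800 µA, R = 1 kΩ, C1 = 20 pF, Kvco = 500 MHz/V) are borrowed from Chapter 4, because the VCO gain of the Simulink model is not listed in this section.

```python
import math

def loop_dynamics(icp, kvco_hz_per_v, r, c):
    """Evaluate omega_n (3.13) and xi (3.14) from the loop parameters."""
    k_v = 2 * math.pi * kvco_hz_per_v   # VCO gain in rad/s/V (assumed unit conversion)
    k_phi = icp / (2 * math.pi)         # equation (3.7)
    k_i = 1.0 / c                       # equation (3.6)
    k_p = r                             # equation (3.6)
    omega_n = math.sqrt(k_v * k_phi * k_i)   # equation (3.13)
    xi = 0.5 * omega_n * k_p / k_i           # equation (3.14)
    return omega_n, xi

omega_n, xi = loop_dynamics(icp=800e-6, kvco_hz_per_v=500e6, r=1e3, c=20e-12)
```

With these example values the sketch gives ωn of about 1.41e8 rad/s and ξ of about 1.41; other parameter sets can be substituted in the call.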
From the dynamics and formulations mentioned above, the loop bandwidth and the
phase margin can be plotted in order to determine the stability and the allowable
The variables of the designed CDR model are initialized as shown in the Table 3.3.
To test the stability of the designed CDR model, a frequency step in the input data is
performed, i.e., the input data frequency is changed from 3 Gbps to 2.5 Gbps at 1 µs and
the appropriate change in the value of the control voltage (Vcontrol) of the VCO is
observed.
Equation (3.15) is used to manually calculate the value of the Vcontrol of the VCO:

$$V_{control} = \frac{f_{vco} - f_o}{K_{vco}} \tag{3.15}$$

where fo is the oscillating frequency of the VCO, fvco is the frequency of the
clock generated by the VCO, and Kvco is the gain of the VCO.
Variables Value
Icp 800 µA
R 1 kΩ
C1 1 pF
C2 C1/10
The designed CDR model is simulated for four different cases as follows:
• Case 1: The designed CDR model is simulated using an ideal DFF from the
Simulink library in the Alexander PD model for 4 µs. The Vcontrol of the VCO
during the frequency step in the input data from 3 Gbps to 2.5 Gbps is plotted
in the Figure 3.8. The lock time is defined as the time taken by the Vcontrol of
the VCO to settle to a constant value, when the frequency step in the input
data is performed. The lock time of the designed CDR model was 0.110 µs.
The eye diagram of the recovered clock shows the peak-to-peak jitter present
in the recovered clock and, for the designed CDR model, is shown in Figure
3.9. The peak-to-peak jitter observed in the recovered clock of the designed
CDR system was 0.03 UI for the 3 Gbps input data and 0.033 UI when the
• Case 2: The designed CDR system is simulated using the metastable DFF,
modeled in Simulink, for 4 µs. The SDFF designed in the transistor level
using 45 nm technology in the Cadence Virtuoso has the setup time (Ts (actual))
equal to -11.57 ps, the hold time (Th (actual)) equal to 54.16 ps, and the clock-to-
output delay (Tc-q) equal to 84.64 ps (explained in detail in section 4.1 and
shown in Table 4.1). The pulse width of the metastable window of the input
$$T_s = T_h = \frac{T_{s(actual)} + T_{h(actual)}}{2} = 21.29\ \text{ps} \tag{3.16}$$
The designed CDR model is simulated and the Vcontrol of the VCO is plotted in
Figure 3.10. The lock time of the CDR model was 1.7 µs. The peak-to-peak
jitter observed in the recovered clock was 0.04 UI for the 3 Gbps input data
and 0.037 UI when the input data frequency was changed to 2.5 Gbps at 1 µs.
Figure 3.10: Control voltage of the VCO for Case 2.
• Case 3: The designed CDR system is simulated using the metastable DFF,
modeled in Simulink, for 4 µs. The setup time (Ts) is initialized to a greater
value than the hold time (Th) in the modeled metastable DFF as follows:
The designed CDR model is simulated and the Vcontrol of the VCO is plotted in
Figure 3.11. This figure shows that the designed CDR model does not lock
when the input frequency is changed from 3 Gbps to 2.5 Gbps at 1 µs. The
peak-to-peak jitter observed in the recovered clock was 0.0612 UI for the 3
Figure 3.11: Control voltage of the VCO for Case 3.
• Case 4: The designed CDR system is simulated using the metastable DFF,
modeled in Simulink, for 4 µs. The hold time (Th) is initialized to a greater
value than the setup time (Ts) in the modeled metastable DFF as follows:
The designed CDR model is simulated and the Vcontrol of the VCO is plotted in
Figure 3.12. This figure shows that the designed CDR model does not lock
when the input frequency is changed from 3 Gbps to 2.5 Gbps at 1 µs. The
peak-to-peak jitter observed in the recovered clock was 0.05 UI for the 3 Gbps
input data.
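The symmetric timing window used in Case 2 (equation 3.16) is the average of the actual setup and hold times. The short Python sketch below reproduces that arithmetic with the values quoted in Case 2; the function name `symmetric_window` is hypothetical.

```python
def symmetric_window(ts_actual_ps, th_actual_ps):
    """Split the total metastable window equally between setup and hold (equation 3.16)."""
    half = (ts_actual_ps + th_actual_ps) / 2
    return half, half   # (Ts, Th) in picoseconds

# Actual SDFF timing parameters quoted in Case 2
ts_ps, th_ps = symmetric_window(-11.57, 54.16)
```

Both returned values are approximately 21.29 ps, and their sum equals the total window of the transistor-level SDFF.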
The results of the four different cases are tabulated in Table 3.4:
Case Type of DFF used Lock Time Peak-to-Peak Jitter Peak-to-Peak Jitter
In summary, as seen from Table 3.4, Case 1 is the best case, as the designed CDR
system has the minimum lock time and peak-to-peak jitter compared to Cases 2, 3, and
4. In Case 1, the ideal DFF from the Simulink library was used in the Alexander phase
detector. The timing parameters of the ideal DFF from the Simulink library, namely the
setup time (Ts), the hold time (Th), and the clock-to-output delay (Tc-q), are all equal to
zero. However, when the DFF is designed at the transistor level, these timing parameters
are no longer zero. Thus, Case 1 is not achievable in practice. The timing parameters of
the DFF are taken into consideration in Cases 2, 3, and 4. Table 3.4 shows that Case 2 is
the best case among Cases 2, 3, and 4 in terms of minimum lock time and minimum peak-to-peak
Chapter 4. CDR modeling using Cadence
This chapter presents the transistor-level design in 45 nm technology and the
modeling of the CDR system using the Verilog-A language in Cadence Virtuoso 6.1.5.
The designed CDR system is operated at an input data rate of 3 Gbps, as shown in
Figure 4.1. The two PRBS blocks at the input of the designed CDR system supply the
two input data streams in a PRBS pattern. The block named “PRBS7-1” supplies input
data at 3 Gbps, and the block named “PRBS7-2” supplies input data at 2.9 Gbps.
The phase detector block is the Alexander PD, consisting of four DFFs and two
XOR gates designed at the transistor level using 45 nm technology. The transistor-level
design of the Alexander PD is explained in detail in sections 4.1, 4.2, and 4.3. The
charge pump (CP) block is modeled using the Verilog-A language and has one
input variable named Icp (charge pump current). The low pass filter (LPF) block consists
of the resistor (R) connected in series with the capacitor (C1), and both are placed in
parallel with the capacitor (C2). The VCO is also modeled using the Verilog-A language
and has two input variables named fo (oscillating frequency of the VCO) and Kvco (gain
of the VCO). The slicer block is used to convert the sinusoidal output of the VCO into
Finally, the data delay circuit is designed at the transistor level using 45 nm
technology and is explained in detail in section 4.4. The Verilog-A codes and the
transistor sizes of the circuits of the designed CDR system are provided in the sections
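[Figure 4.1: Block diagram of the designed CDR system, with PRBS 7-1 and PRBS 7-2 input blocks (input data at 3 Gbps and 2.9 Gbps) and loop filter capacitors C1 and C2.]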
4.1 SDFF
The Semi-dynamic D flip flop (SDFF) is used in the designed CDR system due to the
benefits discussed in section 2.3.1. The SDFF circuit is divided into four parts named A,
• Part A is a dynamic inverter-I consisting of one PMOS transistor (M1) in the
pull up network and three NMOS transistors (M2, M3, and M4) connected in
series in the pull down network. The function of the dynamic inverter-I is to
sample the inverted input data (D) on the node X when the clock is in the
evaluation phase.
pull up network and two NMOS transistors (M6 and M7) connected in series in
the pull down network. The function of the dynamic inverter-II is to sample
the inverted value present at node X on the output (Q) when the clock is either
to back configuration. The function of the static keeper is to hold the voltage
of the node X and the output (Q) at the appropriate logic level. Due to the
presence of the two static keepers at node X and output (Q) and the two
dynamic inverters, the DFF is called a semi-dynamic D flip flop (SDFF).
• Part D is a NAND gate driven by the clock signal, which is delayed by the series
combination of the two inverters. The NAND gate, along with the series
combination of the two inverters, forms the glitch generator circuit. The glitch
generator circuit generates a narrow clock pulse around the rising edge of
the clock signal, the width of which corresponds to the sum of the
NAND gate delay and the two inverter delays. The purpose of the glitch generator
the noise.
[Figure: SDFF schematic showing Parts A to D, transistors M1 to M7, node X, inputs Clock and Data, and outputs Q and Qbar.]
The working of the SDFF is divided into two phases: the Precharge phase and the
• In the Precharge phase, the clock is at logic zero, which turns on the PMOS
transistor (M1) and turns off the NMOS transistors (M4 and M6). The node
X is charged to VDD (logic one) through the PMOS transistor (M1). As node
X is one of the inputs of the NAND gate, whose other input is zero from the
clock, the NAND gate’s output is set to the logic one and turns on the NMOS
transistor (M2). In precharge phase, the output (Q) is cut off from the first
stage and is held to either a previous or the random value, causing the
Since the drain terminal of the NMOS transistor (M2) is connected to the
node X, the voltage at node X is pulled below the VDD. As the node X drives
the dynamic inverter-II, the load capacitor present at the output (Q) will not
get charged to the VDD and will result in low noise margin. To avoid the
problem of the low noise margin, the static keepers are used at node X and
output (Q) to achieve the rail to rail full supply voltage swing.
• In the Evaluation phase, the clock makes the transition from logic zero to
logic one and turns on the NMOS transistors (M4 and M6) and turns off the
PMOS transistor (M1). As soon as the clock reaches the switching threshold
of the dynamic inverter-II, the NMOS transistor (M6) turns on and the output
(Q) will get discharged all the way to GND (logic zero). In evaluation phase,
the circuit behaves as a transparent circuit and the output of the NAND gate
remains at logic one for a short interval of time (corresponding to the total
1) Latching a logic zero: When the input data is at a logic zero, the
NMOS transistor (M3) is turned off and as the clock is high, the output
2) Latching a logic one: When the input data is at logic one, the
NMOS transistor (M3) is turned on. Due to the existence of the direct
path between the node X and the GND, the node X is discharged all
the way to the GND, thus, turning on the PMOS transistor (M5) and
The output waveform of the designed SDFF circuit is shown in Figure 4.3 and the
transistor sizes of the designed SDFF circuit are mentioned in section A.2.1 of the
Appendix. The timing parameters of designed SDFF circuit are presented in Table 4.1.
Figure 4.3: Output waveform of the designed SDFF.
The XOR gate accepts two input signals and outputs logic one when the two
inputs are unequal; otherwise, it outputs logic zero. The XOR circuit is
[Figure: XOR gate schematic with transistors M1 to M8, inputs A and B and their complements, and the Output node.]
The output waveform of the XOR circuit is shown in Figure 4.5 and the transistor
4.3 Alexander PD
The Alexander PD consists of the above designed four DFFs and two XOR gates as
shown in Figure 3.3. The main function of an Alexander PD is to determine whether the
clock is earlier or later than the input data signal as explained in section 3.1.3.
4.4 Inverter
The function of the inverter is to invert the incoming data signal. Inverters are
also used as buffers in various analog and digital circuits. The inverter consists of one
NMOS (M1) and one PMOS (M2) transistor as shown in Figure 4.6 and the output
[Figure 4.6: Inverter schematic with NMOS transistor M1, PMOS transistor M2, Input Signal, and Output Signal.]
Figure 4.7: Output waveform of the designed inverter.
When the input data switches from the low logic level to the high logic level, the time
interval between the input data and the output of the inverter is called the low-high
propagation delay, tplh. Conversely, when the input data switches
from the high logic level to the low logic level, the time interval between the input data
and the output of the inverter is called the high-low propagation delay, tphl.
To have equal propagation delays (tplh = tphl) and a switching threshold of 0.5 V, the
width of the PMOS transistor (M2) is sized to 1.44 times the width of the NMOS
transistor (M1), as shown in Figure 4.8. The transistor sizes of the inverter are mentioned
Figure 4.8: Propagation delay waveform of the designed inverter.
The concept of metastability is explained in section 2.3. This section presents the
design of the metastable circuit at the transistor level using 45 nm technology. The
metastable circuit consists of a glitch generator circuit and a variable clock delay cell.
The glitch generator circuit consists of a variable data delay cell, the chain of inverters,
Variable delay cells are widely used in integrated circuits to delay the active
edge of the clock or of any random signal [15]. The variable data delay cell is designed
using a current-starved circuit, as shown in Figure 4.10. The
controlling transistors (Mn0, Mn1, Mn2, and Mn3, and Mp0, Mp1, Mp2, and Mp3) at
the source nodes of the transistors M1 and M2 are turned on by applying the binary
vector to the input terminal [15]. To achieve a binary incremental delay of 2 ps in the
input data, the controlling transistors are sized in a binary fashion. The pulse width of
the data pulse (metastable window) generated by the glitch generator circuit is varied by
six digital bits (the left and right legs are each varied using three digital bits), as shown in
Figures 4.11 and 4.12. The total pulse width obtained for each binary vector is tabulated
in Tables 4.2 and 4.3. The transistor sizes of the designed data delay cell are mentioned in
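[Figure 4.10: Current-starved variable data delay cell with controlling transistors Mp0 to Mp3, transistors M1 and M2, Input data, and Delayed data.]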
Table 4.2: Binary vectors to vary the left leg of the metastable window.
0 0 0 38.27
0 0 1 40.37
0 1 0 42.54
0 1 1 44.64
1 0 0 46.25
1 0 1 48.35
1 1 0 44.55
1 1 1 52.62
Table 4.3: Binary vectors to vary the right leg of the metastable window.
1 1 1 38.27
1 1 0 40.37
1 0 1 42.54
1 0 0 44.64
0 1 1 46.25
0 1 0 48.35
0 0 1 44.55
0 0 0 52.62
The clock delay cell is designed using two inverters connected back to back,
where each inverter contains PMOS transistors connected in parallel,
as shown in Figure 4.13. The clock delay cell delays the clock with a
resolution of 2 ps by using five digital bits, as shown in Figure 4.14. The total delay in
the clock caused by each binary vector is tabulated in Table 4.4. The transistor sizes of
the clock delay cell are mentioned in section A.2.5 of the Appendix.
[Figure 4.13: Clock delay cell with transistors M1 to M7 and M7', Input Binary Vector, Clock input, and Delayed Clock output.]
Table 4.4: Binary vectors to vary the clock.
1 1 1 1 1 42.15
1 1 1 1 0 17.88
1 1 1 0 1 25.91
1 1 1 0 0 3.17
1 1 0 1 1 33.85
1 1 0 1 0 11.41
1 1 0 0 1 17.61
1 1 0 0 0 17.88
1 0 1 1 1 37.63
1 0 1 1 0 14.89
1 0 1 0 1 21.39
1 0 1 0 0 17.88
1 0 0 1 1 29.33
1 0 0 1 0 6.59
1 0 0 0 1 13.09
1 0 0 0 0 17.88
0 1 1 1 1 39.54
0 1 1 1 0 18.8
0 1 1 0 1 23.3
0 1 1 0 0 17.88
0 1 0 1 1 31.24
0 1 0 1 0 8.5
0 1 0 0 1 15
0 1 0 0 0 17.88
0 0 1 1 1 35.02
0 0 1 1 0 12.28
0 0 1 0 1 18.78
0 0 1 0 0 17.88
0 0 0 1 1 26.72
0 0 0 1 0 17.88
0 0 0 0 1 17.88
0 0 0 0 0 54.41
The designed CDR system without the clock delay cell was simulated for 3 µs with
a frequency step in the input signal from 3 Gbps to 2.9 Gbps at 1 µs. The variables
of the designed CDR system are initialized as shown in Table 4.5. When the
frequency of the input data changes from 3 Gbps to 2.9 Gbps, the control voltage
(Vcontrol) of the VCO makes the transition from 500 mV to 300 mV (by
equations 4.1, 4.2, and 4.3). The transition of the Vcontrol of the VCO is shown in Figure
4.15 and the eye diagram is shown in Figure 4.16. The lock time observed was 0.32 µs
and the peak-to-peak jitter observed in the recovered clock of the designed CDR system
$$V_{control} = \frac{f_{VCO} - f_O}{K_{VCO}} \tag{4.1}$$

$$V_{control} = \frac{3\ \text{GHz} - 2.75\ \text{GHz}}{500\ \text{MHz/V}} = 500\ \text{mV} \tag{4.2}$$
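The control-voltage calculation of equation (4.1) can be checked for both data rates mentioned in the text with a small Python sketch. The function name `control_voltage` is hypothetical; fo = 2.75 GHz and Kvco = 500 MHz/V are the values appearing in equation (4.2), and the locked VCO frequency is assumed to equal the data rate.

```python
def control_voltage(f_vco_hz, f0_hz=2.75e9, kvco_hz_per_v=500e6):
    """Equation (4.1): Vcontrol = (fVCO - fO) / KVCO, in volts."""
    return (f_vco_hz - f0_hz) / kvco_hz_per_v

v_at_3g = control_voltage(3.0e9)     # locked to the 3 Gbps input data
v_at_2g9 = control_voltage(2.9e9)    # locked to the 2.9 Gbps input data
```

The two results, 0.5 V and 0.3 V, match the 500 mV to 300 mV transition of Vcontrol described for the frequency step.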
Variables Value
Icp 800 µA
R 1 kΩ
C1 20 pF
C2 C1/40
Figure 4.15: Simulation result of the designed CDR system.
The designed CDR system is simulated with the variable clock delay cell in the
feedback loop as shown in Figure 4.1. The designed CDR system was simulated for 2
The width of the data pulse generated by the glitch generator circuit should be equal
to 44.52 ps (the sum of the setup time (Ts) and the hold time (Th), mentioned in Table 4.1).
From Tables 4.2 and 4.3, the data pulse with a width of 44.52 ps is achieved by
using the [0 1 0 1 1 1] binary vector. The rising edge of the clock is delayed or aligned
with the data pulse by using a five-bit variable clock delay cell, as explained in the
following cases:
• Case 1: In this case, the designed CDR system was simulated by making the
setup time (Ts) equal to the hold time (Th) of the SDFF. The setup time (Ts) is
made equal to the hold time (Th) by delaying or aligning the rising edge of the
clock to the center of the metastable window as shown in Figure 4.17. Thus,
the rising edge of the clock is delayed by 22.26 ps (half of the data pulse
4.4. Note, this binary vector provides the delay of approximately 25.43 ps,
The peak-to-peak jitter observed in the recovered clock of the designed CDR
[Figure 4.17: Data and Clock waveforms with the rising clock edge centered in the metastable window (Ts ≈ Th).]
• Case 2: In this case, the designed CDR circuit was simulated by making the
setup time (Ts) greater than the hold time (Th) of the SDFF. The setup time
(Ts) is made greater than the hold time (Th) by delaying or aligning the rising
edge of the clock as shown in Figure 4.18. Hence, the rising edge of the clock
Table 4.4. The peak-to-peak jitter observed in the recovered clock of the
[Figure 4.18: Data and Clock waveforms with the rising clock edge placed so that Ts > Th.]
• Case 3: In this case, the designed CDR system was simulated by making the
hold time (Th) greater than the setup time (Ts) of the SDFF. The hold time
(Th) is made greater than the setup time (Ts) by aligning or delaying the clock
as shown in Figure 4.19. Hence, the rising edge of the clock is delayed by
14.98 ps by using the [1 1 1 1 1] binary vector as shown in Table 4.4. The
[Figure 4.19: Data and Clock waveforms with the rising clock edge placed so that Ts < Th.]
Chapter 5. Conclusion
The CDR system was first modeled using the Simulink software and simulated for
three different cases: the equal setup and hold time, the setup time greater than the hold
time, and the hold time greater than the setup time. The results in Table 3.4 show that
when setup time is equal to hold time, the designed CDR system performs best in terms
of minimum lock time and peak-to-peak jitter. The lock time reported in this case was
1.7 µs. The peak-to-peak jitter observed in the recovered clock was 0.04 UI for the 3
Gbps input data and 0.037 UI for the 2.5 Gbps input data.
To validate the observations in Simulink, the CDR system was designed at transistor
Virtuoso 6.1.5. The results obtained from Cadence simulations show that when the setup
time is equal to the hold time, the peak-to-peak jitter observed in the recovered clock was
0.016 UI, which is lower than that observed in the other two cases (when the setup
time is greater than the hold time and when the hold time is greater than the setup time).
Thus, the calibration of the DFF using a metastable circuit improves the lock time and peak-to-peak
Future work involves designing a charge pump and a voltage controlled oscillator at
transistor level using 45 nm technology as per the Verilog A model in Cadence Virtuoso.
Final work will be the fabrication of the designed CDR system onto a chip.
References
[1] ITRS, "International Technology Roadmap for Semiconductors 2007 Edition:
Assembly and Packaging," International Technology Roadmap for
Semiconductors (ITRS), http://www.itrs.net, 2007.
[2] B. Razavi, Design of Integrated Circuits for Optical Communication, 1st Ed., New
York: McGraw-Hill, 2003, Ch. 7-9, pp. 213-329.
[3] B. Razavi, Design of Analog CMOS Integrated Circuits, Int. Ed., Beijing, P.R.
China: Tsinghua University Press, 2001.
[4] D. J. Rennie, "Analysis and Design of Robust Multi-Gb/s Clock and Data
Recovery Circuits," Ph.D. dissertation, Dept. Elect. and Comp. Eng., Univ.
Waterloo, ON, 2007.
[6] L. Dai and R. Harjani, "Design of low-phase-noise CMOS ring oscillator,"
IEEE Transactions on Circuits and Systems II: Analog and Digital Signal
Processing, vol. 49, no. 5, pp. 328-338, May 2002.
[10] J. M. Rabaey, A. Chandrakasan, and B. Nikolic, Digital Integrated Circuits, 2nd Ed.,
New Jersey: Prentice Hall, 2003, pp. 332-336.
[14] Kundert, K.S. Jri Lee, and B Razavi, "Designing Bang-Bang PLLs for Clock and
Data Recovery in Serial Data Transmission Systems," Solid-State Circuits, IEEE
Journal of, vol. 39, no. 9, pp. 1571-1580, Sept. 2004.
Appendix
`include "constants.vams"
`include "disciplines.vams"
input clkp, clkn;
output outx, outb;
voltage clkp, clkn, outx, outb;
parameter integer bit_num = 8 from [2:32];
parameter integer seed = 1 from [1:inf];
analog begin
@(initial_step) begin
case (1)
(bit_num == 2): begin a1=0; a2= 1; a3= 0; a4= 0; end
// 2 [0,1]
(bit_num == 3): begin a1=0; a2= 2; a3= 0; a4= 0; end
// 3 [0,2]
(bit_num == 4): begin a1=0; a2= 3; a3= 0; a4= 0; end
// 4 [0,3]
(bit_num == 5): begin a1=1; a2= 4; a3= 0; a4= 0; end
// 5 [1,4]
(bit_num == 6): begin a1=0; a2= 5; a3= 0; a4= 0; end
// 6 [0,5]
(bit_num == 7): begin a1=0; a2= 6; a3= 0; a4= 0; end
// 7 [0,6]
(bit_num == 8): begin a1=1; a2= 2; a3= 3; a4= 7; end
// 8 [1,2,3,7]
(bit_num == 9): begin a1=3; a2= 8; a3= 0; a4= 0; end
// 9 [3,8]
(bit_num == 10): begin a1=2; a2= 9; a3= 0; a4= 0; end
//10 [2,9]
(bit_num == 11): begin a1=1; a2=10; a3= 0; a4= 0; end
//11 [1,10]
(bit_num == 12): begin a1=0; a2= 3; a3= 5; a4=11; end
//12 [0,3,5,11]
(bit_num == 13): begin a1=0; a2= 2; a3= 3; a4=12; end
//13 [0,2,3,12]
(bit_num == 14): begin a1=0; a2= 2; a3= 4; a4=13; end
//14 [0,2,4,13]
(bit_num == 15): begin a1=0; a2=14; a3= 0; a4= 0; end
//15 [0,14]
(bit_num == 16): begin a1=1; a2= 2; a3= 4; a4=15; end
//16 [1,2,4,15]
(bit_num == 17): begin a1=2; a2=16; a3= 0; a4= 0; end
//17 [2,16]
(bit_num == 18): begin a1=6; a2=17; a3= 0; a4= 0; end
//18 [6,17]
(bit_num == 19): begin a1=0; a2= 1; a3= 4; a4=18; end
//19 [0,1,4,18]
(bit_num == 20): begin a1=2; a2=19; a3= 0; a4= 0; end
//20 [2,19]
(bit_num == 21): begin a1=1; a2=20; a3= 0; a4= 0; end
//21 [1,20]
(bit_num == 22): begin a1=0; a2=21; a3= 0; a4= 0; end
//22 [0,21]
(bit_num == 23): begin a1=4; a2=22; a3= 0; a4= 0; end
//23 [4,22]
(bit_num == 24): begin a1=0; a2= 2; a3= 3; a4=23; end
//24 [0,2,3,23]
(bit_num == 25): begin a1=7; a2=25; a3= 0; a4= 0; end
//25 [7,25]
(bit_num == 26): begin a1=0; a2= 1; a3= 5; a4=25; end
//26 [0,1,5,25]
(bit_num == 27): begin a1=0; a2= 1; a3= 4; a4=26; end
//27 [0,1,4,26]
(bit_num == 28): begin a1=2; a2=27; a3= 0; a4= 0; end
//28 [2,27]
(bit_num == 29): begin a1=1; a2=28; a3= 0; a4= 0; end
//29 [1,28]
(bit_num == 30): begin a1=0; a2= 3; a3= 5; a4=29; end
//30 [0,3,5,29]
(bit_num == 31): begin a1=2; a2=30; a3= 0; a4= 0; end
//31 [2,30]
(bit_num == 32): begin a1=1; a2= 5; a3= 6; a4=31; end
//32 [1,5,6,31]
default $strobe("Error. Should never get here.");
endcase
mask = pow(2, bit_num) -1;
x = seed;
x = x & mask; //mask the unavailable bit;
end
@(cross(V(clkp, clkn), +1, 1p)) begin
b = ((x>>a1)^(x>>a2)^(x>>a3)^(x>>a4))%2;
x = ((x<<1) & (mask-1)) + b;
end
V(outx) <+ x;
V(outb) <+ b;
end
endmodule
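The shift-register update used in the PRBS listing above can be mirrored in Python to check the sequence length. This is an illustrative sketch, not part of the Verilog-A model: the function names are hypothetical, and only the bit_num = 7 tap entry (a1 = 0, a2 = 6, a3 = 0, a4 = 0) from the case table is exercised.

```python
def prbs_step(x, bit_num, taps):
    """One clock-edge update, mirroring the body of the @(cross(...)) block."""
    a1, a2, a3, a4 = taps
    mask = (1 << bit_num) - 1                    # same as pow(2, bit_num) - 1
    b = ((x >> a1) ^ (x >> a2) ^ (x >> a3) ^ (x >> a4)) % 2   # feedback bit
    x = ((x << 1) & (mask - 1)) + b              # shift left, drop overflow, insert b
    return x, b

def prbs_period(bit_num, taps, seed=1):
    """Count updates until the register returns to its masked seed value."""
    start = seed & ((1 << bit_num) - 1)
    x, steps = start, 0
    while True:
        x, _ = prbs_step(x, bit_num, taps)
        steps += 1
        if x == start:
            return steps

period_prbs7 = prbs_period(7, (0, 6, 0, 0))   # taps copied from the bit_num == 7 entry
```

For the 7-bit entry the register cycles through all 127 non-zero states before repeating, which is the maximal length for a 7-bit register.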
A.1.2 Multiplexer
`include "constants.vams"
`include "disciplines.vams"
module Mux(Va, Vb, S, Vo);
input Va, Vb, S; electrical Va, Vb, S;
output Vo; electrical Vo;
real outv;
analog begin
if (V(S) > 0.5)
outv = V(Va);
else
outv = V(Vb);
V(Vo) <+ transition(outv,0,1f,1f);
end
endmodule
`include "constants.vams"
`include "disciplines.vams"
//electrical subb;
analog begin
subb = V(Up)-V(Dn);
iout = icp*subb;
I(Icp)<+ transition(iout);
end
endmodule
`include "constants.vams"
`include "disciplines.vams"
analog begin
f = f0 + Kvco*V(Vc);
amp = (V(Vdd)-V(Vss))/2;
offset = V(Vss)+amp;
V(Out) <+ amp*sin(2*`M_PI*idtmod(f,0,1))+offset;
end
endmodule
A.1.5 Slicer
`include "constants.vams"
`include "disciplines.vams"
analog begin
end
endmodule
A.2 Transistor Sizes
M1 4 45
M2 10 45
M3 10 45
M4 10 45
M5 15 45
M6 2 45
M7 2 45
M1 12 45
M2 12 45
M3 12 45
M4 12 45
M5 11 45
M6 11 45
M7 11 45
M8 11 45
A.2.3 Inverter
M1 120 45
M2 180 45
M1 2 45
M2 2.5 45
Mn0 0.18 45
Mn1 0.6 45
Mn2 1.2 45
Mn3 0.6 45
Mp0 0.25 45
Mp1 0.86 45
Mp2 1.72 45
Mp3 0.86 45
M3 & M3 ’ 0.18 45
M4 & M4 ’ 0.6 45
M5 & M5 ’ 2 45
M6 & M6 ’ 0.6 45
M7 & M7 ’ 6 45