LogiCORE IP

Digital Pre-Distortion v4.0


DS811 September 21, 2010 Product Specification

Introduction

Pre-distortion negates the non-linear effects of a power amplifier (PA) generated when transmitting a
wide-band signal. Pre-distortion allows a PA to achieve greater efficiency by operating at higher output power
while still maintaining spectral compliance, reducing system capital and operational expenditure.

The solution is targeted for basestations used in third and fourth generation (3G/4G) mobile technologies
and beyond. It is a combination of hardware and embedded software processes that between them realize
pre-distortion correction along with features that make for a fully engineered, practical, robust and
self-contained solution. It is configurable both in feature selection and in usage to support a variety of
clocking and resource requirements.

Features

• Algorithms
    • DPD correction with up to 33 dB of ACLR improvement
    • Pre-distortion correction architecture selection for cost-performance trade-off
    • Dynamics options
    • TDD support with automatic data selection
    • Quadrature modulator correction
    • PA saturation (overdrive) detection
    • Signal capture and analysis
• Physical Configuration Parameters
    • Selection of correction architectures of increasing performance/complexity
    • Selection of polynomial order of 5 or 7
    • Selection of one, two, four or eight transmit antennas
    • Clock to sample rate ratios from one to four
    • Optional quadrature modulation correction
    • Optional hardware acceleration of coefficient estimation
• Interface Options
    • Real IF feedback signal sampled at twice the pre-distortion sample rate with arbitrary IF frequency
      (optimal performance option)
    • Real IF feedback signal sampled at one times the pre-distortion sample rate with arbitrary IF frequency
    • Complex baseband feedback signal sampled at one times the pre-distortion sample rate

LogiCORE IP Facts Table

Core Specifics
  Supported Device Family (1)    Virtex-5, Virtex-6, Spartan-6
  Supported User Interfaces
  Configuration                  See Resource Utilization and Performance
Provided with Core
  Documentation                  Product Specification
  Design Files                   Netlist
  Example Design                 Not Provided
  Test Bench                     Not Provided
  Constraints File               See Using Constraints
  Simulation Model               See Running Simulation
Tested Design Tools
  Design Entry Tools             ISE v12.3 Software
  Simulation                     Mentor Graphics ModelSim v6.5c
  Synthesis Tools                Not Provided
Support
  Provided by Xilinx, Inc.

1. For a complete listing of supported devices, see the release notes for this core.

© Copyright 2010 Xilinx, Inc. XILINX, the Xilinx logo, Virtex, Spartan, ISE and other designated brands included herein are trademarks of Xilinx in the United States
and other countries. All other trademarks are the property of their respective owners.

DS811 September 21, 2010 www.xilinx.com 1



Applications
An easy-to-use software interface allows configuration, single-stepping and continuous automatic operation while
providing access to signal measurements, data, diagnostic and status information.

Usage Overview
This section briefly summarizes a sequence of events for successful incorporation of DPD into a radio unit FPGA.
Later sections provide the necessary detail.

Instantiation
1. The DPD component is added into the user's HDL code with appropriate clocks and interfacing.
2. DPD is placed after CFR in the transmit chain.
3. The design is compiled.
4. A SW environment for reading and writing the host interface is established.

Basic Operational Checks


1. Read the addresses specified in Table 11 from the host interface; the stated default values should be seen.
2. Execute (for example) the RESET_COEFFICIENTS control mode (see Host Interface and SW Control Modes) to
check termination with successful status.

Software Setup and Signal Validation


1. Set up DPD parameters as described in Setting DPD Parameters.
2. Read the DPD monitors detailed in Table 12.
3. Determine whether the values for the transmit and receive powers are as expected.
4. Perform required operations as detailed in Signal Analysis to ensure that the signal inputs conform to the
recommendations in Factors Influencing Expected Correction Performance.

Pre-distortion Operations and Achieving Performance


1. Adjust DPD parameters and external setup with the aid of the single-stepping commands (see Single Stepping),
external measurements, signal analysis operations and interpretation of diagnostics as required.
2. Run the DCL (see Running the DCL) with diagnostic monitoring to experience the full operational capability of
DPD.


Functional Description
Mathematical Foundation
Digital Pre-Distortion (DPD) acts on transmitted data to cancel the distortion in the PA by implementing an inverse
model of the amplifier. In the conceptual view of Figure 1, the pre-distortion function is applied to the sequence of
(digital) transmitted data x(n). It models the non-linearity of the PA.
Two processes are involved: formulation of the model on which the pre-distortion function is based, and estimation
of its parameters from samples of the PA input and output. To separate the linear effect of the PA and the
circuitry that drives it, estimation is based on the aligned PA output y(n). The alignment process matches the
amplitude, delay and phase variations of $y_0(n)$ to z(n). The predistorter is then dedicated to modeling only the
non-linear effects for which it is intended. Alignment and estimation blocks are depicted in Figure 1.

Figure 1: DPD Algorithmic View


Volterra Series
The Volterra Series is a well known expansion for non-linear functions in space and time. Here the “space”
dimension is the (complex) signal value and the discrete form in time is used. It is a suitable starting point for a
pre-distortion function. The Volterra Series is given in Equation 1; {h} are constants.

$$
\begin{aligned}
z(n) ={} & \sum_{i} h_1(i)\, x(n-i) \\
       & + \sum_{i_1}\sum_{i_2} h_2(i_1, i_2)\, x(n-i_1)\, x(n-i_2) \\
       & + \sum_{i_1}\sum_{i_2}\sum_{i_3} h_3(i_1, i_2, i_3)\, x(n-i_1)\, x(n-i_2)\, x(n-i_3) \\
       & + \cdots
\end{aligned}
$$
Equation 1: Volterra Series

Physical Volterra Series
Without loss of generality, the Volterra series can be written as

$$
z(n) = \sum_{q=0}^{Q-1} F_q\, x(n-q)
$$
Equation 2: Nonlinear Moving Average Form of the Volterra Series


In Equation 2, the $F_q$ may be called memory terms. If Equation 2 is to model a power amplifier, it must conform to
the boundary condition that when the signal amplitude |x| is small, the model reduces to a linear time invariant
system (since the PA is linear for small signals). A sufficient condition for this is that the memory terms depend
only on samples of signal magnitude, |x(n - p)| with p <= q.

Volterra Product Selection


There are many possible terms in the Volterra series, and a practical solution involves judicious selection among
them.
Within the physical Volterra series, $F_q$ is in general a function of all samples of |x(n - p)| with p <= q.
Inspection of the physical Volterra series reveals that

$$
F_q = \sum_{\{i\}} a_{\{i\}q}\, f_{\{i\}q}
$$
Equation 3: Series Expansion of the Memory Terms

where the set {i} covers the number of indices required to form the terms. Comparison with Equation 1 shows that
the basis functions $f_{\{i\}q}$ are of the form $|x(n-s)|^a\, |x(n-t)|^b\, |x(n-v)|^c \cdots$. These can be called
Volterra products.
The simplest possible form for $f_{\{i\}q}$ is unity ($a = b = c = \cdots = 0$). In this case the series is an FIR filter.
Xilinx DPD versions 1 and 2 used the well known Memory-Polynomial (MP) model. In this model, {i} is one index
k running from 0 to K - 1 and $f_{kq} = |x(n-q)|^k$. This corresponds to selecting diagonal Volterra series terms. The
memoryless pre-distortion model is a special case of this with Q = 1.
In this DPD version, there are options to use the Memory-Polynomial model, but also to configure for models based
on a more general selection of terms. With these configurations, improved correction performance, particularly for
wideband signals, is observed. The pre-distortion correction architecture is an option for core generation and software
configuration. Four possible architectures, called A, B, C and D, can be selected, having increasing complexity of
diagonal and off-diagonal memory terms. In addition, the polynomial order can be selected to be 5 or 7. The
polynomial order is the maximum value of a, b, c, ... appearing in the Volterra products used.
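
As an illustration of the Memory-Polynomial form described above, the following is a minimal NumPy sketch, not the core's implementation. It evaluates Equation 2 with the memory terms taken as a sum over k of a[k, q] |x(n - q)|^k. The function name `memory_polynomial`, the coefficient layout `a[k, q]` and the zero-fill of samples before the start of the record are assumptions made for this example.

```python
import numpy as np

def memory_polynomial(x, a):
    """Evaluate z(n) = sum_q sum_k a[k, q] * |x(n-q)|**k * x(n-q).

    x : 1-D complex array of input samples.
    a : complex array of shape (K, Q) (hypothetical layout for this sketch).
    Samples before the start of x are treated as zero.
    """
    x = np.asarray(x, dtype=complex)
    a = np.asarray(a, dtype=complex)
    K, Q = a.shape
    z = np.zeros_like(x)
    for q in range(Q):
        # x delayed by q samples, zero-filled at the start.
        xq = np.concatenate([np.zeros(q, dtype=complex), x[:len(x) - q]])
        mag = np.abs(xq)
        for k in range(K):
            z += a[k, q] * mag**k * xq
    return z

# With K = 1 only the unity basis function remains, so the model is an FIR filter.
x = np.array([1 + 1j, 2 - 1j, -0.5j, 0.25])
fir_taps = np.array([[0.5, 0.25 - 0.1j]])   # shape (K=1, Q=2)
z_fir = memory_polynomial(x, fir_taps)
```

With Q = 1 the same function reduces to the memoryless model mentioned in the text.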

Estimation of the Coefficients


The objective of pre-distortion estimation is to choose coefficients $a_{\{i\}q}$ such that the PA output $y_0(n)$ is as
close as possible to x(n).
This is done by capturing a sequence of L samples of the PA input and output and using Equation 2 in a reverse
sense to make a linear equation for the $a_{\{i\}q}$ for each sample. A least-squares solution can then be found for the
$a_{\{i\}q}$.
The least-squares problem for $a_{\{i\}q}$ can be expressed in matrix form as

$$
Z = UA
$$
Equation 4: Parameter Estimation in Matrix Form

where Z is a column vector of the signal samples z(n), A is a column vector of all the $a_{\{i\}q}$ and the rows of U are
the elaboration of all the $(|x(n-s)|^a\, |x(n-t)|^b\, |x(n-v)|^c \cdots)\, x(n-q)$ in the model for each x(n) = y(n), the
samples of the aligned PA output.


Equation 4 can be solved by pre-multiplying each side by $U^H$, the Hermitian transpose of U, to give

$$
VA = W
$$
Equation 5: System to be Solved for the Pre-distortion Coefficients

In Equation 5, $V = U^H U$ and $W = U^H Z$. It is a linear system whose solution is the best least-squares estimate
for $a_{\{i\}q}$ over the sample length L.
An entirely new set of coefficients can be obtained from each new data capture. Alternatively, the coefficients can be
iterated with the Damped-Newton method.
The solution to VA = W can be expressed as A = V \ W. Within this notation, the Damped-Newton method iterates
A according to $A_{n+1} = A_n + \mu\,(V \backslash W_E)$, where $W_E = U^H E$ and $E = Z - U A_n$. It iteratively acts
to minimize an error vector E, which is the difference between the transmitted samples and the predicted
transmission based on the inverse model over the receive samples. The damping factor μ is an adjustable parameter.
This is a general mathematical method which, when applied to DPD, improves immunity to noise and distributes
noise that is non-uniform in time over many updates, thus making the typical instantaneous behavior equal to the
mean behavior.
The Damped-Newton iteration is used only when the power is stable, as it cannot react to fast dynamics.
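
The estimation steps above can be sketched in NumPy as follows. This is an illustrative sketch only: it assumes the Memory-Polynomial basis for the columns of U, and the function names (`basis_matrix`, `solve_normal_equations`, `damped_newton_step`) and the synthetic data are invented for the example rather than taken from the core.

```python
import numpy as np

def basis_matrix(y, K, Q):
    """Build U with columns |y(n-q)|**k * y(n-q) (memory-polynomial basis, assumed)."""
    y = np.asarray(y, dtype=complex)
    cols = []
    for q in range(Q):
        yq = np.concatenate([np.zeros(q, dtype=complex), y[:len(y) - q]])
        for k in range(K):
            cols.append(np.abs(yq)**k * yq)
    return np.column_stack(cols)

def solve_normal_equations(U, Z):
    """Solve V A = W with V = U^H U and W = U^H Z (Equation 5)."""
    V = U.conj().T @ U
    W = U.conj().T @ Z
    return np.linalg.solve(V, W)

def damped_newton_step(A, U, Z, mu):
    """One update A_{n+1} = A_n + mu * (V \\ W_E), with W_E = U^H (Z - U A_n)."""
    E = Z - U @ A
    V = U.conj().T @ U
    W_E = U.conj().T @ E
    return A + mu * np.linalg.solve(V, W_E)

# Synthetic check: generate Z from known coefficients, then recover them.
rng = np.random.default_rng(0)
y = rng.standard_normal(400) + 1j * rng.standard_normal(400)   # stand-in for aligned PA output
U = basis_matrix(y, K=3, Q=2)
A_true = rng.standard_normal(6) + 1j * rng.standard_normal(6)
Z = U @ A_true

A_direct = solve_normal_equations(U, Z)

A_iter = np.zeros(6, dtype=complex)
for _ in range(40):
    A_iter = damped_newton_step(A_iter, U, Z, mu=0.5)
```

On noise-free data with a fixed capture, each damped step shrinks the remaining coefficient error by a factor (1 - μ), so μ = 1 reproduces the direct solution in a single step, while a smaller μ averages over successive captures when each step uses fresh data.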

System Features
Hardware Description
[Block diagram. Transmit path: IQ BB data, DUC, CFR, DPD data path, DAC, RF upconverter, PA (digital, then analog).
Observation path: RF downconverter, ADC. Also shown: MicroBlaze processor subsystem, host interface, DPD control.]
Figure 2: Xilinx DPD HW Block View

Figure 2 shows that DPD is placed after CFR in the transmit signal chain. DPD operates at the DPD sample rate fs.
The selection of fs is discussed in Factors Influencing Expected Correction Performance. The signal is converted into
the analog domain by the DAC component. There may be further interpolation stages between DPD and the DAC.
Moreover, there may be digital mixing for a single-DAC superheterodyne transmitter or IQ DACs for a direct
conversion transmitter. Whatever choices are made, based on system-level considerations, the net result is that the IQ
data from DPD eventually appears as modulation of an RF carrier wave at the PA. To estimate pre-distortion
coefficients, a sample of the PA output is fed back via the observation path and must finally be presented to the
estimator as IQ samples at fs. For optimal sampling bandwidth, either direct RF downconversion followed by an IQ ADC
pair or a heterodyne mixing to fs/4 and an ADC sampling at 2fs should be used. Sub-optimal sampling bandwidth
can also lead to good pre-distortion performance, depending on the signal. DPD supports a single ADC at fs and
variable feedback IF frequency. For single ADC architectures, digital downconversion is required, and this is
performed on the data prior to the estimation processing.
Within DPD, the data path contains resources to deal with the real-time processing required for up to eight
individual antennas. A single MicroBlaze™ processor sub-system performs estimation and supporting algorithms; it is
shared between the antennas. Details of the system are given in the following sections.
The host interface is a shared-memory data and message-passing subsystem.

HW-SW Co-design
Xilinx DPD is a combination of HW and SW processes that between them realize the PA distortion inverse model
and the estimation algorithm, as described in the Mathematical Foundation section, along with features that make
for a fully engineered, practical, robust and self-contained solution.
Figure 3 depicts the main elements of the DPD solution. The HW processes are contained within the data path and
the SW processes are run in the MicroBlaze processor code. There are also Quadrature Modulator Correction
(QMC) and Overdrive Detection (ODD) SW processes not indicated in the diagram.
[Block diagram. Data path: Tx data from DUC/CFR, predistortion function and QMC, DAC; ADC input from the observation path.
HW processes: Measurements, Parameter storage and HW mapping, Capture RAM.
SW processes: Control interface; Control shell, debug, monitor; DCL (Dynamic Control Layer); ECF (Estimation Core
Function); SCA (Sample Capture Acceptance).]

Figure 3: HW Data Path and Major SW Processes

Capture RAM and Estimation Core Function (ECF)


The capture RAM collects L complex samples of the transmitted data and the appropriate samples from the
observation path ADC, depending on the receiver configuration.
The ECF performs digital downconversion and then alignment and coefficient estimation as outlined in
Mathematical Foundation. The coefficients are mapped into an efficient structure that avoids direct computation of the
selected Volterra products in hardware.
The ECF performs checks on the signals and reports any adverse status (see Host Interface and SW Control Modes)
if there is a problem, in which case the pre-distortion function parameters are not updated.


Measurements Block and Sample Capture Acceptance (SCA)


The capture RAM is able to capture a maximum of typically tens of microseconds of data. To obtain optimum
estimation results, the capture needs to contain data that is representative of the signal over the lifetime of the
coefficients, a time much longer than the capture length. In particular, if the capture is taken during a period of time in
which the signal amplitude is small, for example during the low period of a data pulse, the estimation will be poor
for higher amplitude signals that might occur over a longer timescale. This is similar in principle to fitting a
polynomial curve to points in a scatter diagram. The function away from the region where there are data points will be
an extrapolation yielding unpredictable behavior.
In some cases there is a frame structure with a predictably good period that can be used for estimation. For this case,
DPD can be configured to allow the capture to be triggered by an external sync pulse, with a programmable delay.
Otherwise, the SCA feature can be enabled.
SCA takes captures at random and applies acceptance criteria to the samples in the capture RAM to ensure that the
samples are representative of the full signal. The acceptance criteria are based on the average power and statistical
measures (histograms) of the transmitted data in the capture buffers, and their comparison to the average power
and statistics of the transmitted signal over a defined measurement interval (typically greater than 10 ms). The
Measurements block depicted in Figure 3 provides the real-time processing of the signals required to form the powers
and histogram. This data is also available to the user via the control interface to aid system debug and monitoring.
When SCA is running, TDD signals are automatically catered for: a capture taken in the uplink period will simply
fail and only data transmitted in the downlink period will ever be used.
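
The following sketch illustrates the kind of comparison described above: a capture is accepted only if its average power and amplitude histogram are close to those of a longer reference interval. The actual SCA criteria and thresholds are not given in this document, so the function names, the total-variation histogram distance and the tolerance values (`power_tol_db`, `hist_tol`) are all assumptions chosen for illustration.

```python
import numpy as np

def capture_statistics(samples, bin_edges):
    """Return (mean power, normalized amplitude histogram) of complex samples."""
    samples = np.asarray(samples, dtype=complex)
    mean_power = float(np.mean(np.abs(samples) ** 2))
    counts, _ = np.histogram(np.abs(samples), bins=bin_edges)
    hist = counts / max(counts.sum(), 1)
    return mean_power, hist

def accept_capture(capture, reference, bin_edges, power_tol_db=1.0, hist_tol=0.1):
    """Hypothetical acceptance test: compare a short capture with a long reference.

    power_tol_db and hist_tol are illustrative thresholds, not values from the core.
    """
    cap_power, cap_hist = capture_statistics(capture, bin_edges)
    ref_power, ref_hist = capture_statistics(reference, bin_edges)
    if cap_power <= 0.0 or ref_power <= 0.0:
        return False                      # e.g. a capture taken while nothing is transmitted
    power_diff_db = abs(10.0 * np.log10(cap_power / ref_power))
    hist_distance = 0.5 * float(np.sum(np.abs(cap_hist - ref_hist)))
    return bool(power_diff_db <= power_tol_db and hist_distance <= hist_tol)

# Example with synthetic data standing in for the transmitted signal.
rng = np.random.default_rng(1)
reference = rng.standard_normal(5000) + 1j * rng.standard_normal(5000)
edges = np.linspace(0.0, 6.0, 17)

same_signal = accept_capture(reference, reference, edges)        # representative capture
low_period = accept_capture(0.1 * reference, reference, edges)   # low-amplitude capture
silent = accept_capture(np.zeros(100), reference, edges)         # e.g. TDD uplink period
```

In this sketch the low-amplitude and silent captures are rejected on the power check alone, which mirrors the TDD behavior described above.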

Dynamic Control Layer (DCL)


DCL is the mechanism by which DPD adapts to power dynamics encountered in a cell due to call load,
reconfiguration or other factors.
DPD must take into account how the correction performance of a real PA depends on the output power dynamic
after some parameters are estimated. If the power steps up after estimation, the correction will typically get worse.
It may also get worse when the power steps down. Therefore it is not sufficient simply to update the coefficients
repeatedly, even if done rapidly, because every time there is a step-up, the correction will be poor until
re-estimation. Repeated step-up events will inevitably cause a high integrated out-of-band emission, whatever the
estimation rate.
The Xilinx DPD DCL operates by retaining a memory of the behavior of the PA in various signal conditions held in
one or more stored parameter sets. The parameters applied at any one time are determined by criteria that depend
on the momentary signal condition in relation to the signal conditions associated with the stored parameter sets.
The decision process for changing the applied parameters uses the power measurement block in the data path and
happens at the rate set by the METER LENGTH parameter. A typical interval is 10 ms. Single-set and multiple-set
modes are available. The latter should be selected for PAs where the pre-distortion correction gets worse when the
power steps down. Specifically, the user should test by running the DCL in single-set mode whilst transmitting
initially at maximum power. If the spectral performance becomes unacceptable when the power is backed off,
multiple-set mode can be used. There are configuration parameters for the multiple-set mode that are not described in
this data sheet. If the default operation is still not within bounds, Xilinx Support should be contacted for help with more
detailed settings.

Multipath Handling
The estimation process can update only one path at a time. The multipath DCL attends to each path in turn and
estimation is performed only if the power for the port being examined satisfies the criteria for coefficient update (the
same criteria as with the single-port design).
In the limiting case where only one path ever satisfies the criteria for coefficient update, that path will be
continuously re-estimated. This also means that if one path fails, DPD will operate correctly on the others.

Quadrature Modulator Correction (QMC)


Quadrature modulation is a well known and widely deployed technique used to conserve bandwidth for a given
data rate. This is accomplished by modulating two orthogonal data streams onto a common carrier. A highly
attractive method for modulating the carriers is through the use of an analog direct quadrature modulator. These
modulators can be wideband and require relatively slow speed DACs for a given bandwidth. The downside to this
technique is the analog imperfections present in the modulator. If the phases and amplitudes of both branches of the
modulator are perfectly matched, then one of the sidebands is completely cancelled out. Likewise, if there is no DC
bias feed-through, then the carrier itself is completely cancelled out. However, in practice there will always be some
amount of gain and phase imbalance along with a DC bias in analog quadrature modulators, and this can induce
images and spurs in the transmitted spectrum if these fall outside of the occupied band. DPD can be configured to
automatically correct for quadrature imbalance and DC offset in the transmit path, provided that the observation
path does not contain such errors.
An appropriate circuit is included after the pre-distortion function when this option is selected in the core
configuration parameters.
When the QMC SW function is enabled, its parameters are estimated iteratively from sample captures that are
interleaved with pre-distortion correction estimation.
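
To illustrate why gain, phase and DC errors matter, the sketch below applies a simple baseband-equivalent impairment model to a single-sideband tone and inspects the spectrum. The model, the function names and the impairment values (10% gain error, 0.05 rad phase error, 0.02 DC offset) are assumptions for this example; they do not describe the correction circuit in the core.

```python
import numpy as np

def quadrature_modulator(z, gain=1.0, phase_rad=0.0, dc=0.0):
    """Baseband-equivalent model of an imperfect analog quadrature modulator.

    gain and phase_rad describe the Q branch relative to the I branch;
    dc is an offset added to the output. All values are illustrative.
    """
    z = np.asarray(z, dtype=complex)
    i = z.real
    q = gain * (z.imag * np.cos(phase_rad) + z.real * np.sin(phase_rad))
    return i + 1j * q + dc

def tone_spectrum(z):
    """FFT magnitudes normalized by the number of samples."""
    return np.abs(np.fft.fft(z)) / len(z)

N, k = 64, 5
tone = np.exp(2j * np.pi * k * np.arange(N) / N)   # single sideband at bin +k

ideal = tone_spectrum(quadrature_modulator(tone))
impaired = tone_spectrum(quadrature_modulator(tone, gain=1.1, phase_rad=0.05, dc=0.02))

# Bin N-k holds the image (unwanted sideband); bin 0 holds the carrier leakage.
print("ideal    image, carrier:", ideal[N - k], ideal[0])
print("impaired image, carrier:", impaired[N - k], impaired[0])
```

With matched branches and no offset, the image and carrier bins are empty; with the impairments, both appear, which is the effect the QMC option is intended to remove.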

Overdrive Detection (ODD)


Setting the maximum drive level for an RF power amplifier under operational conditions can be a non-trivial task.
The use of digital pre-distortion allows the PA to be driven such that the peaks of the predistorted waveform can be
driven very close to the saturation point of the PA. However, any form of pre-distortion will begin to fail once the
peaks of the waveforms enter the saturating region of the amplifier.
DPD contains a function that dynamically detects when the signal is being driven too hard. The method is
predictive, which means that an overdrive condition can be detected before it actually occurs. The method produces a
“soft” overdrive metric so system performance can be tuned to meet the needs of a specific installation.
After each pre-distortion coefficient estimation, ODD computes an over-drive metric. The over-drive metric is
compared to an over-drive threshold (which is a user parameter), and the result is used to report an over-drive status for
that estimation, which can be monitored. There is also an option to prevent the coefficients from being used
whenever overdrive is detected.
Figure 4 shows an example of the over-drive metric while sweeping the RF drive level along with the
corresponding reduction in ACLR achieved by DPD while transmitting a WCDMA waveform. In the region where
the normalized RF drive level is below -20 dB, there is very little distortion from the amplifier, hence no
improvement from the use of DPD. As the RF drive level is increased, the amount of distortion created by the PA
increases. DPD is able to reduce the distortion (by around 22 dB in this case) until the PA begins to be overdriven.
Distortion caused by overdrive cannot be compensated by any form of pre-distortion due to the saturating nature
of the amplifier. The over-drive metric indicates that overdrive is beginning to occur when the normalized RF drive
level reaches around 0 dB. Increasing the RF drive level above this point by just a small amount has a drastic effect
on the ability of the DPD to correct the induced distortion (this is shown in red in Figure 4).


Figure 4: ODD Example Characterization


Core Instantiation and Configuration


DPD GUI Parameters
A screen shot of the core configuration window is shown in Figure 5. When the mouse hovers over each parameter,
tool tips appear with a brief description, as well as feedback about how their values or ranges are affected by other
parameter selections.

Figure 5: DPD Core GUI

Component Name
Enter the name of the core component to be instantiated. The name must begin with a letter and be composed of the
following characters: a to z, A to Z, 0 to 9, and '_'.

Number of Antennas (TX)


One, two, four or eight transmitter antenna paths can be included in an instance of this IP core. Only one, two and
four transmit antenna paths are selectable with Spartan-6 devices. Each core instance includes only one MicroBlaze
processor. Multiple instances may also be used for improved dynamics performance.

Clocks Per Sample (CLOCKS_PER_SAMPLE)


This core offers the user the option to select different hardware folding options by setting this parameter to 1,
2, 3 or 4. When the parameter is set to 1, the entire design runs at the same rate as the input sample rate. This permits
very high sample rates to be achieved, up to 300 Msps on Virtex-6, capable of handling 60 MHz of signal
bandwidth. When the parameter is set to 2, 3 or 4, various portions of the design take advantage of hardware
folding to offer a resource saving in the hardware implementation. The Resource Utilization section provides a
subset of the resource utilization data for different parameter combinations of the core. Similarly, the IP Timing
Performance section provides some timing performance data. The spreadsheet included in the doc directory of the zip
file contains data for all parameter combinations.

Performance Architecture (ARCH)


The architecture configuration of the IP is chosen to suit performance requirements and selected device capacity.
Architecture A is the memory-polynomial model (see Volterra Product Selection). Architectures B, C and D
typically offer distortion correction performance improvement (but at increasing resource cost), dependent upon
the radio platform and signal characteristics. Users may want to set this to D during a performance evaluation
phase if they can fit the design in their chosen devices. Users can then evaluate the performance of the smaller
architectures A, B and C using software controls, as described in the Operations Guide. For example, if the
performance of architecture C is satisfactory, resources can be saved by regenerating the core with option C.

Polynomial Order (POLY_ORDER)


This parameter sets the maximum polynomial order, as defined in the Functional Description. When set to 7 in the
GUI, either 7 or 5 can be selected through the software interface, as described in the Operations Guide.
The primary effect on resources of setting POLY_ORDER to 7 is to increase the number of BRAM memories.

Quadrature Modulation Correction (QMC)


If the quadrature modulation correction capability is not required, resources can be saved by setting this parameter
to false.

Hardware Acceleration (HWA)


When this checkbox is selected, it enables hardware acceleration logic within the core to accelerate the coefficient
computation. This consumes additional resources and should be disabled if the increased speed is not required. When
this parameter is set to true, a separate input clock accel_clk is available on the netlist. The IP Timing
Performance section provides timing data for this clock.


Input/Output Ports
Figure 6 displays the signal names; Table 1 defines these signals.
[Symbol diagram: inputs on the left; outputs (ceN_out, dout_i, dout_q, srx_path_sel, host_interface_dout) on the
right. See Table 1 for the signal definitions.]

Figure 6: DPD IP Symbol (Input/Output Ports)

Table 1: Top-Level I/Os for the DPD IP Core


Port Names           Direction  Width   Notes
clk                  In         1       Input clock to data path.
rst                  In         1       Active high reset to data path, internally synchronized to clk/ceN_out.
proc_clk             In         1       Input clock to MicroBlaze processor system.
proc_rst             In         1       Active high reset to MicroBlaze processor system, internally synchronized to proc_clk.
accel_clk            In         1       Optional clock input port for hardware acceleration unit. Only available when this feature is enabled.
ce_clr               In         1       Clock enable clear signal, typically driven by inverted DCM/PLL lock signal.
ceN_out              Out        1       Clock enable, active high once every N cycles where N is CLOCKS_PER_SAMPLE (CPS).
din_i                In         16*TX   16-bit real input per transmit path. Least significant 16 bits correspond to transmit path 0, followed by 16 bits for each additional path present.
din_q                In         16*TX   16-bit imaginary input per transmit path. Least significant 16 bits correspond to transmit path 0, followed by 16 bits for each additional path present.
dout_i               Out        16*TX   16-bit real output per transmit path. Least significant 16 bits correspond to transmit path 0, followed by 16 bits for each additional path present.
dout_q               Out        16*TX   16-bit imaginary output per transmit path. Least significant 16 bits correspond to transmit path 0, followed by 16 bits for each additional path present.
capture_sync         In         1       Signal to optionally synchronize pre-distortion capture.
srx_din0             In         16      Mode-dependent 16-bit receive data path input.
srx_din1             In         16      Mode-dependent 16-bit receive data path input.
srx_path_sel         Out        1/2/3   Control for receive path selection in multiple-path designs; bit width is 1, 1, 2 or 3 when TX is set to 1, 2, 4 or 8 respectively.
gp_input             In         32      Reserved. Connect to any constant value, for example x"9876FACE".
gp_output            Out        32      Reserved.
host_interface_clk   In         1       Host interface RAM access clock.
host_interface_we    In         1       Host interface RAM active high write enable.
host_interface_din   In         32      Host interface RAM input data.
host_interface_addr  In         9       Host interface RAM address.
host_interface_dout  Out        32      Host interface RAM output data.
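
The bus layout in Table 1 (path 0 in the least significant 16 bits, 16 further bits per additional path) can be modeled as below, for example when preparing test vectors. This is a sketch only: the helper names are hypothetical, and treating each 16-bit sample as a signed two's-complement value is an assumption, since the table states only the width.

```python
def pack_samples(samples):
    """Pack one 16-bit sample per transmit path into a single bus word.

    samples[0] (transmit path 0) occupies the least significant 16 bits.
    Samples are assumed to be signed two's-complement (an assumption).
    """
    word = 0
    for path, s in enumerate(samples):
        if not -32768 <= s <= 32767:
            raise ValueError("sample out of 16-bit range")
        word |= (s & 0xFFFF) << (16 * path)
    return word

def unpack_samples(word, tx):
    """Split a 16*tx-bit bus word back into tx signed 16-bit samples."""
    out = []
    for path in range(tx):
        raw = (word >> (16 * path)) & 0xFFFF
        out.append(raw - 0x10000 if raw & 0x8000 else raw)
    return out

def srx_path_sel_width(tx):
    """Bit width of srx_path_sel for TX = 1, 2, 4 or 8, as listed in Table 1."""
    return {1: 1, 2: 1, 4: 2, 8: 3}[tx]

# Example: four transmit paths on a 64-bit din_i bus.
bus_word = pack_samples([100, -200, 300, -32768])
recovered = unpack_samples(bus_word, 4)
```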

Interfaces
The DPD IP core requires various interfaces to user logic for successful operation of the design. In this section, the
clock, reset, data path, and host access interfaces are explained in detail.

Clock Interface
This core requires up to four clock input signals.

clk
Signal clk is the primary clock used to drive all the data path logic within the design. Its clock frequency will be
CLOCKS_PER_SAMPLE times the input data rate of the DPD IP core. The input data rate will also be the sample
rate at the output of DPD.

ce_clr and ceN_out


The ce_clr signal allows the user to synchronize the clock enable generation logic to a stable phase
of the clk signal. When the clk signal is generated using a DCM or PLL, the locked output of that
component can be inverted and connected to the ce_clr port. While ce_clr is asserted, ceN_out is held in reset and
does not toggle. Once ce_clr is released, ceN_out starts toggling again. ceN_out is synchronous to the clk
signal and is asserted for a single clock cycle every CLOCKS_PER_SAMPLE cycles, providing a write/read
strobe for the input and output data samples. It is recommended that this signal be used for logic that is synchronous
to the DPD IP I/Os.
The MAX_FANOUT attribute is applied to the internal version of this signal, and the value of
the attribute is set to REDUCE. This instructs the map tool to employ physical synthesis rules to reduce the fan-out
as required by the timing constraints. To assist in this, map should be run with the -register_duplication
switch turned on. When the DPD design is incorporated with user logic, forcing the map tool to ignore hierarchy
helps control the ceN_out fan-out, so the user does not have to manage it.
Where clock-domain crossing is required, ceN_out can be used for FIFO read/write enables. Any cross-clock-domain
signal that interacts with the logic in the proc_clk domain does so through dual-port block RAMs or dual-level hold
registers; where necessary, handshaking with software is employed to ensure data stability during the period of
interest for the software and hardware system.
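
The relationship between ce_clr and ceN_out can be illustrated with a small behavioural model. This is only a sketch: the function name is hypothetical, and the phase of the first strobe after ce_clr is released is an assumption that the text above does not specify.

```python
def ce_n_out(ce_clr_trace, clocks_per_sample):
    """Model ceN_out as a one-cycle strobe every clocks_per_sample clk cycles.

    ce_clr_trace holds one 0/1 value per rising edge of clk.
    Assumption (not from the datasheet): the first strobe appears
    clocks_per_sample cycles after ce_clr is released.
    """
    count = 0
    out = []
    for ce_clr in ce_clr_trace:
        if ce_clr:
            count = 0  # generator held in reset: no toggling
            out.append(0)
            continue
        count += 1
        if count == clocks_per_sample:
            out.append(1)  # single-cycle read/write strobe
            count = 0
        else:
            out.append(0)
    return out


trace = ce_n_out([1, 1, 0, 0, 0, 0, 0, 0, 0, 0], clocks_per_sample=4)
print(trace)
```

With CLOCKS_PER_SAMPLE = 1 the model asserts the strobe on every cycle after release, which matches the case where clk runs at the sample rate.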


proc_clk
Signal proc_clk is the processor clock used to drive all logic connected to the MicroBlaze subsystem. This clock
can be generated such that it is synchronous to the clk signal; however, it can also be asynchronous, since the
design has all required handshaking to ensure valid data transfer between domains.

accel_clk
Signal accel_clk is available when HWA is enabled. There is no particular minimum frequency for this clock;
however, the higher the clock frequency, the faster the computations are performed. Running at the same rate
as proc_clk should give acceptable acceleration. The effect of acceleration is quantified in the SW Features
Timing Performance section. The IP Timing Performance section discusses typical maximum speeds for this clock.

host_interface_clk
The signal host_interface_clk is an additional clock input that is used to access the host interface memory for
passing instructions to, and reading status from the DPD IP MicroBlaze processor.
The clk, proc_clk and accel_clk signals should be on the global clock network of the device, driven by global
buffers (BUFGCTRL). The host_interface_clk has only one load which is a 512x36K block RAM. Hence the DPD
IP does not require this clock to be on the global network; however, it is expected that this interface will be accessed
by the user's on-board processor or on-FPGA processor, so it makes sense to drive this clock signal from a global
buffer that is created within the user logic.
These clocks can be generated using a combination of DCMs, PLLs and BUFGs, but care has to be taken to ensure
that jitter specifications are used while constraining the clock signals. The actual frequency at which these clocks
can run eventually depends upon other user logic, placement and timing constraints. For stand-alone
characterization of typical clock frequency at which clk, proc_clk and accel_clk in DPD IP can run, see IP
Timing Performance.
The IP core does not contain any global clock resources such as global buffers (BUFGs) and input global buffers
(IBUFG/IBUFGDS). When the user instantiates the design, these clocks should be driven from a global clock buffer
(any of the variants of BUFGCTRL) external to the IP core. If the user chooses to let the synthesis tool infer the global
buffers, care should be taken to ensure that each of the clock inputs has the expected clock buffer. The synthesis
tool may not recognize that these ports, which are connected to the black box DPD netlists, are in fact clock input
ports. In such a case, cascaded BUFGs can be created. When a BUFG drives another BUFG using local routing
instead of global clock routing resources, excessive clock skew can occur. This can cause various setup and hold
violations in static timing analysis.
It is recommended that only stable clocks be connected to the IP when using a DCM or PLL to derive the clocks. In
that case, a BUFGCE (or BUFGMUX) can be used, and the locked signal can be used to derive the enable signal
connected to the BUFGCE CE port.

Reset Interface
The two reset signals, rst and proc_rst, are active high. In the design, rst is registered using clk and
ceN_out, while proc_rst is registered using proc_clk; each reset is then applied to internal logic within
its own synchronized domain only. These registers help with the routing of the reset signals throughout the design. The
synchronizing registers also help keep the fan-out management local to the design. The MAX_FANOUT attribute is
applied to the internal version of the reset signals, and the value of the attribute is set to REDUCE. This instructs the
map tool to employ physical synthesis rules to reduce their fan-out as required by the timing
constraints. To assist in this, map should be run with -register_duplication turned on. The resets can be wired
to a register that appears in the memory map interface of the on-board processor, thus facilitating a software reset
feature. This usage, although not mandatory, is recommended.


It is recommended that the user connect some form of reset to PLLs and DCMs generating the DPD design clock to
ensure that there is a known startup sequence, and ensure that the DCMs and PLLs have locked before
de-activating the rst and proc_rst signals. It is recommended that the user also apply a new logical reset pulse
to both these signals once the clocks have stabilized.

Data Path Interface


The complex input pair, din_i and din_q, each comprise a 16-bit bus for every transmit path present. Bits[15:0]
correspond to path 0, Bits[31:16] to path 1, and so on. The user may choose to connect wider or narrower signals
to these input ports based on an understanding of the signal statistics and dynamics. If the input bit-width is
narrower than 16 bits, it should be zero-padded at the LSB; if the input bit-width is wider, it should be symmetrically
rounded before connecting to the din_i and din_q signals for that particular transmit path. The complex output
pair (dout_i and dout_q) contains a corresponding 16-bit bus for each transmit path. The user can connect them to
a digital mixer or to a complex DAC. If the TX DAC has a lower bit-width (14 bits, for example), symmetric rounding
should be applied to convert the signals from 16 to 14 bits. Symmetric rounding is required to prevent any
unwanted DC offset, which would introduce a spur in the transmitted spectrum in cases where the signal itself has no
DC energy, for example WCDMA carrier configurations [0 0 1] and [1 0 1].
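
The bit-width adaptation described above can be sketched as follows. The function names are hypothetical, and the tie-breaking rule (round half away from zero) is one common definition of symmetric rounding that is assumed here; the datasheet does not spell out the exact rule.

```python
def pad_to_16(sample, width):
    """Zero-pad a signed sample of `width` bits (width < 16) at the LSB end."""
    return sample << (16 - width)


def round_symmetric(sample, drop_bits, out_width):
    """Drop `drop_bits` LSBs, rounding ties away from zero, then saturate.

    Rounding the magnitude and restoring the sign treats positive and
    negative values identically, so no DC bias is introduced.
    """
    half = 1 << (drop_bits - 1)
    magnitude = (abs(sample) + half) >> drop_bits
    result = magnitude if sample >= 0 else -magnitude
    lo = -(1 << (out_width - 1))
    hi = (1 << (out_width - 1)) - 1
    return max(lo, min(hi, result))


# 12-bit source feeding the 16-bit din_i/din_q ports
print(pad_to_16(-3, 12))
# 16-bit dout_i/dout_q feeding a 14-bit DAC
print(round_symmetric(6, 2, 14), round_symmetric(-6, 2, 14))
```

By contrast, simply discarding the two LSBs of a two's-complement value always rounds toward negative infinity, which is the kind of systematic offset the text warns about.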
There is one additional control signal, ceN_out, that should be used in conjunction with the din_i and din_q
input signals. The signal ceN_out is a clock enable signal that is high once every N cycles, where N is the
CLOCKS_PER_SAMPLE parameter value, thus matching the input data rate. This signal should be used as a clock
enable in the portions of user logic that create the input data, so that the
din_i/q signals are generated as shown in Figure 7. The figure shows that ceN_out is used as a clock enable to
generate the din_i/q signals; hence, the din_i and din_q values change when a rising edge of clk occurs while
ceN_out is high.
The Using Constraints section discusses various timing constraints that help achieve static timing closure quickly.
These constraints cover multi-cycle path and cross-clock domain constraints.
Figure 7 shows the timing diagram of the din/dout signals in relation to ceN_out.

Figure 7: Data Path Interface Timing Diagram

Capture Synchronization
In capture mode 1 (see Setting DPD Parameters section) the capture is triggered when the input signal
capture_sync is held high during an active rising edge of clk as qualified by ceN_out.

Latency
If the system integration requires an understanding of the latency through the core, refer to the end-to-end latency
in data samples or ceN_out cycles for various parameter combinations, listed in Table 2.


Table 2: Latency per Transmit Antenna Path with QMC=FALSE (independent of POLY_ORDER and HWA)

Performance          CLOCKS_PER_SAMPLE
Architecture       1      2      3      4
A                 23     22     19     19
B                 24     23     20     20
C                 25     24     21     21
D                 25     24     22     21

When QMC is enabled, latency per path increases by 4 data samples or ceN_out cycles.
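
For system-level delay budgeting, the table and the QMC adjustment can be captured in a small lookup. This is a sketch with hypothetical function names; the values are transcribed from Table 2, and the conversion to clk cycles relies only on one ceN_out cycle spanning CLOCKS_PER_SAMPLE clk cycles.

```python
# End-to-end latency in data samples (ceN_out cycles), QMC=FALSE, from Table 2.
# Tuple index 0..3 corresponds to CLOCKS_PER_SAMPLE = 1..4.
LATENCY = {
    "A": (23, 22, 19, 19),
    "B": (24, 23, 20, 20),
    "C": (25, 24, 21, 21),
    "D": (25, 24, 22, 21),
}
QMC_EXTRA_SAMPLES = 4  # added per path when QMC is enabled


def latency_samples(arch, clocks_per_sample, qmc=False):
    """Latency through the core in data samples for one transmit path."""
    base = LATENCY[arch][clocks_per_sample - 1]
    return base + (QMC_EXTRA_SAMPLES if qmc else 0)


def latency_clk_cycles(arch, clocks_per_sample, qmc=False):
    """The same latency expressed in clk cycles."""
    return latency_samples(arch, clocks_per_sample, qmc) * clocks_per_sample


print(latency_samples("D", 3, qmc=True))
print(latency_clk_cycles("B", 2))
```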

Sample Receiver (SRx) Interface


There are two 16-bit input data ports, srx_din0 and srx_din1. These signals should be synchronized by user
logic to the clk and ceN_out clock/clock enable signals. The two ports allow the user to provide received (feedback)
data back into the core; the data should be presented in one of three formats determined by the RXINPUTFORMAT SW
parameter value: 0, 1 or 2 (see Setting DPD Parameters and Sample Rates).

Table 3: SRx Interface Signal Connection


RXINPUTFORMAT[107]   srx_din0        srx_din1        Notes
0 (2fs)              S0, S2, S4,…    S1, S3, S5,…    Combines to produce a data stream at rate 2fs in
                                                     which an srx_din0 datum should precede an
                                                     srx_din1 datum
1 (IQ)               real            imag            The SPECTRALINVERSION SW parameter may
                                                     need to be adjusted
2 (fs)               0               ADC data

The 2fs rate data generated from ADCs can be converted to srx_din0/srx_din1 signals by either taking advantage
of the interleaving capabilities of various ADCs or, if the ADC is running at twice the rate, using ODDRs, a
dual-aspect FIFO or a simple demux circuit with a FIFO to provide srx_din0 and srx_din1 with appropriate
data.
The fs rate data or complex IQ data from ADCs can be wired into the DPD ports using a simple asynchronous FIFO
to transfer data from ADC clock domain to the clk domain.
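
The three SRx formats in Table 3 amount to different ways of pairing samples onto srx_din0 and srx_din1. The sketch below models only the data ordering, not the clock-domain crossing; the function names are hypothetical.

```python
def srx_from_2fs(samples):
    """RXINPUTFORMAT 0: split S0, S1, S2, ... into (srx_din0, srx_din1) pairs.

    Even-indexed samples go to srx_din0 and odd-indexed samples to srx_din1,
    so each srx_din0 datum precedes its srx_din1 partner in time.
    An unpaired trailing sample is dropped.
    """
    return list(zip(samples[0::2], samples[1::2]))


def srx_from_iq(real, imag):
    """RXINPUTFORMAT 1: real part on srx_din0, imaginary part on srx_din1."""
    return list(zip(real, imag))


def srx_from_fs(samples):
    """RXINPUTFORMAT 2: srx_din0 tied to 0, ADC data on srx_din1."""
    return [(0, s) for s in samples]


print(srx_from_2fs([10, 11, 12, 13, 14]))
print(srx_from_fs([7, 8]))
```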

Host Interface
The Hardware Description section briefly describes the host interface and how it is tied into the MicroBlaze
processor design. This interface allows the user to provide settings and instructions to the MicroBlaze processor about
the various functions it needs to perform. This interface uses a 512x32 block RAM in dual-port mode. The
host_interface_* ports are accessible to the user at the top level to access one port of this memory.
The Operations Guide section provides the detail and usage of the memory map of this interface.
The hardware aspect of this interface is the same as accessing a port of any dual-port memory, and the user should
drive host_interface_din/addr/we synchronously to host_interface_clk. The
host_interface_dout signal is output from the memory location addressed by host_interface_addr when
host_interface_we = '0'. The block RAM has “WRITE_FIRST” mode set on the port. The port also has a latency
of 2 for read operations. Figure 8 shows the timing relationship between the host_interface signals.


Figure 8: Host Interface Signal Timing Diagram

Multiple Antenna Interface


Table 4: Bit Selection indices for multiple antenna interface
TX index                        0        1         2         3         4         5         6          7
{din,dout}_{i,q} bit indices    [15:0]   [31:16]   [47:32]   [63:48]   [79:64]   [95:80]   [111:96]   [127:112]
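
The bit indices in Table 4 follow a fixed 16-bit stride per transmit path. The sketch below, with hypothetical helper names, shows that arithmetic on a bus modelled as a Python integer; treating each 16-bit field as a two's-complement value is an assumption made for illustration.

```python
def path_slice(tx_index):
    """Return (msb, lsb) of the 16-bit field for a transmit path."""
    lsb = 16 * tx_index
    return lsb + 15, lsb


def pack_paths(samples):
    """Pack one signed 16-bit sample per path into a single bus value."""
    bus = 0
    for index, sample in enumerate(samples):
        bus |= (sample & 0xFFFF) << (16 * index)
    return bus


def extract_path(bus, tx_index):
    """Extract the sample for one path as a signed integer."""
    raw = (bus >> (16 * tx_index)) & 0xFFFF
    return raw - 0x10000 if raw & 0x8000 else raw


bus = pack_paths([1, -2, 3])
print(path_slice(7), extract_path(bus, 1))
```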

For the SRx port, there is a single set of srx_din0 and srx_din1. The MicroBlaze processor looks at the srx_din
ports and assumes that they are related to path 0, 1, 2, 3, 4, 5, 6 or 7 when srx_path_sel is driven to 0, 1, 2, 3, 4, 5,
6 or 7 respectively. If the number of paths is set to 1, users can leave the srx_path_sel output port unconnected.
If there are independent ADCs for all antennas, the switching should be done in the FPGA using srx_path_sel.
If the board has an RF switch to select between the output of the various PAs and then send the signal to a single
ADC, srx_path_sel should be forwarded to that RF switch. If direct hardware connection to the RF switch or
mux is not possible, the Antenna Selection Options in a Multipath Installation section should be consulted.

Using Design Files


Instantiate Netlist
Users can generate the core with appropriate parameters (see DPD GUI Parameters). According to the coregen
project settings, a netlist together with vho/veo and vhd/v files is generated in the project directory, with the file
names matching the component name value. The user can instantiate the netlist in their code following the template
shown in the vho/veo file. Users may also use the vhd/v file for simulation purposes; these files are machine
generated using UniSim library components. The netlist and vhd/v files contain memory with pre-initialized
software code to perform the pre-distortion functionality.

Using Constraints
Users should take advantage of multi-cycle path constraints and cross-clock domain constraints to achieve better and
faster static timing closure on their design. The DPD core is optimized to benefit from this. Users can add the period
constraints shown in Table 5 if the clocks are independently generated. The period constraints may be derived
by the tools if these clocks are generated from a DCM/PLL/MMCM. The NET names used in this table should be
updated with the appropriate signal names as connected to the DPD core’s netlist clock inputs. Users should replace
the string "<*CLK_PERIOD>" with appropriate period values in nanoseconds.


Table 5: Example Period Constraints


# FPGA System Clock Input
NET "clk" TNM_NET = "TNM_clk";
TIMESPEC "TS_clk" = PERIOD "TNM_clk" <CLK_PERIOD> ns HIGH 50%;

# PROCESSOR CLOCK INPUT
NET "proc_clk" TNM_NET = "TNM_proc_clk";
TIMESPEC "TS_proc_clk" = PERIOD "TNM_proc_clk" <PROC_CLK_PERIOD> ns HIGH 50%;

# HWA Clock Input - Uncomment the next two lines, if HWA is selected
# NET "accel_clk" TNM_NET = "TNM_accel_clk";
# TIMESPEC "TS_accel_clk" = PERIOD "TNM_accel_clk" <ACCEL_CLK_PERIOD> ns HIGH 50%;

# HOST INTERFACE CLOCK INPUT
NET "host_interface_clk" TNM_NET = "TNM_hi_clk";
TIMESPEC "TS_hi_clk" = PERIOD "TNM_hi_clk" <HI_CLK_PERIOD> ns HIGH 50%;

For multi-cycle path constraints, ceN_out is available as an output port on the DPD netlist, allowing users to create
a multi-cycle path constraint as shown in Table 6. The net name should be replaced with the signal name used to
connect to the ceN_out output port when instantiating the DPD core. Users should replace the string
"<CEN_PERIOD>" with the appropriate period value in nanoseconds. This is typically the <CLK_PERIOD> value used
in Table 5, multiplied by CLOCKS_PER_SAMPLE.
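
As a worked example of the period arithmetic, the sketch below derives both values from a sample rate. The function names are hypothetical, and the 122.88 Msps rate is an illustrative figure only, not one taken from this data sheet.

```python
def clk_period_ns(sample_rate_msps, clocks_per_sample):
    """clk runs at CLOCKS_PER_SAMPLE times the input data rate."""
    clk_mhz = sample_rate_msps * clocks_per_sample
    return 1000.0 / clk_mhz


def cen_period_ns(clk_period, clocks_per_sample):
    """<CEN_PERIOD> is typically <CLK_PERIOD> times CLOCKS_PER_SAMPLE."""
    return clk_period * clocks_per_sample


# Illustrative values: 122.88 Msps input data rate, CLOCKS_PER_SAMPLE = 3
clk_ns = clk_period_ns(122.88, 3)
cen_ns = cen_period_ns(clk_ns, 3)
print(f"<CLK_PERIOD> = {clk_ns:.3f} ns, <CEN_PERIOD> = {cen_ns:.3f} ns")
```

The resulting <CEN_PERIOD> equals the sample period, which is why paths qualified by ceN_out can be given CLOCKS_PER_SAMPLE clk cycles to settle.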

Table 6: Example Multi-cycle path constraints


# If instantiation is as follows at the topmost level of user code,
# dpd_inst : dpd_v4_0_component_name
# port map ( ….
# ceN_out => dpd_ce_out,
#... );

# Multi-cycle path constraint (uncomment the next two lines, when CLOCKS_PER_SAMPLE=2,3 or 4)
# NET "dpd_ce_out" TNM_NET = "ce_N_group";
# TIMESPEC "TS_ce_N_group_to_ce_N_group" = FROM "ce_N_group" TO "ce_N_group" <CEN_PERIOD> ns;

#If the DPD instantiation is at a lower hierarchy, then update the NET name to reflect hierarchy.

Since the DPD IP uses up to four clocks and one multi-cycle path domain, it is useful to exploit the robustness of the
clock-domain crossing logic within the DPD design by adding the cross-clock domain constraints shown in
Table 7. These rely on the timing groups defined in Table 5 and Table 6, so users should carefully replace the timing
group names if they do not match in their environment. Apply caution when using TIG constraints: evaluate whether
any of the user code (non-DPD netlist) is covered by these timing groups and timespec constraints and, if so,
verify that TIG constraints can be validly applied to the user logic as well. Please consult Xilinx Support if you are
concerned about the applicability of these constraints in your design.


Table 7: Example Cross-Clock Domain constraints


#CROSS CLOCK Domain TIG Constraints (uncomment the next 4 lines, if CLOCKS_PER_SAMPLE =2,3 or 4)
# TIMESPEC "TS_ce_N_to_proc_clk_group" = FROM "ce_N_group" TO "TNM_proc_clk" TIG;
# TIMESPEC "TS_proc_clk_to_ce_N_group" = FROM "TNM_proc_clk" TO "ce_N_group" TIG;
# TIMESPEC "TS_ce_N_to_hi_clk_group" = FROM "ce_N_group" TO "TNM_hi_clk" TIG;
# TIMESPEC "TS_hi_clk_to_ce_N_group" = FROM "TNM_hi_clk" TO "ce_N_group" TIG;

#Uncomment the next two lines if HWA is enabled in the design
# TIMESPEC "TS_proc_clk_to_accel_clk_group" = FROM "TNM_proc_clk" TO "TNM_accel_clk" TIG;
# TIMESPEC "TS_accel_clk_to_proc_clk_group" = FROM "TNM_accel_clk" TO "TNM_proc_clk" TIG;

Running Ngdbuild, Map and Par Tools


It is recommended that the user set the following options in map.
• register_duplication on
• ignore_keep_hierarchy
Setting the register_duplication switch to on is essential for the tools to manage the fan-out of high fan-out
signals within the DPD IP (for example, ceN_out and the resets). The ignore_keep_hierarchy option in map is essential
to allow better optimization of high fan-out signals when they are also used outside the DPD IP, and it improves
the resource utilization of the logic across hierarchy. It is not required for the purpose of ignoring the hierarchy in
the ngdbuild tool.
When timing closure becomes difficult for congested designs, it is recommended that the user try the -xe n option
(Extra Effort Level) so that map and par employ additional algorithms to aid timing closure.

Running Simulation
Users have the option to simulate the netlist generated, using the unisim based model provided during generation.
Users should ensure that precompiled unisim libraries are available for the correct simulation tool version before
proceeding to simulate.
A typical simulation testbench includes:
• Clock source generators
• Reset generators (ensure ce_clr is de-asserted first, before rst and proc_rst are de-asserted)
• Data generators - din_* and srx_din*, capture_sync signals (ensure these are generated and fed
synchronous to clk and ceN_out)
• Data collectors - dout*, srx_path_sel signals (ensure these are sampled and collected synchronous to clk
and ceN_out)
• Host interface drivers (ensure that this interface is synchronous to host_interface_clk)
As the DPD netlist includes a processor and software code, simulation will be slow. When QMC is not incorporated
in the netlist, the DPD netlist exhibits pass through behavior after reset is released; the latency of the pass through
is described in the Latency section. When QMC is enabled, the DPD netlist requires the internal MicroBlaze
processor to initialize QMC; this introduces a long period between reset being released and the design exhibiting
pass through behavior. During the initialization period, the DPD output may be undefined and it can take up to
25000 proc_clk cycles before input data is passed to the output with no modification. The actual number will be
lower for smaller values of TX and ARCH and when HWA is absent.
To exercise the host interface with various commands, refer to the Operations Guide section for the
correct interaction procedure.


Operations Guide
Host Interface and SW Control Modes
DPD is controlled via the host interface RAM (see Host Interface). It is a port of a shared memory in the MicroBlaze
processor subsystem that will typically be connected to a microcontroller bus in the control plane of the application.
DPD operations are activated by writing data to addresses in the RAM. Status, results, diagnostics and data are
accessed by reading data from addresses.
The host interface memory map is organized into the regions shown in Table 8. In what follows, individual
addresses are specified. Addresses not explicitly referenced should be considered reserved, except for the range
216 to 255, which should be considered unused. For ease of reference and support, upper-case mnemonics are
defined for key addresses, parameters, control modes and values. Where necessary, the associated value is shown
in parentheses or brackets following the mnemonic.

Table 8: DPD Host Interface Memory Map Outline


Address Range    Usage
0 to 15          Control
16 to 95         Continuous Monitors (read only)
96 to 215        Parameters (write only)
256 to 511       DCL Diagnostics/Page Transfer (read only)

DPD features are executed by triggering control modes. These are like function calls with optional parameters that
influence the behavior of the data path and internal state, in addition to returning results. Control modes are
provided that allow the user to configure DPD, run single-step estimation, run the DCL, and access measurements,
data and status information for setup, debug, and monitoring. Table 9 details the general registers involved with
control modes.

Table 9: DPD Host Interface Control Registers


Address   Mnemonic               Description
0         CONTROLMODETRIGGER     A control mode is triggered by toggling this register from zero to
                                 0xABCDEF12
1         CONTROLMODEREGISTER    The number of the control mode to execute
2         PORTNUM                Specify the port to which to apply command
3         WAITINGONPORTSWITCH    See Antenna Selection Options in a Multipath Installation
8         COMMANDSTATUS          See Table 10
9         EXECUTEDCOMMAND        Echoes CONTROLMODEREGISTER on completion of the control
                                 mode
11        CODEPOINTER            See Antenna Selection Options in a Multipath Installation
12        ACTIVEPORT             See Antenna Selection Options in a Multipath Installation
13        EXECUTINGCOMMAND       Reports the currently executing command


The necessary sequence of events is as follows:


1. Write the number of the required control mode to CONTROLMODEREGISTER.
2. Write the port (antenna) number (0 to 7), if required, to PORTNUM.
3. Write optional parameters as specified in Setting DPD Parameters
4. Trigger the control mode via CONTROLMODETRIGGER as specified.
Except for the DCL modes, control modes terminate and return a status value in COMMANDSTATUS. Possible returned
values are given in Table 10, referenced to the control modes involved.
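
The sequence above can be sketched from the host side as follows. This is a minimal illustration, not a driver: the `ram` object is any hypothetical container that reads and writes 32-bit words by address, `FakeHostRam` is a stand-in that merely pretends every command succeeds, and the polling loop (including which status values count as "not finished") is an assumption rather than a documented handshake.

```python
CONTROLMODETRIGGER = 0
CONTROLMODEREGISTER = 1
PORTNUM = 2
COMMANDSTATUS = 8
TRIGGER_VALUE = 0xABCDEF12
ZERO, COMMAND_IN_PROGRESS, SUCCESSFUL = 0, 1, 2


def run_control_mode(ram, mode, port=None, params=None, max_polls=1000):
    """Steps 1 to 4, followed by an assumed poll of COMMANDSTATUS."""
    ram[CONTROLMODEREGISTER] = mode              # step 1
    if port is not None:
        ram[PORTNUM] = port                      # step 2
    for address, value in (params or {}).items():
        ram[address] = value                     # step 3
    ram[CONTROLMODETRIGGER] = 0                  # step 4: toggle from zero ...
    ram[CONTROLMODETRIGGER] = TRIGGER_VALUE      # ... to the trigger value
    for _ in range(max_polls):
        status = ram[COMMANDSTATUS]
        if status not in (ZERO, COMMAND_IN_PROGRESS):
            return status
    raise TimeoutError("control mode did not complete")


class FakeHostRam(dict):
    """Demonstration stand-in for the host interface RAM (not the real core)."""

    def __setitem__(self, address, value):
        previous = self.get(CONTROLMODETRIGGER, 0)
        super().__setitem__(address, value)
        if address == CONTROLMODETRIGGER and previous == 0 and value == TRIGGER_VALUE:
            super().__setitem__(COMMANDSTATUS, SUCCESSFUL)


ram = FakeHostRam()
status = run_control_mode(ram, mode=17, params={96: 4000})  # UPDATE_ECF_PARAMETERS
print(status)
```

In a real system the item accesses would be replaced by reads and writes on the bus that connects the host processor to the host_interface_* ports.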
Table 10: Status Values
Relevant Control Mode(s)          Value   Mnemonic                        Description
ALL                               -511    EVAL_LICENSE_TIMEOUT            The evaluation license has timed out
UPDATE_ECF_PARAMETERS(17)         -126    CONFIG_FAILURE_L                The number of samples to process is outside
                                                                          the valid range
UPDATE_ECF_PARAMETERS(17)         -125    CONFIG_FAILURE_DELAY            The minimum delay is greater than the
                                                                          maximum delay requested
SET_METER_LENGTH(6)               -124    CONFIG_METER_LENGTH             The meter length is out of the valid range
UPDATE_ECF_PARAMETERS(17)         -123    CONFIG_FAILURE_ARCH             Invalid architecture selection
UPDATE_ECF_PARAMETERS(17)         -121    CONFIG_FAILURE_DELAY_WIN_SIZE   The maximum delay search size is too big
COMPUTE_NEW_COEFFICIENTS(2)       -120    ALIGN_FAILURE                   Signal alignment failure
COMPUTE_NEW_COEFFICIENTS(2)       -119    COEF_OVERFLOW                   Internal coefficient overflow
COMPUTE_NEW_COEFFICIENTS(2)       -115    LEASTSQUARES_FAILURE            A numerical issue was encountered during
                                                                          coefficient estimation
COMPUTE_NEW_COEFFICIENTS(2)       -114    SCA_FAILURE                     Failure to capture a statistically sufficient
                                                                          set of samples
COMPUTE_NEW_COEFFICIENTS(2)       -113    CAPTURE_FAILURE                 Failure to capture new samples
ALL                               -112    INVALID_PORT                    Invalid port was selected
GET_HISTOGRAM_PAGE[5],            -111    HISTOGRAM_FAILURE               Histogram failure
GET_CAPTURE_HISTOGRAM_PAGE[15]
ALL                               -1      INVALID_COMMAND                 An invalid command was requested
ALL                               0       ZERO                            Undefined
ALL                               1       COMMAND_IN_PROGRESS             The processor is busy executing a requested
                                                                          command
ALL                               2       SUCCESSFUL                      Successful completion of the requested
                                                                          command
COMPUTE_NEW_COEFFICIENTS(2)       255     OVERDRIVE_DETECTED              Overdrive was detected for the current ECF
                                                                          update, the coefficients are NOT blocked
COMPUTE_NEW_COEFFICIENTS(2)       256     OVERDRIVE_PROTECTED             Overdrive was detected for the current ECF
                                                                          update, the coefficients ARE blocked - not
                                                                          switched into the data path
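
Host software will usually want to turn a raw COMMANDSTATUS word into one of the mnemonics in Table 10. The sketch below assumes that negative status values are stored as 32-bit two's-complement words, which is consistent with the 32-bit registers described in this guide but is not stated explicitly; the function names are hypothetical.

```python
STATUS_MNEMONICS = {
    -511: "EVAL_LICENSE_TIMEOUT",
    -126: "CONFIG_FAILURE_L",
    -125: "CONFIG_FAILURE_DELAY",
    -124: "CONFIG_METER_LENGTH",
    -123: "CONFIG_FAILURE_ARCH",
    -121: "CONFIG_FAILURE_DELAY_WIN_SIZE",
    -120: "ALIGN_FAILURE",
    -119: "COEF_OVERFLOW",
    -115: "LEASTSQUARES_FAILURE",
    -114: "SCA_FAILURE",
    -113: "CAPTURE_FAILURE",
    -112: "INVALID_PORT",
    -111: "HISTOGRAM_FAILURE",
    -1: "INVALID_COMMAND",
    0: "ZERO",
    1: "COMMAND_IN_PROGRESS",
    2: "SUCCESSFUL",
    255: "OVERDRIVE_DETECTED",
    256: "OVERDRIVE_PROTECTED",
}


def to_signed32(word):
    """Interpret a 32-bit register word as a two's-complement integer."""
    word &= 0xFFFFFFFF
    return word - (1 << 32) if word & 0x80000000 else word


def decode_status(word):
    """Return (signed value, mnemonic) for a raw COMMANDSTATUS word."""
    value = to_signed32(word)
    return value, STATUS_MNEMONICS.get(value, "UNKNOWN")


print(decode_status(0xFFFFFF88))
print(decode_status(2))
```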


Setting DPD Parameters


The various parameters that control DPD are detailed in Table 11. Some parameters need to be set to suit the
installation, particularly for the observation path configuration. For other parameters, the defaults will usually provide
good performance. Except where indicated, values are unsigned integers written to 32-bit registers. U32.x represents
an unsigned fractional value with x bits after the binary point.
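
The register encodings used in Table 11 can be sketched as follows. The helper names are hypothetical, rounding to the nearest code is an assumption, and the 30.72 MHz IF and 122.88 MHz fs figures are illustrative values chosen only because they reproduce the documented RXPHASESTEP default of 0x40000000.

```python
def to_u32_frac(value, frac_bits):
    """Encode an unsigned fractional value in U32.<frac_bits> format."""
    word = round(value * (1 << frac_bits))
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("value does not fit in 32 bits")
    return word


def rx_phase_step(if_hz, fs_hz):
    """RXPHASESTEP = (IF frequency / fs) * 2^32, valid for ratios 0 to 0.5."""
    ratio = if_hz / fs_hz
    if not 0 <= ratio <= 0.5:
        raise ValueError("IF frequency / fs must lie between 0 and 0.5")
    return to_u32_frac(ratio, 32)


def pack_minmax_delay(min_delay, max_delay):
    """MINMAX_DELAY_ADJ = max * 2^16 + min, with 32 <= (max - min) <= 200."""
    if not 32 <= max_delay - min_delay <= 200:
        raise ValueError("delay window must be between 32 and 200")
    return (max_delay << 16) + min_delay


print(hex(rx_phase_step(30.72e6, 122.88e6)))  # illustrative IF and fs
print(to_u32_frac(0.9, 16))                   # ODDTHRESHOLD default, U32.16
print(pack_minmax_delay(0, 200))              # MINMAX_DELAY_ADJ default
```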

Table 11: DPD Parameters


Address   Mnemonic             Values               Default        Notes
96        SAMPLES2PROCESS      400 to 4095          4000           Number of samples to use for the DPD update
                                                                   algorithm; may be reduced to speed up
                                                                   estimation times (see SW Features Timing
                                                                   Performance) if performance is assured.
99        ARCH_SEL             see note [1]         see note [1]   see note [1]
102       ODDTHRESHOLD         0 to 1 (U32.16)      0.9            Threshold to declare overdrive detection.
103       ODPENABLE            0 or 1               0              When ODP is enabled, coefficient sets that are
                                                                   predicted to cause an overdrive condition will not
                                                                   be switched into the data path.
104       DAMPEDNEWTONMU       0 to 0.999           0.1            Set to zero for unconditional Least-Squares
                               (U32.32)                            updates. Set between 0 and 1 for damped
                                                                   updates. The DCL will allow damped updates
                                                                   only when the power and loop gain is stable.
                                                                   Smaller non-zero values will cause slower
                                                                   tracking with better suppressing of fluctuations at
                                                                   the cost of slower convergence.
106       SPECTRALINVERSION    0 or 1               0              When set to 1, this inverts the receive spectrum.
                                                                   Needed for low-side RF LO in the observation
                                                                   path.
107       RXINPUTFORMAT        0 or 1 or 2          0              0 – the receiver is supplying real IF data at 2*fs.
                                                                   1 – the receiver is supplying IQ baseband data at
                                                                   fs.
                                                                   2 – the receiver is supplying real IF data at fs.
108       RXPHASESTEP          0 to 0.5             0.25           (IF frequency/fs)*2^32 for real IF receiver modes.
                               (U32.32)             (0x40000000)
109 + n   MINMAX_DELAY_ADJ     200 >=               max 200,       Minimum and maximum delay to search over
                               (max - min)          min 0          during initial coarse delay adjustment. Here n
                               >= 32                               ranges from 0 to the number of ports minus one.
                               (2x U16.0 packed)                   The values should be provided as max*2^16 +
                                                                   min.
117       CAPTUREMODE*         0 or 1               0              Capture mode.
                                                                   0 – use SCA
                                                                   1 – capture at fixed delay from supplied
                                                                   capture_sync signal
118       CAPTUREDELAY*        0 to 2^32-1          15000          Capture delay in samples at fs associated with
                                                                   capture mode 1.
119       METERLENGTH**        2^18 to 2^24-1       1228800        Number of samples for measurements block
                                                                   processing. Should be a multiple of any frame
                                                                   size and >= 10 ms.
134       DCL_MODE***          0 or 1               0              0 – single set mode
                                                                   1 – multiple set mode


Parameters may be written into the host interface RAM at any time, but are not activated until a control mode is
executed. For unmarked items the control mode is UPDATE_ECF_PARAMETERS(17);
otherwise the control mode is:
* - SET_CAPTURE_PARAMS(11), ** - SET_METER_LENGTH(6), *** - SET_DCL_PARAMETERS(12).
1. When the core is instantiated, the performance architecture GUI selection sets the hardware provision for the pre-distortion
function. By default, the ECF recognizes this and computes the appropriate number of coefficients. However, for evaluation
purposes the ECF can use a lesser degree than provisioned in the core. ARCH_SEL in Table 11 may be set using the
following relationship to enable non-default configurations.
ARCH_SEL = x + (y - 5) * 2^3

where x is 1, 2, 3 or 4 representing ARCH = A, B, C and D respectively, and y is the polynomial order POLY_ORDER = 5 or 7.
For example, D/7th order is ARCH_SEL = 20 and C/5th order is ARCH_SEL = 3.
The higher values of ARCH_SEL are normally beneficial only for signals where the occupied bandwidth is greater than
around one sixth to one seventh of the pre-distortion bandwidth fs, and ARCH_SEL should be set to the minimum value
required to achieve best performance. See Sample Rates.
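
The ARCH_SEL relationship in note 1 can be checked with a few lines of code. The function name is hypothetical; the mapping of architectures to x and the two worked values come from the note itself.

```python
ARCH_INDEX = {"A": 1, "B": 2, "C": 3, "D": 4}


def arch_sel(arch, poly_order):
    """ARCH_SEL = x + (y - 5) * 2^3, with x from ARCH and y = POLY_ORDER."""
    if poly_order not in (5, 7):
        raise ValueError("POLY_ORDER must be 5 or 7")
    return ARCH_INDEX[arch] + (poly_order - 5) * 2**3


print(arch_sel("D", 7), arch_sel("C", 5))
```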

Monitors
Various information is always available from the host interface RAM. This is detailed in Table 12.

Table 12: Monitors


Address   Mnemonic       Description
34        VERSION_0      Information to be supplied when contacting Xilinx Support
35        VERSION_1      Information to be supplied when contacting Xilinx Support
48        RUNTIMELSW     32-bit LSW of a time monitor that counts the number of measurement
                         intervals. The timer is reset when the meter length is updated or when
                         the DCL parameters are set.
49        RUNTIMEMSW     MSW of the time monitor
50        SRXLSW         32-bit LSW of the receiver power
51        SRXMSW         32-bit MSW of the receiver power
52 + 5n   TXPOWERLSW     32-bit LSW of the transmit power of port n (where n ranges from 0 to
                         number of ports minus one)
53 + 5n   TXPOWERMSW     32-bit MSW of the transmit power of port n (where n ranges from 0 to
                         number of ports minus one)

Each power is a 64-bit value representing the sum of the individual powers of the number of samples specified in
METERLENGTH, divided by 256. To convert to dBFS, the formula is:
10*log10(256*power/(2^30*METERLENGTH))
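
A host-side conversion of the power monitors might look like the sketch below. The function names are hypothetical; the monitor addresses follow Table 12, the formula is the one given above, and a zero power reading is reported as negative infinity purely as a convenience of this example.

```python
import math


def tx_power_addresses(port):
    """Return (LSW address, MSW address) of the transmit power for a port."""
    return 52 + 5 * port, 53 + 5 * port


def power_dbfs(msw, lsw, meter_length):
    """Combine the 32-bit halves and apply 10*log10(256*power/(2^30*METERLENGTH))."""
    power = (msw << 32) | lsw
    if power == 0:
        return float("-inf")
    return 10 * math.log10(256 * power / (2**30 * meter_length))


# A power reading that corresponds to full scale with the default METERLENGTH
meter_length = 1228800
full_scale = (2**30 * meter_length) // 256
msw, lsw = full_scale >> 32, full_scale & 0xFFFFFFFF
print(tx_power_addresses(1), power_dbfs(msw, lsw, meter_length))
```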

Running the DCL


Normally DPD is activated by using the DCL control modes (Table 13). PORTNUM need not be specified, as the
DCL automatically operates on all available ports. Various status monitors are provided (Table 14) and may be used
to implement error handling. In a multiport installation, n ranges from 0 to the number of ports minus one. To
maintain backward compatibility with the v3.x DCL monitors, the location of the individual port monitors is defined in
Table 14, where k is defined as (using C notation): k = ( n < 4 ? n : n - 8 )
In a system where DPD has been running successfully, the appearance of an undesired status in, for instance,
UPDATE_INPROGRESS or LAST_UPDATED_STATUS typically indicates an abnormal signal condition such as a
failed observation receiver or the occurrence of severe interference. If the ECF does encounter an error, the
coefficients are not updated or stored. All ECF parameters are relevant to the DCL.
Table 13: DCL Control Modes
Mode Number   Mnemonic                        Description
14            RUN_DCL[14]                     Run the DCL.
27            RUN_DCL_WITH_QMC[27]            Run the DCL with QMC updates enabled.
23            RUN_DCL_WITH_ACCEL_QMC[23]      Run the DCL with QMC updates enabled and
                                              QMC initial convergence is emphasized.
18            EXIT_DCL[18]                    Stop the DCL while retaining internal state.
36            RESET_DCL[36]                   Reset the DCL internal state. Executing
                                              SET_DCL_PARAMETERS(12) also does this.

Table 14: DCL Monitors


Address      Mnemonic                    Description
392 + 32*k   COUNTER                     The total number of ECF updates since the
                                         function was started. This is the last register
                                         written in this table. Detecting a change in this
                                         register can be used to indicate all values in this
                                         table are stable for reading, subject to the
                                         beginning of the next update.
395 + 32*k   UPDATE_INPROGRESS           0 – no ECF updates are currently occurring
                                         1 – an ECF update using Damped-Newton is
                                         being computed
                                         2 – an ECF update using Least-Squares is being
                                         computed
                                         32 – TX power is too low to compute ECF
                                         updates
                                         33 – RX power is too low to compute ECF
                                         updates
396 + 32*k   LAST_TIME_MSW               32-bit MSW of the time stamp corresponding to
                                         the last update. Counted in MicroBlaze clock
                                         cycles.
397 + 32*k   LAST_TIME_LSW               32-bit LSW of the time stamp corresponding to
                                         the last update.
399 + 32*k   LAST_UPDATED_STATUS         Indicates the returned status of the ECF update
                                         code (see Table 10).
404 + 32*k   LAST_QMC_UPDATED_STATUS     Indicates the returned status of the QMC update
                                         code.
405 + 32*k   QMC_UPDATE_COUNTER          The total number of QMC updates since the
                                         function was started.
Note: In the event of contacting Xilinx Support, it is useful to have a record of the DCL monitors.
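
The per-port monitor addresses in Table 14 follow from the k mapping given earlier, k = ( n < 4 ? n : n - 8 ). The sketch below uses a hypothetical function name and simply evaluates that mapping; it also shows that ports 4 to 7 land below the port 0 block while staying inside the 256 to 511 diagnostics region.

```python
DCL_MONITOR_BASE = {
    "COUNTER": 392,
    "UPDATE_INPROGRESS": 395,
    "LAST_TIME_MSW": 396,
    "LAST_TIME_LSW": 397,
    "LAST_UPDATED_STATUS": 399,
    "LAST_QMC_UPDATED_STATUS": 404,
    "QMC_UPDATE_COUNTER": 405,
}


def monitor_address(name, port):
    """Host interface address of a DCL monitor for port n (0 to 7)."""
    if not 0 <= port <= 7:
        raise ValueError("port must be in the range 0 to 7")
    k = port if port < 4 else port - 8
    return DCL_MONITOR_BASE[name] + 32 * k


for n in range(8):
    print(n, monitor_address("COUNTER", n))
```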


Single Stepping
For fine control, parameter adjustment, debug, general understanding and non-standard applications, DPD SW
features can be individually activated via the control modes given in Table 15. PORTNUM(2) needs to be specified.
Table 15: Single Stepping Control Modes
Mode Number   Mnemonic                     Description
2             COMPUTE_NEW_COEFFICIENTS     Perform a full Least-Squares update of the DPD
                                           coefficients. This includes capturing new
                                           samples, ECF processing and updating the data
                                           path parameters.
3             RESET_COEFFICIENTS           Set all coefficient sets to unity gain and update
                                           the data path for pass-through.
7             DPD_OFF                      Update the data path for pass-through without
                                           changing the internally stored coefficients.
8             DPD_ON                       Update the data path from the internally stored
                                           coefficients.
21            RESET_QMC                    Reset the internally stored QMC coefficients and
                                           set the QMC data path block to pass-through.
22            QMC_SINGLE_STEP              Update QMC by making a single iteration.
24            QMC_ON                       Update the QMC data path block from the
                                           internally stored coefficients.
25            QMC_OFF                      Update the QMC data path block for
                                           pass-through without changing the internally
                                           stored coefficients.
33            DAMPED_UPDATE                Perform a Damped-Newton iteration of the DPD
                                           coefficients. This includes capturing new
                                           samples, processing the ECF, and updating the
                                           data path parameters.

Signal Analysis
To aid setup and debug, the control modes shown in Table 16 give access to the signals processed by DPD and
measurements made by DPD on those signals. PORTNUM(2) needs to be specified.
A capture can be triggered and the captured data, the (transmit) power and the histogram can be read out.
Transfer of bulk data uses a paged mechanism. Each page holds 128 32-bit data words and is available at addresses
384-511. The capture RAM is 8192 samples long and therefore requires 64 page accesses. The parameter PAGENUMBER
(address 122) specifies a page number from 0 to 63. The first 4096 samples are the transmit data: each 32-bit word
is a concatenation of two 16-bit twos-complement values for the I (LSW) and Q (MSW) samples. The upper
4096 samples are the receive data, each again a concatenation of two 16-bit twos-complement values. When the
receiver is in 2*fs mode, these are the even and odd ordered samples of an 8192 sample sequence. In 1*fs mode, the
odd samples are ignored, and in IQ mode the two 16-bit SRx inputs are packed into a single 32-bit value.
In all modes the SRx data appears in the captured samples as srx_din0*2^16 + srx_din1. Table 3 provides details
on the srx_din0/srx_din1 interface.
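
The packing described above can be sketched in Python as follows. The page size, page count and word layout are taken
from the text. The read_page callable is a hypothetical host-side helper (it is assumed to select a page, request it
and return its 128 words), and the function names are illustrative only.

```python
PAGE_WORDS = 128   # words per page, read at addresses 384-511
PAGES = 64         # 8192-entry capture RAM / 128 words per page
TX_WORDS = 4096    # the first half of the capture RAM holds transmit data


def to_signed16(value):
    """Interpret the low 16 bits of value as a twos-complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def unpack_tx_word(word):
    """Split a 32-bit transmit word into (I, Q): I is the LSW, Q is the MSW."""
    return to_signed16(word), to_signed16(word >> 16)


def unpack_rx_word(word):
    """Split a 32-bit receive word into (srx_din0, srx_din1).

    The word is stored as srx_din0 * 2**16 + srx_din1.
    """
    return to_signed16(word >> 16), to_signed16(word)


def read_capture(read_page):
    """Read all 64 pages and return (tx_iq, rx_pairs).

    read_page is a hypothetical callable that returns the 128 words of
    the page with the given number (0 to 63).
    """
    words = []
    for page in range(PAGES):
        words.extend(read_page(page))
    tx = [unpack_tx_word(w) for w in words[:TX_WORDS]]
    rx = [unpack_rx_word(w) for w in words[TX_WORDS:]]
    return tx, rx
```

How the (srx_din0, srx_din1) pairs are interpreted afterwards depends on the receiver mode: even and odd samples in
2*fs mode, only the even sample in 1*fs mode, and the two SRx inputs in IQ mode.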
The histogram integrated over METERLENGTH can also be read out. The histograms are 256 bins long. Each
histogram bin holds the number of samples whose signal amplitude, divided by 128, falls in that bin.
The capture power is the sum of the capture signal power over the number of samples specified in
SAMPLES2PROCESS(96).
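
A host-side model of these two measurements might look like the sketch below. It assumes that the bin index is the
integer part of the amplitude divided by 128 (clamped to the last of the 256 bins) and that the signal power of a
sample is I*I + Q*Q. Both are interpretations of the description above, and the function names are hypothetical.

```python
NUM_BINS = 256
BIN_WIDTH = 128  # amplitude units per histogram bin


def histogram_bin(amplitude):
    """Map a non-negative amplitude to a bin index in the range 0..255.

    Assumption: the index is floor(amplitude / 128), clamped to the last bin.
    """
    return min(int(amplitude) // BIN_WIDTH, NUM_BINS - 1)


def amplitude_histogram(amplitudes):
    """Count how many amplitudes fall into each of the 256 bins."""
    bins = [0] * NUM_BINS
    for amplitude in amplitudes:
        bins[histogram_bin(amplitude)] += 1
    return bins


def capture_power(iq_samples, samples_to_process):
    """Sum I*I + Q*Q over the first samples_to_process (I, Q) samples."""
    return sum(i * i + q * q for i, q in iq_samples[:samples_to_process])
```

Comparing such a model against the values read back from the core is one way to check that captured data has been
unpacked correctly.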


Table 16: Signal Analysis Control Modes


Mode Number  Mnemonic                    Description
4            GET_CAPTURE_RAM_PAGE        Present the page, specified by PAGENUMBER(122), of the capture RAM data at the host interface RAM page transfer area.
5            GET_HISTOGRAM_PAGE          Present the page, specified by PAGENUMBER(122), of the transmit histogram at the host interface RAM page transfer area.
15           GET_CAPTURE_HISTOGRAM_PAGE  Present the page, specified by PAGENUMBER(122), of the capture histogram at the host interface RAM page transfer area.
16           READ_CAPTURE_POWER_METERS   Present the values from the capture Tx power meter at addresses 384(LSW) and 385(MSW) and the capture Rx power at addresses 386(LSW) and 387(MSW).
20           CAPTURE_NEW_SAMPLES         Trigger a new sample capture sequence. The capture follows the rules set by the CAPTUREMODE(117) parameter.
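
The paged read-out can be sketched in Python as shown below. The mode numbers and addresses are those given in
Table 16 and the preceding text. The host object, with its write_word, read_word and run_mode methods, is a
hypothetical stand-in for the application's access to the host interface; run_mode is assumed to return only after
the control mode has completed.

```python
PAGENUMBER = 122                   # parameter address holding the page number
PAGE_BASE, PAGE_WORDS = 384, 128   # page transfer area: addresses 384-511
GET_CAPTURE_RAM_PAGE = 4
READ_CAPTURE_POWER_METERS = 16


def fetch_capture_page(host, page):
    """Request one capture RAM page and return its 128 words.

    host is a hypothetical object providing write_word(addr, value),
    read_word(addr) and run_mode(mode).
    """
    host.write_word(PAGENUMBER, page)
    host.run_mode(GET_CAPTURE_RAM_PAGE)
    return [host.read_word(PAGE_BASE + i) for i in range(PAGE_WORDS)]


def read_capture_powers(host):
    """Return (tx_power, rx_power) assembled from the LSW/MSW word pairs."""
    host.run_mode(READ_CAPTURE_POWER_METERS)
    w = [host.read_word(PAGE_BASE + i) & 0xFFFFFFFF for i in range(4)]
    tx_power = (w[1] << 32) | w[0]   # addresses 384 (LSW) and 385 (MSW)
    rx_power = (w[3] << 32) | w[2]   # addresses 386 (LSW) and 387 (MSW)
    return tx_power, rx_power
```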

Examples of uses for the signal analysis control modes are to:
• check the transmit and receive spectra (by analyzing the captured data in a tool such as MATLAB®) and
thereby verify that the signal source, RF paths, core interfaces and relevant DPD parameters are correct.
• check the CFR configuration (by examining the transmit histogram).
• determine appropriate settings for CAPTUREDELAY(118) in capture mode 1 by examining the capture
histogram and powers relative to the measurements over METERLENGTH.
Note: In the event of contacting Xilinx Support, it is useful to have the signal analysis data described in this section.

Antenna Selection Options in a Multipath Installation


For multiple-antenna applications, DPD assumes that there will be an RF or digital switch selecting the various
observation paths to route to the sample receiver. The most transparent mode of operation is if the switch control is
available as signals in the FPGA, in which case they should be wired to srx_path_sel port of the core.
In the event that the switch is accessible only via the application control plane, a SW handshake protocol is provided
for switching the receiver. This is enabled by executing the ENABLE_EXT_RXSEL(28) control mode and disabled
by ENABLE_INT_RXSEL(29).
To use this external select mode, follow these steps:
1. Poll CODEPOINTER(11) until the value 131 is read. This is the request for a port switch.
2. Read the ACTIVEPORT(12) register to see which port needs to be switched in to the receive path.
3. Switch the port and acknowledge by writing 0xA5A5A5A5 into the WAITINGONPORTSWITCH(3) register.
In the internal mode, ACTIVEPORT indicates which port is active, and WAITINGONPORTSWITCH is ignored.
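
A minimal Python sketch of the external select handshake is given below. The register numbers, the request value 131
and the acknowledge value 0xA5A5A5A5 are taken from the steps above. The host object, the switch_port callback and
the polling limit are assumptions introduced for illustration; a real application would use its own register access
and switch control.

```python
CODEPOINTER = 11
ACTIVEPORT = 12
WAITINGONPORTSWITCH = 3
PORT_SWITCH_REQUEST = 131
PORT_SWITCH_ACK = 0xA5A5A5A5


def service_port_switch(host, switch_port, max_polls=1000):
    """Handle one port switch request; return the port switched in, or None.

    host is a hypothetical object providing read_word(addr) and
    write_word(addr, value). switch_port is a user-supplied callable that
    drives the external RF or digital switch. None is returned if no
    request is seen within max_polls reads of CODEPOINTER.
    """
    for _ in range(max_polls):
        # Step 1: wait for the request for a port switch.
        if host.read_word(CODEPOINTER) == PORT_SWITCH_REQUEST:
            # Step 2: find out which port must be routed to the receiver.
            port = host.read_word(ACTIVEPORT)
            # Step 3: switch the port, then acknowledge.
            switch_port(port)
            host.write_word(WAITINGONPORTSWITCH, PORT_SWITCH_ACK)
            return port
    return None
```

The bounded polling loop is a design choice for this sketch so that the caller regains control if no request arrives.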


Resource Utilization and Performance


Resource Utilization
DPD IP provides various parameters and targets multiple FPGA families. Table 18, Table 19 and Table 20 provide
some examples of resource utilization on Virtex-5, Virtex-6 and Spartan-6. These resource numbers were obtained
when the designs were run stand-alone and all clock signals were constrained as described in Table 21.
ISE 12.3 tools were used to obtain this data. Non-default settings used for the various programs are described in
Table 17. These settings enable physical synthesis, which offers better fan-out management for the high fan-out
ceN_out and rst/proc_rst networks. It is recommended that the user also employ these settings when instantiating
DPD IP in user logic.

Table 17: Tool Settings for Characterization


Tools       Additional Settings
XST         NONE
ngdbuild    NONE
map         -register_duplication on, -ignore_keep_hierarchy, -xe n
par         NONE

Table 18: Resource Utilization on Virtex-5 (5vlx155-ff1153-1 on ISE 12.3)


TX  Architecture  Clocks/Sample  Poly. Order  QMC  HWA  FFs  LUTs  Slices  Block RAMs 36K(1)  DSP48Es
1 D 4 7 False False 3340 2979 1550 55 14
1 D 4 7 False True 5138 4117 2045 69 34
1 D 4 7 True False 3449 2957 1435 55 17
1 D 4 7 True True 5253 4101 2204 71 37
2 D 4 7 False False 4348 3758 2045 63 21
4 D 4 7 False False 6355 5123 2796 83 35
8 D 4 7 False False 10391 7913 4080 115 63
4 D 4 5 False False 6331 5104 2545 79 35
4 C 4 5 False False 6837 4895 2999 63 35
4 B 4 5 False False 5884 4490 2449 55 35
4 A 4 5 False False 4921 4276 1731 47 35
4 D 3 5 False False 8885 6165 3695 111 51
4 D 2 5 False False 8421 6044 3376 79 51
4 D 1 5 False False 6087 5693 2383 111 83
2 D 1 5 False False 4164 4041 1788 79 45
1 D 1 5 False False 3187 3100 1499 63 26
1 C 1 7 False False 3081 2984 1433 56 22
2 C 3 7 False False 4377 3628 1885 59 21

Notes:
1. For Virtex-5 the BRAM usage is reported as a number of 36K BRAMs. Typically 2x18K BRAMs are combined and reported as 36K.


Table 19: Resource Utilization on Virtex-6 (6vsx315t-ff1759-1 on ISE 12.3)


TX  Architecture  Clocks/Sample  Poly. Order  QMC  HWA  FFs  LUTs  Slices  Block RAM 36K/18K(1)  DSP48E1s
1 D 4 7 False False 3142 2956 1188 55/0 14
1 D 4 7 False True 4903 4262 1802 64/10 34
1 D 4 7 True False 3206 2948 1282 55/0 17
1 D 4 7 True True 5009 4299 1740 64/10 37
2 D 4 7 False False 4131 3600 1680 63/0 21
4 D 4 7 False False 6101 5050 2101 83/0 35
8 D 4 7 False False 10073 8091 3322 115/0 63
8 D 1 7 False False 9802 8993 3492 179/0 159
4 D 4 5 False False 6115 5015 2168 79/0 35
4 C 4 5 False False 5876 4980 1944 63/0 35
4 B 4 5 False False 5306 4569 1869 55/0 35
4 A 4 5 False False 4672 4127 1898 47/0 35
4 D 3 5 False False 8609 6508 2668 111/0 51
4 D 2 5 False False 8337 6241 2594 79/0 51
4 D 1 5 False False 5951 5592 2083 111/0 83
2 D 1 5 False False 3992 3875 1560 79/0 45
1 D 1 5 False False 3024 3016 1288 63/0 26
1 C 1 7 False False 2909 2866 1263 56/0 22
2 C 3 7 False False 3963 3547 1448 59/0 21

Notes:
1. In some configurations Virtex-6 uses a number of 18K BRAMs in addition to full 36K BRAMs.


Table 20: Resource Utilization on Spartan-6 (6slx150-fgg900-2 on ISE 12.3)


TX  Architecture  Clocks/Sample  Poly. Order  QMC  HWA  FFs  LUTs  Slices  Block RAMs 18K/9K(1)  DSP48Es
1 D 2 7 False False 3694 3362 1591 105/3 19
1 D 2 7 False True 5497 4559 1891 128/5 39
1 D 2 7 True False 3775 3323 1585 105/3 22
1 D 2 7 True True 5612 4547 1991 128/5 42
2 D 2 7 False False 5219 4355 1951 121/3 30
4 D 2 7 False False 8255 6414 2788 161/3 52
4 D 1 7 False False 5969 5671 2289 161/3 84
2 D 2 5 False False 5218 4338 2010 121/3 30
2 C 2 7 False False 5036 4294 1907 113/3 30
2 C 2 5 False False 5049 4263 1889 97/3 30
2 B 2 5 False False 4546 3866 1752 89/3 30
2 A 2 5 False False 4064 3569 1711 81/3 30
1 C 2 5 False False 3583 3263 1535 85/3 19
1 C 1 5 False False 2946 2950 1339 82/3 23

Notes:
1. Spartan-6 cases use a number of 9K BRAMS in addition to full 18K BRAMs.

IP Timing Performance
DPD IP was characterized for resource utilization and timing performance on Virtex-5, Virtex-6 and Spartan-6
according to the constraints shown in Table 21. These are example cases of when a single DPD IP core is placed and
routed in an otherwise empty fabric. A user application might see a different performance (better or worse). It is
recommended that the user place and route the desired DPD design configuration in the user application space with
representative logic around it or with representative area groups for floorplanning. Contact your Xilinx Field
Applications Engineer if you have timing issues or if guidance on floorplanning is required. For 8 TX cases, an area
group is recommended to achieve these clock frequencies.

Table 21: DPD IP Timing Constraints and Timing Closure Statistics


Device                     clk Speed    proc_clk Speed    accel_clk Speed
Virtex-5 (cps = 1)         300 MHz      150 MHz           200 MHz
Virtex-6 (cps = 1)         333 MHz      150 MHz           200 MHz
Spartan-6 (cps = 1)        142 MHz      75 MHz            80 MHz
Virtex-5 (cps = 2,3,4)     400 MHz      150 MHz           200 MHz
Virtex-6 (cps = 2,3,4)     400 MHz      150 MHz           200 MHz
Spartan-6 (cps = 2,3,4)    166 MHz      75 MHz            80 MHz


SW Features Timing Performance


Table 22 through Table 25 show the execution times in seconds for various DPD features with a MicroBlaze processor
clock at 128 MHz. The times scale in inverse proportion to the clock frequency. LS denotes execution of a
COMPUTE_NEW_COEFFICIENTS(2) control mode, or more generally ECF with DAMPEDNEWTONMU(104) equal to zero. dN
denotes execution of a DAMPED_UPDATE(33) control mode, or more generally ECF with DAMPEDNEWTONMU non-zero. The
non-fs/4 time increment applies when RXPHASESTEP(108) is set to anything other than 0.25. When fs/4 is selected, an
optimized multiplier-free operation is used. Because the DCL cycle time depends on the ECF, it may vary according to
the signal condition, as dN or LS is selected automatically based on stability criteria.
The number of samples processed has a significant effect on the execution times. If the times are important,
SAMPLES2PROCESS(95) may be reduced from the default of 4000 samples, but performance should be verified. In
internal testing, reduction to 2000 samples has not affected performance.
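
The following Python sketch shows one way to apply the tabulated times to a different processor clock. It assumes
that execution time is inversely proportional to the MicroBlaze clock frequency, with 128 MHz as the reference, and
uses the increments of 0.17 s and 0.25 s listed in the tables. The function names are illustrative only.

```python
REFERENCE_CLOCK_MHZ = 128.0  # clock rate at which the tables were measured


def scaled_time(table_seconds, proc_clk_mhz):
    """Scale a tabulated execution time to another processor clock rate."""
    return table_seconds * REFERENCE_CLOCK_MHZ / proc_clk_mhz


def non_fs4_time(ecf_seconds):
    """ECF time with non-fs/4 down-conversion (tabulated as ECF + 0.17)."""
    return ecf_seconds + 0.17


def dcl_qmc_cycle_time(ecf_seconds):
    """DCL cycle time with QMC, per port (tabulated as ECF + 0.25)."""
    return ecf_seconds + 0.25
```

For example, the LS time of 2.42 s for ARCH D with 4000 samples in Table 22 would become approximately 4.84 s with a
64 MHz processor clock under this assumption.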


Table 22: SW Features Timing (seconds) for POLY_ORDER = 5


SAMPLES2PROCESS                 1000            2000            4000
                                LS      dN      LS      dN      LS      dN
ECF, ARCH A                     0.48    0.56    0.72    0.90    1.20    1.56
ECF, ARCH B                     0.63    0.74    1.01    1.25    1.77    2.26
ECF, ARCH C                     0.84    0.98    1.36    1.67    2.43    3.08
ECF, ARCH D                     0.82    1.01    1.35    1.72    2.42    3.15
Non-fs/4 down-conversion        ECF + 0.17      ECF + 0.17      ECF + 0.17
DCL cycle with QMC, per port    ECF + 0.25      ECF + 0.25      ECF + 0.25

Table 23: SW Features Timing (seconds) for POLY_ORDER = 5 and HWA enabled
SAMPLES2PROCESS                 1000            2000            4000
                                LS      dN      LS      dN      LS      dN
ECF, ARCH A                     0.34    0.34    0.43    0.42    0.58    0.60
ECF, ARCH B                     0.39    0.39    0.47    0.49    0.63    0.66
ECF, ARCH C                     0.46    0.47    0.54    0.57    0.72    0.72
ECF, ARCH D                     0.46    0.47    0.55    0.56    0.74    0.76
Non-fs/4 down-conversion        ECF + 0.17      ECF + 0.17      ECF + 0.17
DCL cycle with QMC, per port    ECF + 0.25      ECF + 0.25      ECF + 0.25

Table 24: SW Features Timing (seconds) for POLY_ORDER = 7


SAMPLES2PROCESS                 1000            2000            4000
                                LS      dN      LS      dN      LS      dN
ECF, ARCH A                     0.62    0.72    0.98    1.20    1.74    2.18
ECF, ARCH B                     0.95    1.09    1.56    1.87    2.80    3.49
ECF, ARCH C                     1.39    1.58    2.35    2.75    4.24    5.08
ECF, ARCH D                     1.32    1.54    2.21    2.70    4.01    5.01
Non-fs/4 down-conversion        ECF + 0.17      ECF + 0.17      ECF + 0.17
DCL cycle with QMC, per port    ECF + 0.25      ECF + 0.25      ECF + 0.25

Table 25: SW Features Timing (seconds) for POLY_ORDER = 7 with HWA enabled
SAMPLES2PROCESS                 1000            2000            4000
                                LS      dN      LS      dN      LS      dN
ECF, ARCH A                     0.38    0.39    0.46    0.48    0.66    0.63
ECF, ARCH B                     0.49    0.50    0.59    0.60    0.79    0.77
ECF, ARCH C                     0.69    0.70    0.79    0.81    1.00    0.99
ECF, ARCH D                     0.66    0.66    0.74    0.75    0.96    0.97
Non-fs/4 down-conversion        ECF + 0.17      ECF + 0.17      ECF + 0.17
DCL cycle with QMC, per port    ECF + 0.25      ECF + 0.25      ECF + 0.25


Distortion Correction Performance


Performance testing has been conducted on the Xilinx DPD solution using an industry standard radio card, power
amplifier and test equipment. The method is to instantiate the Xilinx DPD solution with supporting logic and
memory that allow the transmission of data representative of any air interface standard. Test results are presented
here for WCDMA, WiMAX, LTE, TD-SCDMA and Multicarrier GSM. Mixed mode operation is also supported.
The hardware platform is based on the Axis CDRSX2 (Common Digital Radio System, Xilinx edition, 2nd generation)
board; spectral measurements are obtained from an Agilent E4440A PSA. The plots that follow were produced
from captured trace data. For static spectra, the PA is a 3rd party LDMOS Doherty Amplifier at 46.5 dBm. Results of
dynamics testing were taken with the Stealth Microwave SM2122-51LD power amplifier; the sample rate for DPD
is 122.88 Msps and the observation path is a real IF sampled at 245.76 Msps.
The results that follow show the spectrum before DPD is applied and after the DCL has been run for 30 seconds
with QMC correction enabled. Typically the spectra will converge much faster than this, particularly if QMC
correction has already been established. In the following results, the relative correction is shown. Whether or not a
particular spectral mask requirement is met must be considered in relation to the power that the PA is being driven to.
In the following results, CFR is used with a threshold set at -9 dBFS. This results in a PAR of between 6 and 6.5 dB
for the FDD signals and for the maximum power segments of the TDD signals.

WCDMA
Multicarrier data consisting of 3GPP Test Model 1 with 64 DCH is generated. Each carrier has a relative offset of 512
chips. The data is pulse-shaped, upsampled to 122.88 Msps, frequency shifted and summed.
Figure 9 and Figure 10 show the spectra before and after pre-distortion, for the ARCH_SEL selections indicated, for
the carrier configurations stated in the captions.


Figure 9: Spectra for Four WCDMA Carriers before and after DPD


Figure 10: Spectra for Two Non-adjacent WCDMA Carriers 10 MHz Apart before and after DPD


WiMAX
Data is generated using Agilent Signal Studio. The data is standards compliant and has arbitrary zone, burst and
modulation structure. It is 5 ms TDD frame data. Figure 11 shows the spectrum before and after pre-distortion for two
10 MHz carriers (one having 75% downlink active ratio and the other having 50% downlink active ratio).

Figure 11: Spectra for Two 10MHz WiMAX Carriers before and after DPD


LTE
Data is generated using internally developed software. The data is standards compliant with respect to frame
structure and modulation. The modulation scheme is 64 QAM and the data payload is random. Figure 12 shows results
for one 20 MHz carrier.

Figure 12: Spectra for a Single 20MHz LTE Carrier before and after DPD


TD-SCDMA
Data is generated using internally developed software. The data is standards compliant with respect to frame
structure and modulation. The data payload is random. Figure 13 shows the results for an arbitrary selection of
carriers within a 15 MHz bandwidth.

Figure 13: Spectra for Six TD-SCDMA Carriers in 15MHz Total Bandwidth Carrier before and after DPD


Multicarrier GSM
Data is generated using internally developed software. The data is standards compliant with respect to frame
structure and modulation, which is GMSK. The data payload is random. Figure 14 shows the result for an arbitrary
selection of carriers within a 20 MHz bandwidth.

Figure 14: Spectra for Four GSM Carriers in 20MHz Total Bandwidth Carrier before and after DPD


Dynamic and QMC Performance


Representative dynamic performance is shown in Figure 15; QMC performance is shown in Figure 16.

Figure 15: Dynamic Performance: Adjacent Channel Ratio for Four WCDMA Carriers
with Total Power Varying with Slow Steps, Fast Steps and Fast Random Profiles

Figure 16: QMC Performance: Spectra for a Single Offset WCDMA Carrier
before and after QMC and DPD Correction


Factors Influencing Expected Correction Performance


Overview
Because DPD is a system-level function involving subtle interactions between digital, RF and PA design, it is
difficult to make hard-and-fast rules about when the demonstrated performance will be achieved. Prototyping is highly
recommended, and Xilinx Support can help with the provision of evaluation platforms in certain circumstances.
Nonetheless, the following sections provide some guidance.

Sample Rates
Performance depends on the sample rate of DPD. A rule of thumb is that the pre-distortion bandwidth fs should be
at least five times the signal bandwidth. However, factors such as the PA design, the degree of correction required
and the signal type will come into play. Non-contiguous carrier configurations generally require a higher DPD
sample rate than contiguous carrier configurations.
Excess pre-distortion bandwidth can also be a problem. Occasionally wideband artifacts can be observed when fs is
greater than approximately seven times the signal bandwidth, particularly if ARCH_SEL is set to 2, 3 or 4.
For optimal receive bandwidth, the DPD sample receiver rate should be exactly twice the DPD sample rate and
have the signal centered in a Nyquist zone. However, variances are supported. DPD can be configured for a sample
receiver at one times the sample rate, and in many situations there will be little performance degradation.
Non-contiguous carrier configurations, however, may be particularly problematic.
There is also support for a signal not exactly centered in a Nyquist zone. If the offset is small, there may be little
impact on performance.
A direct conversion receiver can also be used. In this case, QMC will be unreliable unless the receiver is individually
and externally calibrated.
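
The rules of thumb in this section can be expressed as a simple check, sketched here in Python. The thresholds of
five and seven are the approximate guidelines stated above, not hard limits, and the function names and returned
strings are hypothetical.

```python
MIN_RATIO = 5.0       # fs should be at least about 5x the signal bandwidth
CAUTION_RATIO = 7.0   # wideband artifacts may appear above about 7x


def bandwidth_ratio(fs_msps, signal_bw_mhz):
    """Ratio of the pre-distortion sample rate to the signal bandwidth."""
    return fs_msps / signal_bw_mhz


def check_sample_rate(fs_msps, signal_bw_mhz):
    """Classify fs against the rule-of-thumb range for a signal bandwidth."""
    ratio = bandwidth_ratio(fs_msps, signal_bw_mhz)
    if ratio < MIN_RATIO:
        return "below rule of thumb"
    if ratio > CAUTION_RATIO:
        return "possible wideband artifacts"
    return "within rule of thumb"


def optimal_srx_rate(fs_msps):
    """Sample receiver rate for optimal receive bandwidth (twice fs)."""
    return 2.0 * fs_msps
```

For the test configuration described earlier, a DPD sample rate of 122.88 Msps with a 20 MHz signal gives a ratio of
about 6.1, and the corresponding sample receiver rate is 245.76 Msps.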

Required Signal Levels and Properties


The supplied design works with 16-bit transmit and up to 16-bit receive signals. The Sample Receiver (SRx)
Interface section details how to use fewer bits.
In a live BTS, the transmit digital signal power will vary with call load dynamics. At the maximum output power
of the BTS, the scaling of the transmit signal should be such that it is optimally at -15 dBFS rms power. This value
will allow sufficient headroom for PAR and expansion for pre-distortion correction. Lower values may compromise
the dynamic range. Some adjustment is possible, but headroom must be preserved, and scaling the signal too low
may compromise performance.
DPD may, however, operate correctly at lower signal levels, but in any case the software is set to not attempt
pre-distortion at signal levels below -30 dBFS while running under DCL control (single updates are not subject to
the minimum power level checks in software). Scaling to levels higher than -15 dBFS is feasible, but headroom for
pre-distortion expansion and other factors should be checked explicitly.
The power in the sample receiver should be at -15 dBFS measured on the real signal, as reported by the internal
DPD power meters. The signal level control is in the user's domain. The preceding test results were collected with
this scaling, with a 12-bit ADC. Reasonable pre-distortion correction may be obtained with fewer ADC bits, but this
should be explicitly verified in the intended installation.
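
As a rough aid for checking signal scaling, the Python sketch below estimates the rms level of captured complex
samples in dBFS and the gain needed to reach the -15 dBFS target. It assumes that 0 dBFS corresponds to a mean-square
value of 32768 squared for 16-bit data; the reference used by the internal DPD power meters may differ, so the
reported meter values remain the authority. All names are illustrative.

```python
import math

FULL_SCALE = 32768.0  # assumed full-scale reference for 16-bit samples


def rms_dbfs(iq_samples):
    """RMS level of a non-empty list of (I, Q) samples in dBFS."""
    mean_square = sum(i * i + q * q for i, q in iq_samples) / len(iq_samples)
    return 10.0 * math.log10(mean_square / FULL_SCALE ** 2)


def gain_for_target(current_dbfs, target_dbfs=-15.0):
    """Linear amplitude gain that moves current_dbfs to target_dbfs."""
    return 10.0 ** ((target_dbfs - current_dbfs) / 20.0)


def dcl_attempts_update(tx_dbfs, minimum_dbfs=-30.0):
    """True if the level is at or above the -30 dBFS DCL minimum."""
    return tx_dbfs >= minimum_dbfs
```

For example, a signal measured at -21 dBFS would need an amplitude gain of about 2 (6 dB) to reach -15 dBFS.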
CFR is helpful for good pre-distortion performance and highly recommended as a means of optimizing PA
efficiency. The exact degree will depend on the particular installation. The tests were conducted with approximately
6.5 dB PAR for maximum power signals (signal segments in the case of pulsed and TDD signals). The Xilinx PC-CFR
v2.0 Core [Ref 1] is available for this purpose.


RF Performance
The performance of DPD is intimately related to the quality of the RF design. The RF bandwidth should be at least
five times the signal bandwidth, but special considerations may apply at the edge of the band, depending on the RF
filter line-up. These are matters somewhat outside of the scope of the digital design Xilinx offers. Within the RF
bandwidth, we are unable to put limits on the amplitude and phase error that might be tolerated. Performance with
RF paths worse than the Axis CDRSX2 test platform is unknown.

Parameters
The default and user-controllable settings described here normally give sufficient control for successful
performance in most operational scenarios. However, DPD has a number of internal parameters and settings, and in some
cases performance issues can be addressed by changing these. Xilinx Support should be contacted for assistance.
See also the Support section of this document for support stipulations.


Abbreviations
3G Third Generation
3GPP Third Generation Partnership Project
ACP Adjacent Channel Power
ACLR Adjacent Channel Leakage Ratio
ADC Analog-to-Digital Converter
BTS Base Transceiver Station
BUFG Global Buffer (Xilinx FPGA component)
CAPEX Capital Expenditure
CDRSX Common Digital Radio System – Xilinx Edition
CFR Crest Factor Reduction
CMP Configured Maximum Power
CPICH Common Pilot Channel
DAC Digital-to-Analog Converter
dB decibel
dBc dB relative to carrier
dBm dB relative to one milliwatt
dBFS dB relative to digital full-scale
DCH Dedicated Transport Channel
DCL Dynamic Control Layer
DCM Digital Clock Manager (Xilinx FPGA component)
DPCH Dedicated Physical Channel
DPD Digital Pre-Distortion
DUC Digital Up Conversion
ECF Estimation Core Function
FCC Federal Communications Commission
FIFO First In, First Out
FIR Finite Impulse Response
HSDPA High Speed Downlink Packet Access
ISR Interrupt Service Routine
LDMOS Laterally Diffused Metal Oxide Semiconductor (Field Effect Transistor)
LMB Local Memory Bus
LO Local Oscillator
LTE Long Term Evolution
LUT Lookup Table
MCP Maximum Capacity Power
Msps Mega-samples per second
MIMO Multiple Input Multiple Output
MP Memory-Polynomial
NSNL Non-Static Non-Linearity
ODD Over-Drive Detection
ODP Over-Drive Protection
OPEX Operational Expenditure
PA Power Amplifier
PAR Peak-to-Average Ratio
PLB Processor Local Bus
QAM Quadrature Amplitude Modulation
QM Quadrature Modulator
QMC Quadrature Modulator Correction
QSNL Quasi-Static Non-Linearity
RMS Root Mean Square
RRC Root-Raised Cosine
SA Spectrum Analyzer
SBRAM Shared Block RAM
SCA Sample Capture Acceptance
SRx Sample Receiver
TDD Time Division Duplex
TD-SCDMA Time Division Synchronous Code Division Multiple Access
TM1 Test Model 1 (and similarly TM2, and so on)
WCDMA Wideband Code Division Multiple Access
WiMAX Worldwide Interoperability for Microwave Access


References
1. Xilinx Peak Cancellation Crest Factor Reduction (PC-CFR) V2.0 product page

Evaluation
An evaluation license is available for this core. The evaluation version operates in the same way as the full version
for several hours, dependent on clock frequency. Once the evaluation period ends, the data output comprises a
delayed version of the data input, and the host interface reports the EVAL_LICENSE_TIMEOUT status value (see
Table 10). If you notice this behavior in hardware, you are probably using an evaluation version of the core. The
Xilinx tools warn during netlist implementation that an evaluation license is being used. If a full license is
installed, delete the old XCO file, then reconfigure and regenerate the core.

Support
Xilinx provides technical support for this LogiCORE product when used as described in the product
documentation. Xilinx cannot guarantee timing, functionality, or support of product if implemented in devices that
are not defined in the documentation, if customized beyond that allowed in the product documentation, or if
changes are made to any section of the design labeled DO NOT MODIFY.
Refer to the IP Release Notes Guide (XTP025) for further information on this core. The guide links to all the DSP IP
and from there to each core. For each core, a master Answer Record contains the Release Notes and Known
Issues list. The following information is listed for each version of the core:
• New Features
• Bug Fixes
• Known Issues

Ordering Information
This core may be downloaded from the Xilinx IP Center for use with the Xilinx CORE Generator software v12.3 and
later. The Xilinx CORE Generator system is shipped with Xilinx ISE Design Suite development software.
To order Xilinx software, contact your local Xilinx sales representative.
Information on additional Xilinx LogiCORE IP modules is available on the Xilinx IP Center.

Revision History
The following table shows the revision history for this document:

Date        Version    Description of Revisions
09/21/10    1.0        Product release 12.3


Notice of Disclaimer
Xilinx is providing this product documentation, hereinafter “Information,” to you “AS IS” with no warranty of any kind, express
or implied. Xilinx makes no representation that the Information, or any particular implementation thereof, is free from any
claims of infringement. You are responsible for obtaining any rights you may require for any implementation based on the
Information. All specifications are subject to change without notice. XILINX EXPRESSLY DISCLAIMS ANY WARRANTY
WHATSOEVER WITH RESPECT TO THE ADEQUACY OF THE INFORMATION OR ANY IMPLEMENTATION BASED
THEREON, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OR REPRESENTATIONS THAT THIS
IMPLEMENTATION IS FREE FROM CLAIMS OF INFRINGEMENT AND ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Except as stated herein, none of the Information may be
copied, reproduced, distributed, republished, downloaded, displayed, posted, or transmitted in any form or by any means
including, but not limited to, electronic, mechanical, photocopying, recording, or otherwise, without the prior written consent of
Xilinx.
