Population Pharmacokinetics II: Estimation Methods
OBJECTIVE: To present, compare, and contrast the various approaches to estimating population pharmacokinetic (PPK) models with
respect to the mathematical foundation, statistical aspects, software programs for implementation, and underlying assumptions.
DATA SOURCES: Information on PPK was retrieved from a MEDLINE search (1977–August 2004) of literature and a bibliographic
review of review articles and books. This information is used in conjunction with experience to explain the various methodologic
approaches to PPK.
STUDY SELECTION AND DATA EXTRACTION: All articles identified from data sources were evaluated and relevant information was
included in this review.
DATA SYNTHESIS: Over 80 articles dealing with PPK estimation methods and/or their implementation were identified and reviewed.
Sixty-four of these were chosen for their direct relevance to the subject of this article. Different estimation methods ranging from the
naïve averaging and naïve pooled approaches through the standard two-stage approach to the nonlinear mixed-effects modeling
approaches for estimating PPK are reviewed with their advantages and limitations.
CONCLUSIONS: PPK estimation methods that rely on the characterizing of mixed (fixed and random) effects are known to produce
PPK parameter estimates that are less biased than those obtained using the naïve and standard two-stage approaches. The
NONMEM software is the most widely used software for the characterization of PPK.
KEY WORDS: estimation, pharmacokinetics, population.
ploying PPK models compared with traditional pharmacokinetic model development. The previous installment of this series addressed the types of conceptual models necessary for an understanding of PPK.1 The current installment explains the various methods used to estimate PPK models.

See also Part I (2004;38:1702-6, DOI 10.1345/aph.1D374)

Over the past two and a half decades, a variety of methods have been proposed for the characterization of the PPK of drugs. A discussion of some of the methods follows and is the focus of this section. The goals of a PPK analysis and the data type will determine the method selected for the analysis.

Author information provided at the end of the text.

NAÏVE AVERAGE DATA APPROACH

It is common practice in preclinical and clinical pharmacokinetics to perform studies in which the drug administration and sampling schedules are identical for all subjects. For this type of analysis, there are as many data points as there are individuals at each sampling time. Analysis of such data using the naïve averaging of data (NAD) approach consists of the following procedure.

(1) Computing the average value of the data for each sampling time:

$$\bar{y}_i = \frac{1}{N} \sum_{j=1}^{N} y_{ij} \qquad \text{(Eq. 1)}$$

for $i = 1, \ldots, n$, where n is the standard number of data points per individual. The averaging of data across individuals makes sense, because all $y_{ij}$ for $j = 1, \ldots, N$ have been measured under identical conditions.
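As a minimal sketch of step (1), assuming the data are held as one list of concentrations per subject on a shared sampling schedule, Eq. 1 is a plain mean across subjects at each sampling time. The function name `naive_average`, the sampling times, and the two rate constants are illustrative assumptions, not values from any study.

```python
import math


def naive_average(data):
    """Eq. 1: mean over the N subjects at each of the n sampling times.

    data[j][i] is the observation for subject j at sampling time i.
    """
    n_subjects = len(data)
    n_times = len(data[0])
    # NAD only makes sense when every subject shares the same schedule.
    if any(len(row) != n_times for row in data):
        raise ValueError("NAD needs the same sampling schedule for every subject")
    return [sum(row[i] for row in data) / n_subjects for i in range(n_times)]


# Hypothetical example: two subjects, identical sampling times (h),
# each following a monoexponential decline from 10 concentration units.
times = [0.0, 1.0, 2.0, 4.0, 8.0]
rate_constants = [0.1, 1.0]  # assumed elimination rate constants (1/h)
data = [[10.0 * math.exp(-k * t) for t in times] for k in rate_constants]

y_bar = naive_average(data)  # the mean data n-vector
```

The explicit schedule check reflects the requirement stated above that all observations at a given sampling time were measured under identical conditions.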
(2) A model $y_m = f(\phi)$ is fitted to the mean data n-vector $\bar{y} = (\bar{y}_1, \ldots, \bar{y}_n)^T$, and the best-fit parameter values $\phi^*$ are estimated. The latter notation ($\phi^*$) is used to distinguish it from individual estimates, denoted $\hat{\phi}$.

The NAD approach is attractive because of its simplicity. One unique fitting is sufficient for obtaining estimates of parameters describing the mean response. The components of $\phi^*$ are quite often interpreted as “mean” parameter values. Correspondingly, $\hat{\mu}_{\mathrm{NAD}}$ will be used for $\phi^*$ in what follows. The method is widely applicable in experimental data (EP) studies with standardized designs, including bioavailability, bioequivalence, and dose proportionality studies. Because of the smoothing effect of averaging, mean data generally look nicer than individual data, and better fitting often results when compared with individual data.

However, the NAD approach provides an estimate ($\hat{\mu}_{\mathrm{NAD}}$) of the sample mean. In this regard, several drawbacks of this approach must be pointed out. The use of NAD to establish a pharmacokinetic model may be misleading. Data averaging can, quite often, produce a distorted picture of the response. Averaging of monoexponential data from 2 subjects with very different half-lives has been shown to produce a mean curve that exhibits an apparent biexponential decay.2 Sometimes the opposite situation is the case. The smoothing effect of the averaging will tend to obscure peculiarities that can be seen in individual data. The existence of secondary peaks in the plasma level–time course of individuals may be undetectable in the average curve if the rebounds occur at different time points.

NAD also performs poorly in terms of parameter estimation. The reference to individual data disappears after data averaging. All sources of variability are confounded. Because of this, important information on drug disposition is obscured. The average concentration curve derived with the NAD approach does not necessarily follow the individual model function. A wrong model may be obtained.3 Undefined statistical uncertainties and large “unknown” subject variations might smooth the average response curve in an unpredictable manner. Thus, the NAD estimate $\hat{\mu}_{\mathrm{NAD}}$ should not, as a general rule, be regarded as a valuable estimate of the expected value of pharmacokinetic parameters. This rule holds even if the true model, that is, the one that adequately describes the individual data, has been used for the fitting. The essential parametric nonlinearity of pharmacokinetic models is responsible for this.

Exceptions to this rule occur when the signal-to-noise ratio is small. This is the case when interindividual variability contributes less to the spread in observations than other sources of fluctuation (inter-occasion variability, measurement error, model misspecification). This situation might be seen when concentrations are measured in standardized laboratory animals. The quality of estimates may be improved by using averaging methods other than the straightforward arithmetic mean.4 These ad hoc solutions do not fundamentally solve the problem. Moreover, no estimate of pure interindividual variability can be obtained with the NAD approach because it masks variability rather than reveals it. Thus, the NAD approach is not a reliable method for pharmacokinetic data analysis.

NAÏVE POOLED DATA ANALYSIS

Sheiner and Beal5 proposed the name naïve pooled data (NPD) approach for the method in which all data from all individuals are considered as arising from one unique individual. This reference subject is characterized by a set of parameters $\bar{\phi}$. With least-squares fitting, $\bar{\phi}$ will be the parameter vector minimizing the global objective function:

$$O_{\mathrm{NPD}}(\phi) = \sum_{j=1}^{N} \left\{ \sum_{i=1}^{n_j} \left[ y_{ij} - f_{ij}(\phi) \right]^2 \right\} \qquad \text{(Eq. 2)}$$

where $\{f_{ij}, i = 1, \ldots, n_j\}$ is the set of components of $f_j$, and the summation is over all individuals and all measurements for a given individual.

Unlike the NAD approach, the NPD approach is far more general. It can easily deal with experimental data, nonstandard data, and routine pharmacokinetic data. After a unique fitting of all data at once, parameter estimates are obtainable. It may perform well when variations between subjects are small. This is occasionally the case in a group of homogeneous laboratory animals from a given strain, but it is rarely true for humans. The drawbacks of NPD are the same as those of NAD, as has been repeatedly pointed out.6-8 The NPD approach tends to confound individual differences and diverse sources of variability in a manner different from the NAD approach, but with similar negative consequences. The NPD estimate for the reference individual $\bar{\phi}$ should be considered as a rough approximation ($\hat{\mu}_{\mathrm{NPD}}$) of the population expectation $\mu$, although the consequences of the omission can be minor.9 In addition, estimates of the dispersion of parameters in the population are not provided. Extrapolation of mean outcomes on the basis of the set of estimates $\hat{\mu}_{\mathrm{NPD}}$ should be done with caution.

These problems notwithstanding, it has been shown that, for several drugs used in anesthesia, a pooled analysis approach provided population mean parameters that, when prospectively tested, accurately predicted drug concentrations after drug administration by a computer-controlled infusion pump.10-12 The data in all circumstances originated from well-controlled experiments with extensive sampling. That is, the data were of the EP type (see the installment of articles in this series of tutorials). Moreover, the NPD analysis provided similar population mean parameter estimates compared with estimates obtained using several other population analysis methods.13,14 These findings are in contrast with an earlier simulation study which showed that the NPD approach provided biased estimates of the population mean parameters even when a well-balanced experimental study design was used.6 The discrepancy may be due to the large amount of interindividual variability present or the inappropriate weighting scheme used in the latter study.

Imbalance and confounding correlations present in a data set pose serious problems for the NPD approach. These features are prevalent in observational data and
make the NPD approach inappropriate for this type of data. Data imbalance occurs when there are many more observations taken from some individuals than others. An example would be a case where 6 samples are taken from some individuals, 4 from some others, and one from others.

When the design of the study correlates with the outcome, confounding correlations occur. That is, the presence or absence of an observation is dependent on the subject’s pharmacokinetics. Confounding correlations are usually prevented with randomization. This, however, is not guaranteed with observational data. A case in point would be a pharmacokinetic study in which concentrations fall below the limit of quantitation during the study. Only individuals with the smallest clearance or largest volume of distribution would contribute measurable concentrations toward the end of the study. A biased estimate of the terminal half-life will result and may be wrongly interpreted as an additional phase of the pharmacokinetic profile. Clearly, the NPD approach should not be used in this setting.

THE TWO-STAGE APPROACH

With this approach, individual parameters are estimated in the first stage by separately fitting each subject’s data and then, in the second stage, obtaining parameters across individuals, thus obtaining population parameter estimates. The data are summarized in the set $\{(\hat{\phi}_j, M_j), j = 1, \ldots, N\}$, where $\hat{\phi}_j$ is the p-vector of the parameter estimates and $M_j$ is the p × p symmetric variance–covariance matrix of the corresponding individual estimate. To derive values for population characteristics according to a given strategy, the individual parameter estimates are combined. The salient features of the methods that constitute the two-stage approach are discussed briefly.

Standard Two-Stage Approach

The standard two-stage (STS) approach refers to a well-known and widely used procedure. Population characteristics of each parameter are estimated as the empirical mean (arithmetic or geometric) and variance of the individual estimates $\hat{\phi}_j$ according to the following equations:

$$\hat{\mu}_{\mathrm{STS}} = \frac{1}{N} \sum_{j=1}^{N} \hat{\phi}_j \qquad \text{(Eq. 3)}$$

$$\hat{\Omega}_{\mathrm{STS}} = \frac{1}{N} \sum_{j=1}^{N} \left( \hat{\phi}_j - \hat{\mu}_{\mathrm{STS}} \right)^2 \qquad \text{(Eq. 4)}$$

The estimate of the standard deviation ($\hat{s}$) is easily obtained by taking the square root of $\hat{\Omega}$. $N - p$ can be used instead of N in the denominator of the variance estimate.

With the STS approach, estimates of individual parameters are combined as if the set of estimates were a true N-sample from a multivariate distribution. It has been recommended as a very simple and valuable approach for pooling individual estimates of pharmacokinetic parameters derived from experimental pharmacokinetic studies.15 The advantage of the STS approach is its simplicity, but the validity of its results should not be overemphasized. Indeed, it has been shown from simulation studies that the STS approach tends to overestimate parameter dispersion (the variance–covariance matrix).6,16

Global Two-Stage Approach

The $\hat{\phi}_j$ can be viewed as observations of the individual parameters. The estimate for a subject may be biased and imprecise because of poor experimental design, poor study execution, or a high level of measurement error. The global two-stage (GTS) approach makes extensive use of the matrices $\{M_j, j = 1, \ldots, N\}$, which reflect the deviations (bias), together with the estimates $\{\hat{\phi}_j, j = 1, \ldots, N\}$. The expectation E(·) and the variance–covariance Var(·) of each (random) $\hat{\phi}_j$ can be calculated as:

$$E(\hat{\phi}_j) = \mu \quad \text{for } j = 1, \ldots, N \qquad \text{(Eq. 5)}$$

$$\mathrm{Var}(\hat{\phi}_j) = M_j + \Omega \quad \text{for } j = 1, \ldots, N \qquad \text{(Eq. 6)}$$

where $\mu$ is the true population expectation and $\Omega$ is the true population variance–covariance. An extensive description of the method is provided by Steimer et al.16 The GTS approach provides a maximum likelihood estimate of $\mu$ and $\Omega$ by an iterative method. It assumes that the estimates of individual parameters are normally distributed around the true parameters with variance $\mathrm{Var}_j$. The population parameters $\theta$ are the p components of the vector $\mu$ and the p(p + 1)/2 independent components of the symmetric matrix $\Omega$. The objective function to be minimized is as follows:

$$O_{\mathrm{GTS}}(\mu, \Omega) = \sum_{j=1}^{N} \left[ (\hat{\phi}_j - \mu)^T (M_j + \Omega)^{-1} (\hat{\phi}_j - \mu) + \ln \det(M_j + \Omega) \right] \qquad \text{(Eq. 7)}$$

The first term on the right side of Equation 7 is the summation (over individuals) of the weighted squared deviations of individual estimates from the expected value $\mu$. The weighting matrix is dependent on the quality of the estimate through the factor $(M_j + \Omega)^{-1}$. The last term in the equation is the logarithm of the determinant of the $(M_j + \Omega)$ matrix. It prevents the variance–covariance matrix from going to zero through its determinant.

The GTS approach has been shown, through simulation, to provide unbiased estimates of the population mean parameters and their variance–covariances, whereas the estimates of the variances were upwardly biased if the STS approach was used.16 These simulations were done under the ideal situation that the residual error was normally distributed with a known variance. However, it is a well-known fact that the asymptotic covariance matrix used in the calculations is approximate and, under less ideal conditions, that the approximation can be poor.17,18

The Iterative Two-Stage Approach

A computationally “heavier” two-stage method that relies on repeated fittings of individual data, the iterative two-stage (IT2S) approach, has been described.16,19,20 The IT2S approach can be implemented with rich data, sparse data, or a mixture of both. An approximate a priori population model is required to initiate the procedure. Provided that considerable informative data are available, the population values may be obtained from the literature, the NPD approach performed with the current study data and a reasonable choice of parameter variability, or the STS approach.16 As the name implies, the IT2S approach is implemented in 2 stages. In the first stage, the population model is used as the set of prior distributions for Bayesian estimation of the individual parameters for all patients, irrespective of the number of samples supplied by each individual. In the second stage, the population parameters are recalculated with these new individual parameters in order to form the new set of prior distributions. The estimation process (ie, parameters from the second stage are used for a repeat of the first stage and the results are used for a repeat of the second stage) is repeated until the difference between the new and old prior distributions is essentially zero. The method may be implemented with programs supporting Bayesian estimation and least-squares regression or with the IT2S routine,20 which has been implemented with the USC*PACK collection of programs.21

A method close to the IT2S procedure is the expectation-maximization–like (EM) method presented by Mentré and Gomeni.22 It can be viewed as an extension of IT2S when both random and fixed effects are included in the model and for heteroscedastic errors known to a proportionality coefficient. This algorithm is implemented with the software P-PHARM.23

Bayesian Two-Stage Approach

A method that is Bayesian in nature is that proposed by Racine-Poon.24 The method uses the estimates of the individual parameters $\phi_j$ and asymptotic variance matrix $V_j$ obtained from the individual fits, with very weak assumptions about the prior distribution of the population parameters to calculate a posterior density function from which $\phi$ and $\Omega$ can be obtained. In an iterative method suggested by Dempster et al.,25 the EM algorithm is used to calculate the posterior density function. Simulation studies in which several varying and realistic conditions were assumed have shown that the Bayesian two-stage approach provides good estimates of PPK and pharmacodynamic parameters.24,26

THE NONLINEAR MIXED-EFFECTS MODEL APPROACH

The first attempt at estimating interindividual pharmacokinetic variability without neglecting the difficulties (eg, data imbalance, sparse data, subject-specific dosing history) associated with data from patients undergoing drug therapy was made by Sheiner et al.27 using the nonlinear mixed-effects model approach. The vector $\theta$ of population characteristics is composed of all quantities of the first 2 moments of the distribution of the parameters: the mean values (fixed effects) and the elements of the variance–covariance matrix that characterize random effects.5,6,8,28-30

The number of samples per subject used for this approach is typically small, ranging from 1 to 6. The difficulties associated with this type of data preclude the use of the STS approach because there are not enough data to separately estimate the pharmacokinetic parameters for each subject. There are too few measurements to estimate the parameters accurately or the model may be unidentifiable in a specific individual. As does the pooled analysis technique, nonlinear mixed-effects modeling approaches analyze the data of all individuals at once, but take the interindividual random effects structure into account. This ensures that confounding correlations and imbalance that may occur in observational data are properly accounted for.

Most of the nonlinear mixed-effects modeling methods estimate the parameters by the maximum likelihood approach. The probability of the data under the model is written as a function of the model parameters, and parameter estimates are chosen to maximize this probability. This amounts to asserting that the best parameter estimates are those that render the observed data more probable than they would be under any other set of parameters.

It is difficult to calculate the likelihood of the data for most pharmacokinetic models because of the nonlinear dependence of the observations on the random parameters $\eta_i$ and possibly $\varepsilon_{ij}$. To deal with these problems, several approximate methods have been proposed. These methods, apart from the approximation, differ widely in their representation of the probability distribution of interindividual random effects.

First-Order (NONMEM)

The first nonlinear mixed-effects modeling program introduced for the analysis of large amounts of pharmacokinetic data was NONMEM.31 In the NONMEM program, linearization of the model in the random effects is effected by using the first-order (FO) Taylor series expansion with respect to the random effect variables $\eta_i$ and $\varepsilon_{ij}$. This software is the only program in which this type of linearization is used.

The jth measurement in the ith subject of the population can be obtained from a variant of Equation 5 in the first tutorial as follows:

$$y_{ij} = f(\phi, x_{ij}, \eta_i) + \varepsilon_{ij} \qquad \text{(Eq. 8)}$$

The FO Taylor series expansion of the above model with respect to the random variables $\eta_i$ (intersubject variability) and $\varepsilon_{ij}$ (residual variability) around zero is given by

$$y_{ij} = f(\phi, x_{ij}) + G_{ij}(\phi, x_{ij})\,\eta_i + \varepsilon_{ij} \qquad \text{(Eq. 9)}$$

where

$$G_{ij}(\phi, x_{ij}) = \left. \frac{\partial f(\theta, x_{ij}, \eta_i, \varepsilon_{ij})}{\partial \eta_i^T} \right|_{\eta_i = 0} \qquad \text{(Eq. 10)}$$

$G_{ij}(\phi, x_{ij})$ is a 1 × p matrix of the first derivatives of $f(\theta, x_{ij}, \eta_i, \varepsilon_{ij})$ with respect to $\eta_i$, evaluated at $\eta_i$ equals zero. In Equation 9, the model is linear in $\varepsilon_{ij}$; therefore, no approximation is made with respect to $\varepsilon_{ij}$. Logarithmic transformation of the data can be done to ensure linearity in $\varepsilon_{ij}$.

The random effect parameters $\eta_i$ and $\varepsilon_{ij}$ are independent (multivariate), normally distributed with zero means and variances $\Omega$ and $\sigma^2$, respectively. $\Omega$ is the p × p covariance matrix of the p vector $\eta_i$. Based on the fact that $\eta_i$ and $\varepsilon_{ij}$ are independent and identically normally distributed, and
the linearization of Equation 9, the expectation and variance–covariance of all observations for the ith individual (first 2 moments) are given by:

$$E_i = f(\theta, x_i) \qquad \text{(Eq. 11)}$$

and

$$C_i = G_i(\theta, x_i)\,\Omega\,G_i(\theta, x_i)^T + \sigma^2 I_{n_i} \qquad \text{(Eq. 12)}$$

where $f(\theta, x_i)$ is the vector of model predictions of $y_i$, $G_i(\theta, x_i)$ represents the $n_i$ × p matrix of first derivatives of $f(\theta, x_i, \eta_i, \varepsilon_i)$ with respect to $\eta_i$ evaluated at $\eta_i$ equals zero, and $I_{n_i}$ represents the identity matrix of size $n_i$. Maximum likelihood estimates of the population parameters $\theta$, $\Omega$, and $\sigma^2$ can be obtained by minimizing minus twice the logarithm of the population likelihood as expressed below:

$$-2LL = \sum_{i=1}^{N} \left( \log(\det(C_i)) + (y_i - E_i)^T C_i^{-1} (y_i - E_i) \right) \qquad \text{(Eq. 13)}$$

This approach is called the FO method in NONMEM. This is the most widely used approach in PPK and pharmacodynamic data analysis and has been evaluated by simulation. The use of the FO Taylor series expansion to approximate the nonlinear model in $\eta_i$ and possibly $\varepsilon_{ij}$ by a linear model in these parameters is the greatest limitation of the FO approach.

The performance of the FO approach for the analysis of observational and experimental data has been evaluated by Sheiner and Beal with the Michaelis–Menten pharmacokinetic model5 and the 1- and 2-compartment models.6,7 In all instances, a comparison was made with the NPD and STS approaches for the analysis of the 2 types of data. The FO approach outperformed the NPD and the STS approaches on both data types. Despite the approximation, the FO approach provides good parameter estimates. When the residual error increases, the STS approach quickly deteriorates, especially with respect to variance parameters. However, the FO approach still performs reasonably well, but the bias and imprecision of the estimates tend to increase with increasing residual error.7 Estimates of residual random effects have been shown to deteriorate with the FO approach when residual error increases.32

Deterioration in parameter estimation has been observed in simulation studies in which the value of the intersubject variability was >60% and the residual variability was set at 15%.33 A series of studies in which observations were randomly deleted from a data-rich set to create a sparse data set and parameter estimation was done using the FO method showed good performance of the FO approach compared with the results obtained using the full data set.34-38 The correspondence of the results in the 2 situations suggests that the FO approach can be used to estimate parameters using only a few observations per individual. Simulation studies have been performed to show that the FO approach can be used in the limiting case where only one sample is obtained per subject.39 In this case, there is an upper limit of residual variability (not exceeding 20%) for the production of reliable parameter estimates.

The impact of the linearization approximation of the FO approach for a simple 1-compartment model was evaluated by Beal.29 He compared the performance of this approach with the exact solution to the population likelihood. No difference was observed, which indicated that the approximation used in the FO method is not detrimental to the analysis under the conditions evaluated, which included an interindividual variability set at 25% (CV%). Other simulation studies, however, have shown that the FO approach has a potential for providing modestly biased estimates.6,18,28,34,40-43

For a 1-compartment multidose scenario, White et al.18 showed that biased estimates are more likely when residual and intersubject variability are very high. Ette et al.33 observed that biased estimates are obtained at high levels of intersubject variability with a 2-compartment multidose situation, although the residual variability did not exceed 15%. The bias may be due to the fact that the FO Taylor series expansion is not a particularly good approximation of the underlying “real” (log-normal) distribution used to generate the simulated data in these studies. Also, it may be that the FO Taylor series expansion is evaluated at $\eta_i$ equals zero (the population mean estimate of $\eta_i$). This may not be a good approximation depending on the magnitude of intersubject variability and the nonlinearity of the pharmacokinetic model. During data analysis, this can be compensated for, in part, by including explanatory covariates in the model to reduce the variance of $\eta_i$. With a 1-compartment model experimental data set, the GTS approach was shown to outperform the FO approach with respect to bias and precision of both the population mean and variance estimates. Similar results were obtained in a study in which the FO approach was compared with the Bayesian two-stage approach.40

The NONMEM program implements 2 alternative estimation methods: the FO conditional estimation (FOCE) and the Laplacian methods.31 The FOCE method uses an FO expansion about conditional estimates (empirical Bayesian estimates) of the interindividual random effects rather than about zero.44 In this respect, it is like the conditional FO method of Lindstrom and Bates.45 Unlike the latter, which is iterative, a single objective function is minimized, achieving a similar effect as with iteration. The Laplacian method uses second-order expansions about the conditional estimates of the random effects.44

Conditional First-Order (NLME)

The conditional FO method of Lindstrom and Bates45 uses an FO Taylor series expansion about conditional estimates of interindividual random effects. Estimation involves an iterative generalized least-squares type algorithm. This estimation method is available in S-PLUS as the function NLME.46

Alternative First-Order (MIXNLIN)

This method, proposed by Vonesh and Carter,47 also uses an FO series expansion of the interindividual random effects. They proposed the use of estimated generalized least squares and established the asymptotic properties of the resulting estimates. An alternative method is the use of
the iteratively reweighted generalized least squares.48 The MIXNLIN program also implements pseudo maximum likelihood (ML) and restricted maximum likelihood (REML) estimation by embedding the EM algorithm within an iteratively reweighted generalized least-squares routine. Expansion is either about zero or about the empirical best linear unbiased predictor (EBLUP) of the interindividual random effects. Only the fixed-effects and variance component estimates are updated after each call to the embedded EM algorithm (ie, the method uses the EBLUP estimates inherent within the EM algorithm only to update estimates of the variance components) when the expansion is about zero. ML estimation expanded about zero should result in estimates similar to those obtained using the NONMEM FO method, while expansion about the EBLUP should result in estimates similar to those obtained with the FOCE in NONMEM and the FO conditional method (NLME). These estimation methods are available in the SAS macro and MIXNLIN 3.0 version of Vonesh.48

Alternative First-Order (SAS)

This is an FO Taylor series expansion method, but the algorithm consists of iteratively fitting a set of generalized estimating equations until they stabilize.49 The method uses a Taylor series expansion in the fixed-effects parameters, as well as one in the random effects; expansion is about the generalized least-squares estimates for the fixed-effects parameters and about zero for the random effects. It yields estimates similar to those obtained using the FO method of NONMEM. The method is implemented in the SAS macro NLINMIX. The NLINMIX program also implements expansion about the EBLUPs of the interindividual random effects as an alternative to expansion about zero, yielding estimates similar to those produced with the FOCE method in NONMEM.

Nonparametric Maximum Likelihood (NPML)

The NPML approach provides an estimate of the whole probability distribution of the pharmacokinetic parameters on a nonparametric basis.50 The method relies on maximization of the likelihood of the set of observations of all individuals to estimate the distribution of the parameters. The basic conceptual framework is similar to that described above for NONMEM. The difference is that no specific model for the relationship between pharmacokinetic parameters and patient-specific covariates is specified. The individual parameters $\phi_i$ are assumed to be independent realizations of a given random variable $\Phi$ with probability distribution $F(\phi)$. The likelihood of all data is given by:

$$L(F) = \prod_{i=1}^{N} \int_D l_i(y_i \mid \phi)\, F(\phi)\, d\phi \qquad \text{(Eq. 14)}$$

where $l_i(y_i \mid \phi)$ is the likelihood of the observations $y_i$ for the ith individual, given $\phi$. D is the domain in which the parameters lie. Maximization of this likelihood provides an estimate $\hat{F}$ of the probability distribution of the parameters. This distribution has been proven by Mallet to be discrete, involving $N_p$ locations, where $N_p$ is less than or equal to the number of individuals (N). To estimate the $N_p$ locations $q_k$ and their corresponding frequencies $\alpha_k$, a specific algorithm was developed. The level of residual error and how well the parameters are known determine the number of locations. There will be N locations, each with a frequency of 1/N, if the parameters are known very precisely for all N subjects. The set of locations $q_k$ and frequencies $\alpha_k$ completely specifies the estimate of the distribution of the parameters:

$$\hat{F} = \sum_{k=1}^{N_p} \alpha_k \, \delta(q_k) \qquad \text{(Eq. 15)}$$

where $\delta(x)$ denotes the Dirac probability distribution, which takes the value 1 at x and 0 elsewhere. With this method, a complete distribution F is obtained under very soft assumptions, namely that F takes only positive values and that its integral over the domain D is equal to unity.50,51 The NPML approach has been shown in a simulation study assuming a 1-compartment pharmacokinetic model with bimodal distribution to produce parameter estimates that accurately describe the distribution, even though only one measurement was available per individual.52 Several summary statistics, such as mean or variance–covariance matrix, can easily be calculated from the distribution of F specified by Equation 15.

The method also allows for the inclusion of patient-specific covariates without specifying an a priori relationship between the pharmacokinetic parameters and covariates. The covariates are regarded as additional parameters, and the algorithm provides an estimate of the joint distribution of the pharmacokinetic parameters and the covariates.53 The probability distribution of the parameters conditional on any value of the covariates can be computed and used for the initial dosage selection, given the distribution obtained. Thus, the shape of the relationship between parameters and covariates can be explored nonparametrically.

The major limitation of this approach is that the residual error must be known a priori. The method, therefore, is nonparametric with respect to the interindividual random effects, but requires the intraindividual error to be specified a priori. Pharmacokinetic analyses performed with the NPML approach and reported in the literature have used the residual error model based on drug concentration measurement assay variance.52-55 This seems to be unrealistic. Intraindividual variability, inter-occasion variability, and model misspecification often will contribute significantly to the residual error.56,57

Also, the estimator of the distribution produced by the NPML approach is a point estimator, and no results on the accuracy of the estimation are obtained. Consequently, care should be taken in interpreting the results, especially when they are obtained from a small sample size. If the NPML approach is used primarily for exploratory analysis
to improve the efficiency of subsequent parametric analysis, this may not be much of a problem. The NPML approach is computationally expensive, which may limit its practicality as the dimension of the parameter space increases, for example with a complex pharmacokinetic model with numerous covariates.

The nonparametric expectation–maximization (NPEM) program of Schumitzky,58 which is similar to the NPML program of Mallet,50 computes the nonparametric ML using the nonparametric EM algorithm. NPEM has been developed as a segment of the USC*PACK collection of programs.21 The results obtained using NPEM for PPK data analysis are similar to those of the NPML program. NPEM and STS give virtually identical estimates of PPK parameters in the same population when the results of NPEM indicate a normal distribution for the parameter estimates.59,60

Seminonparametric Maximum Likelihood (SNP)

Davidian and Gallant61 introduced SNP maximum likelihood from econometrics into pharmacokinetics. Like the NPML approach, the SNP approach provides an estimate of the entire distribution of the interindividual random effects. The SNP approach maximizes the likelihood over a class of distributions restricted to have a smooth density instead of maximizing the likelihood over all distribution functions, as does the NPML method. This assumption of smoothness is flexible enough to allow heavy-tailed, multimodal, and skewed distributions to be characterized, but prevents kinks, jumps, and oscillatory behavior.62 Like NPML, this method relies on maximizing the likelihood of the set of observations of all individuals to estimate the distribution of the random effects. The basic conceptual framework remains the same as that described for the population model in the “Models” subsection of the first installment in this series of tutorials.1 The representation of the probability distribution and the calculation of the likelihood differ from those of the NONMEM and NPML approaches.

Gallant and Nychka63 showed that the smooth distribution can be represented as an infinite series expansion, and they provide a full mathematical description. The SNP approach uses a finite number of leading terms resulting from an approximation of the infinite expansion. A single tuning parameter determines the number of terms retained. The density is multivariate normal if the value of this tuning parameter equals zero, and the distribution becomes more flexible as the value of the tuning parameter increases. An important step in the modeling procedure is the selection of an appropriate value of this tuning parameter.61 The density of the random effect parameters is represented by a multivariate normal distribution multiplied by a polynomial.

Another useful feature of the SNP approach is that it computes the integral present in the population likelihood by quadrature, which obviates the linearization approximation to the likelihood used in the NONMEM approach. Unlike the NPML approach, standard errors can be computed for the model parameters and used for inference.

The SNP approach is implemented in a public domain FORTRAN program called NLMIX. Experience with this approach is still very limited, and only a few simulations have evaluated the ability of the method to reveal multiple modes in the random effects density under conditions likely to be encountered in practice.

A method similar to the SNP approach was proposed by Fattinger et al.64 to explore the complete distribution of interindividual effects using the FOCE approach in the NONMEM program. The method uses a monotone non-decreasing spline to transform the normally distributed interindividual random effects. The interindividual random effects model is given as:

$$\phi_i = g(\theta, x_i) + \mathrm{sp}(\eta_i) \qquad \text{(Eq. 16)}$$

where $\mathrm{sp}(\cdot)$ represents a monotone non-decreasing spline of which the parameters are estimated. Because splines are not multivariate, a different spline is used for each of the elements of $\eta_i$. The spline function transformation is very flexible and allows appropriate representations of skewed, heavily tailed, or multimodal distributions.

Summary

Thus far, the principles that serve as the foundation and the methods for PPK model estimation have been presented. These concepts are important so that the application of PPK will be executed in an informed manner. The current article serves as a bridge to the final PPK tutorial paper, which will address application of PPK modeling with informative examples.

Ene I Ette MSc PhD FCP FCCP, Senior Director of Clinical Pharmacology, Vertex Pharmaceuticals, Inc., Cambridge, MA
Paul J Williams PharmD MS FCP FCCP, Professor of Pharmacy, Department of Pharmacy Practice, School of Pharmacy, University of the Pacific; Trials by Design, LLC, Stockton, CA
Reprints: Ene I Ette MSc PhD FCP FCCP, Vertex Pharmaceuticals, Inc., 130 Waverly St., Cambridge, MA 02139-4242, fax 617/444-6713, Ene_Ette@vpharm.com

References

1. Ette EI, Williams PJ. Population pharmacokinetics I: background, concepts, and models. Ann Pharmacother 2004;38:1702-6. DOI 10.1345/aph.1D374
2. Levy G, Hollister LE. Inter- and intrasubject variations in drug absorption kinetics. J Pharm Sci 1964;53:1446-52.
3. Martin E, Moll W, Schmid P, Dettli L. Problems and pitfalls in estimating average pharmacokinetic parameters. Eur J Clin Pharmacol 1984;26:595-602.
4. Cocchetto DM, Wargin WA, Crow JW. Pitfalls and valid approaches to pharmacokinetic analysis of mean concentration data following intravenous administration. J Pharmacokinet Biopharm 1980;8:539-52.
5. Sheiner LB, Beal SL. Evaluation of methods for estimating population pharmacokinetic parameters. I. Michaelis–Menten model: routine clinical data. J Pharmacokinet Biopharm 1980;8:553-71.
6. Sheiner LB, Beal SL. Evaluation of methods for estimating population pharmacokinetic parameters. II. Biexponential model and experimental pharmacokinetic data. J Pharmacokinet Biopharm 1981;9:635-51.
7. Sheiner LB, Beal SL. Evaluation of methods for estimating population pharmacokinetic parameters. III. Monoexponential model and routine clinical data. J Pharmacokinet Biopharm 1983;11:303-19.
8. Sheiner LB, Beal SL. Estimation of pooled pharmacokinetic parameters describing populations. In: Endrenyi L, ed. Kinetic data analysis. New York: Plenum Press, 1981:271-84.
9. Fluhler H, Huber H, Widmer E, Brechbuhler S. Experiences in the application of NONMEM to pharmacokinetic data analysis. Drug Metab Rev 1984;15:317-39.
10. Dyck JB, Haack DL, Azarnoff L, Vuorilehto L, Shafer SL. Computer-controlled infusion of intravenous dexmedetomidine hydrochloride in adult human volunteers. Anesthesiology 1993;78:821-8.
11. Shafer SL, Varvel JR, Aziz N, Scott JC. Pharmacokinetics of fentanyl administered by computer-controlled infusion pump. Anesthesiology 1990;73:1091-102.
12. Gustafsson LL, Ebling WF, Osaki E, Harapat S, Stanski DR, Shafer SL. Plasma concentration clamping in the rat using a computer-controlled infusion pump. Pharm Res 1992;9:800-7.
13. Kataria BK, Ved SA, Nicodemus HF, Lea D, Dubois MY, Mandema JW, et al. The pharmacokinetics of propofol in children using three different data analysis approaches. Anesthesiology 1994;80:104-22.
14. Egan TD, Lemmens HJ, Fiset P, Hermann DJ, Muir KT, Stanski DR, et al. The pharmacokinetics of the new short-acting opioid remifentanil (GI87084B) in healthy adult male volunteers. Anesthesiology 1993;79:881-92.
15. Rodda BE. Analysis of sets of estimates from pharmacokinetic studies. In: Endrenyi L, ed. Kinetic data analysis. New York: Plenum Press, 1981:285-97.
16. Steimer JL, Mallet A, Golmard JL, Boisvieux JF. Alternative approaches to estimation of population pharmacokinetic parameters: comparison with nonlinear mixed effects model. Drug Metab Rev 1984;15:265-92.
17. Sheiner LB, Beal SL. A note on confidence intervals with extended least squares parameter estimates. J Pharmacokinet Biopharm 1987;15:93-8.
18. White DB, Walawander CA, Tung Y, Grasela TH. An evaluation of point and interval estimates in population pharmacokinetics using NONMEM analysis. J Pharmacokinet Biopharm 1991;19:87-112.
19. Prevost G. Estimation of a normal probability density function from samples measured with non-negligible and non-constant dispersion, Internal Report 6-77, Anders-Gerbios, F-91120 Palaiseau, France, 1977.
20. Forrest A, Ballow CH, Nix DE, Birmingham MC, Schentag JJ. Development of a population pharmacokinetic model and optimal sampling strategy for intravenous ciprofloxacin in seriously ill patients. Antimicrob Agents Chemother 1993;37:1065-72.
21. Jelliffe RW, Schumitzky A, Van Guilder M. User manual for the nonparametric EM program for population modeling, version 2.17. Los Angeles: Laboratory for Applied Pharmacokinetics, USC School of Medicine, December 15, 1993.
22. Mentre F, Gomeni R. A two-step iterative algorithm for estimation in nonlinear mixed-effect models with an evaluation in population pharmacokinetics. J Biopharm Stat 1995;5:141-58.
23. P-PHARM user’s guide, version 1.3, Créteil, France: SIMED, 1994.
24. Racine-Poon A. A Bayesian approach to nonlinear random effects models. Biometrics 1985;41:1015-23.
25. Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via EM algorithm. J Roy Stat Soc B 1977;39:1-38.
26. Racine A, Grieve AP, Fluhler H, Smith AFM. Bayesian methods in practice: experiments in the pharmaceutical industry. Appl Stat 1986;35:93-100.
27. Sheiner LB, Rosenberg B, Melmon KL. Modeling of individual pharmacokinetics for computer-aided drug dosage. Comp Biomed Res 1972;5:441-59.
28. Sheiner LB. The population approach to pharmacokinetic data analysis: rationale and standard data analysis methods. Drug Metab Rev 1984;15:153-71.
29. Beal SL. Population pharmacokinetic data and parameter estimation based on their first two statistical moments. Drug Metab Rev 1984;15:173-93.
30. Whiting B, Kelman AW, Grevel J. Population pharmacokinetics: theory and clinical application. Clin Pharmacokinet 1986;11:387-401.
31. Sheiner LB, Beal SL. Bayesian individualization of pharmacokinetics: simple implementation and comparison with non-Bayesian methods. J Pharm Sci 1982;71:1344-8.
32. Ette EI, Kelman AW, Howie CA, Whiting B. Analysis of animal pharmacokinetic data: performance of the one point per animal design. J Pharmacokinet Biopharm 1995;23:551-66.
33. Ette EI, Sun H, Ludden TM. Balanced designs and longitudinal population pharmacokinetic studies. J Clin Pharmacol 1998;38:417-23.
34. Grasela TH Jr, Antal EJ, Townsend RJ, Smith RB. An evaluation of population pharmacokinetics in therapeutic trials. Part I. Comparison of methodologies. Clin Pharmacol Ther 1986;39:605-12.
35. Collart L, Blaschke TF, Boucher F, Prober CG. Potential of population pharmacokinetics to reduce the frequency of blood sampling required for estimating kinetic parameters in neonates. Dev Pharmacol Ther 1992;18:71-80.
36. Aarons L, Mandema JW, Danhof M. A population analysis of the pharmacokinetics and pharmacodynamics of midazolam in the rat. J Pharmacokinet Biopharm 1991;19:485-96.
37. Kaniwa N, Aoyagi N, Ogata H, Ishi M. Application of the NONMEM method to evaluation of the bioavailability of drug products. J Pharm Sci 1990;79:1116-20.
38. Pai SM, Shukla UA, Grasela TH, Knupp CA, Dolin R, Valentine FT, et al. Population pharmacokinetic analysis of didanosine (2’,3’-dideoxyinosine) plasma concentration obtained in phase I clinical trials in patients with AIDS or AIDS-related complex. J Clin Pharmacol 1992;32:242-7.
39. Jones CD, Sun H, Ette EI. Designing cross-sectional population pharmacokinetic studies: implications for pediatric and animal studies. Clin Res Reg Affairs 1996;13:133-65.
40. Racine A, Grieve AP, Fluhler H, Smith AFM. Bayesian methods in practice: experiences in pharmaceutical industry. Appl Stat 1986;35:1-38.
41. Ette EI, Kelman AW, Howie CA, Whiting B. Influence of interanimal variability on the estimation of population pharmacokinetic parameters in preclinical studies. Clin Res Reg Affairs 1994;11:121-39.
42. Ette EI, Howie CA, Kelman AW, Whiting B. Experimental design and efficient parameter estimation in preclinical pharmacokinetic studies. Pharm Res 1995;12:729-37.
43. Karlsson MO, Sheiner LB. The importance of modeling interoccasion variability in population pharmacokinetic analyses. J Pharmacokinet Biopharm 1993;21:735-50.
44. Beal SL, Sheiner LB. NONMEM users guide—part VII. Conditional estimation methods. San Francisco: University of California, 1992.
45. Lindstrom MJ, Bates DM. Nonlinear mixed effects models for repeated measures data. Biometrics 1990;46:673-87.
46. S-PLUS. Seattle: Insightful, 2002.
47. Vonesh EF, Carter RL. Mixed-effects nonlinear regression for unbalanced repeated measures. Biometrics 1992;46:673-87.
48. Vonesh EF. Nonlinear models for the analysis of longitudinal data. Stat Med 1992;11:1929-54.
49. Wolfinger RD. Laplace’s approximation for nonlinear mixed models. Biometrika 1993;80:791-5.
50. Mallet A. A maximum likelihood estimation method for random coefficient regression models. Biometrika 1986;73:645-56.
51. Steimer JL, Mallet A, Mentre F. Estimating interindividual pharmacokinetic variability. In: Rowland M, Sheiner L, Steimer JL, eds. Variability in drug therapy: description, estimation, and control. New York: Raven Press, 1985:65-111.
52. Mentre F, Mallet A. Experiences with NPML—application to dosage individualisation of cyclosporine, gentamicin and zidovudine. In: Rowland M, Aarons L, eds. The population approach. Luxembourg: Commission of European Communities, 1992:75-90.
53. Mallet A, Mentre F, Gilles J, Kelman AW, Thomson AN, Bryson SM, et al. Handling covariates in population pharmacokinetics with an application to gentamicin. Biomed Meas Inform Contr 1988;2:673-83.
54. Mallet A, Mentre F, Steimer JL, Lokiec F. Nonparametric maximum likelihood estimation for population pharmacokinetics, with application to cyclosporine. J Pharmacokinet Biopharm 1988;16:529-56.
55. Mentre F, Escolano S, Diquet B, Golmard JL, Mallet A. Clinical pharmacokinetics of zidovudine: inter and intraindividual variability and relationship to long term efficacy and toxicity. Eur J Clin Pharmacol 1993;45(5):397-407.
56. Sidhu JS, Ashton M, Huong NV, Hai TN, Karlsson MO, Sy ND, et al. Artemisinin population pharmacokinetics in children and adults with uncomplicated falciparum malaria. Br J Clin Pharmacol 1998;45:347-54.
57. Hossain M, Wright E, Baweja R, Ludden TM, Miller R. Nonlinear mixed effects modeling of single dose and multiple dose data for an immediate release (IR) and controlled release (CR) dosage form of alprazolam. Pharm Res 1997;14:309-15.
58. Schumitzky A. Nonparametric EM algorithms for estimating prior distributions. Appl Math Comput 1991;45:141-57.
59. Dodge WF, Jelliffe RW, Richardson CJ, McCleery RA, Hokanson JA, Snodgrass WR. Gentamicin population pharmacokinetic models for low birth weight infants using a new nonparametric method. Clin Pharmacol Ther 1991;50:25-31.
60. Kisor D, Watling S, Zarowitz B, Jelliffe RW. Population pharmacokinetics of gentamicin in patients with indicators of malnutrition: the use of a
new nonparametric expectation maximization (NPEM) algorithm. Clin Pharmacokinet 1992;23:62-8.
61. Davidian M, Gallant AR. Smooth nonparametric maximum likelihood estimation for population pharmacokinetics, with application to quinidine. J Pharmacokinet Biopharm 1992;20:529-56.
62. Davidian M, Gallant AR. The nonlinear mixed effects model with a smooth random effects density, Institute of Statistics Mimeo Series No. 2206. Raleigh, NC: North Carolina State University, 1992.
63. Gallant AR, Nychka DW. Semi-nonparametric maximum likelihood estimation. Econometrica 1987;55:363-90.
64. Fattinger KE, Sheiner LB, Verotta D. A new method to explore the distribution of interindividual random effects in nonlinear mixed effects models. Biometrics 1995;51:1236-51.

EXTRACTO (Spanish abstract)
OBJECTIVE: To present, compare, and contrast various methods for estimating population pharmacokinetic (PPK) models with respect to mathematical foundation, statistical aspects, software for implementation, and underlying assumptions. This article serves as a continuation of the previous one (part I) and as an introduction to and foundation for the article on application that follows (part III).
DATA SOURCES: Information on PPK was obtained from MEDLINE (1979–June 2002) and from the bibliographies of review articles and books. This information was used together with experience to explain the various methodologic approaches used with PPK.
DATA SYNTHESIS: Various methods for estimating PPK are evaluated, ranging from the naïve averaging and pooled-estimate approaches to the standard two-stage approaches and nonlinear mixed-effects modeling methods. The advantages and limitations of the various models are compared.
CONCLUSIONS: PPK estimation methods that depend on the characterization of mixed (fixed and random) effects produce PPK parameter estimates that are less biased than those obtained using the naïve or two-stage methods. The NONMEM software is the most widely used for characterizing population pharmacokinetic parameters.
Christina Dalmady-Israel

RÉSUMÉ (French abstract)
OBJECTIVE: To present, compare, and contrast the various approaches for estimating population pharmacokinetic parameters with respect to their mathematical bases, statistical aspects, software for implementation, and underlying assumptions.
METHODS: Information on population pharmacokinetics was identified through a MEDLINE search (January 1979 to June 2002) and a bibliography of review articles and books. This information is used together with experience to explain the various methodologic approaches to population pharmacokinetics.
SUMMARY: Different estimation methods are reviewed, with their advantages and limitations, from the naïve averaging and naïve pooled data approaches through the two-stage approaches to the nonlinear mixed-effects modeling approaches.
CONCLUSIONS: Population pharmacokinetic estimation methods that rely on the characterization of mixed (fixed and random) effects are known to produce population pharmacokinetic parameter estimates that are less biased than those obtained with the naïve data approaches or the standard two-stage approach. The NONMEM software is the most widely used for the characterization of population pharmacokinetics.
Bruno Edouard