Combining Domain Knowledge and Statistical Models in Time Series Analysis

This paper presents a novel approach to time series modeling that integrates domain knowledge with statistical techniques, particularly in the context of American option pricing and ecological data analysis. The authors discuss the roles of empirical and substantive models, proposing a basis function approach that combines these elements for improved modeling accuracy. Applications to the Canadian lynx data and option pricing illustrate the effectiveness of this combined methodology.

IMS Lecture Notes–Monograph Series

Time Series and Related Topics

Vol. 52 (2006) 193–209
© Institute of Mathematical Statistics, 2006
DOI: 10.1214/074921706000001049

Combining domain knowledge and statistical models in time series analysis

arXiv:math/0702814v1 [math.ST] 27 Feb 2007

Tze Leung Lai¹ and Samuel Po-Shing Wong²

Stanford University and The Chinese University of Hong Kong

Abstract: This paper describes a new approach to time series modeling that
combines subject-matter knowledge of the system dynamics with statistical
techniques in time series analysis and regression. Applications to American
option pricing and the Canadian lynx data are given to illustrate this approach.

1. Introduction

In their Fisher Lectures at the Joint Statistical Meetings, Cox [11] and Lehmann
[31] mentioned two major types of stochastic models in statistical analysis, namely,
empirical and substantive (or mechanistic). Whereas substantive models are
explanatory and related to subject-matter theory on the mechanisms generating the
observed data, empirical models are interpolatory and aim to represent the observed
data as a realization of a statistical model chosen largely for its flexibility,
tractability and interpretability but not on the basis of subject-matter knowledge. Cox [11]
also mentioned a third type of stochastic models, called indirect models, that are
used to evaluate statistical procedures or to suggest methods for analyzing
complex data (such as hidden Markov models in image analysis). He noted, however,
that the distinctions between the different types of models are important mostly
when formulating and checking them but that these types are not rigidly defined,
since “quite often parts of the model, e.g., those representing systematic variation,
are based on substantive considerations with other parts more empirical.” In this
paper, we elaborate further the complementary roles of empirical and substantive
models in time series analysis and describe a basis function approach to combining
subject-matter (domain) knowledge with statistical modeling techniques.
This basis function approach was first developed in [29] for the valuation of
American options. In Sections 2 and 3 we review the statistical and subject-matter
models for option pricing in the literature as examples of empirical and substantive
models in time series analysis. Section 4 describes a combined substantive-empirical
approach via basis functions, in which the substantive component is associated with
basis functions of a certain form, and the empirical component uses flexible and
computationally convenient basis functions such as regression splines. The work
of Lai and Wong [29] on option pricing and recent related work in financial time
series are reviewed to illustrate this approach. Section 5 applies this approach to a
widely studied data set in the nonlinear time series literature, namely, the Canadian
lynx data set that records the annual numbers of Canadian lynx trapped in the
Mackenzie River district from 1821 to 1934. We use substantive models from the
ecology literature together with multivariate adaptive regression splines to come up
with a new time series model for these data. Some concluding remarks are given in
Section 6.

¹ Department of Statistics, Stanford University, Stanford, CA 94305, U.S.A., e-mail: lait@stat.stanford.edu
² Department of Statistics, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong, e-mail: samwong@sta.cuhk.edu.hk
AMS 2000 subject classifications: primary 62M10, 62M20; secondary 62P05, 62P10.
Keywords and phrases: time series analysis, domain knowledge, empirical models, mechanistic
models, combined substantive-empirical approach, basis function.

2. Statistical (empirical) time series models

The development of statistical time series models in the past fifty years has
witnessed a remarkable confluence of basic ideas from various areas in statistics and
probability, coupled with the powerful influence from diverse fields of applications
ranging from economics and finance to signal processing and control systems. The
first phase of this development was concerned with stationary time series, leading to
MA (moving average), AR (autoregressive) and ARMA representations in the time
domain and transfer function representations in the frequency domain. This was
followed by extensions to nonstationary time series, either by fitting (not necessarily
stationary) ARMA models or by the Box-Jenkins approach involving the ARIMA
(autoregressive integrated moving average) models and their seasonal SARIMA
counterparts. More general fractional differencing then led to the ARFIMA models.
The next phase of the development was concerned with nonlinear time series
models, beginning with bilinear models that add cross-product terms
$y_{t-i}\epsilon_{t-j}$ to the usual ARMA model
$y_t = \beta_1 y_{t-1} + \cdots + \beta_p y_{t-p} + \epsilon_t + c_1 \epsilon_{t-1} + \cdots + c_q \epsilon_{t-q}$,
and threshold autoregressive and regime switching models that introduce
nonlinearities into the usual autoregressive models via state-dependent changes or
Markov jumps in the autoregressive parameters. The monograph by Tong [44] summarized
these and other nonlinear time series models in the previous literature. The
appropriateness of the parametric forms assumed in these nonlinear time series models,
however, may be difficult to justify in real applications, as pointed out by Chen and
Tsay [9].
Whereas the AR model $y_t = \beta_1 y_{t-1} + \cdots + \beta_p y_{t-p} + \epsilon_t$ is related to
linear regression since $\beta^T x_t$ is the regression function $E(y_t \mid x_t)$ of $y_t$ given
$x_t := (y_{t-1}, \ldots, y_{t-p})^T$, and likewise its nonlinear parametric extensions
$y_t = f(x_t, \beta) + \epsilon_t$ are related to nonlinear regression, Chen and Tsay [9, 10]
proposed to use nonparametric regression for $E(y_t \mid x_t)$ instead. They started with
functional-coefficient autoregressive (FAR) models of the form
$y_t = f_1(x_t^*) y_{t-1} + \cdots + f_p(x_t^*) y_{t-p} + \epsilon_t$, where $f_1, \ldots, f_p$ are
unspecified functions to be estimated by local linear regression and
$x_t^* = (y_{t-i_1}, \ldots, y_{t-i_d})^T$ with $i_1 < \cdots < i_d$ chosen from $\{1, \ldots, p\}$.
Because of sparse data in high dimensions, local linear regression typically requires
$d$ to be 1 or 2. To deal with nonparametric regression in higher dimensions, they
considered additive autoregressive models of the form
$y_t = f_1(y_{t-i_1}) + \cdots + f_d(y_{t-i_d}) + \epsilon_t$, in which the $f_i$ can be estimated
nonparametrically via the generalized additive model (GAM) of Hastie and
Tibshirani [19]. Making use of Friedman’s [15] multivariate adaptive regression
splines (MARS), Lewis and Stevens [34] and Lewis and Ray [32, 33] developed
spline models for empirical modeling of time series data. Weigend, Rumelhart
and Huberman [48] and Weigend and Gershenfeld [47] proposed to use neural
networks (NN) to model $E(y_t \mid x_t)$, while Lai and Wong [28] considered a variant called
stochastic neural networks, for which they could use the EM algorithm to develop
efficient estimation procedures that have much lower computational complexity
than those for conventional neural networks.
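
The link between the AR model and linear regression noted above can be made concrete with a short sketch. It regresses $y_t$ on $x_t = (y_{t-1}, \ldots, y_{t-p})^T$ by ordinary least squares. The function names, the simulated AR(2) coefficients and the sample size are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def lagged_design(y, p):
    """Return (X, target) with rows x_t = (y_{t-1}, ..., y_{t-p}) and target y_t."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Column i-1 holds lag i, i.e. y_{t-i} for t = p, ..., n-1.
    X = np.column_stack([y[p - i : n - i] for i in range(1, p + 1)])
    return X, y[p:]


def fit_ar(y, p):
    """Least-squares estimate of beta in y_t = beta^T x_t + eps_t."""
    X, target = lagged_design(y, p)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return beta


def simulate_ar(beta, n, rng, burn_in=200):
    """Simulate a zero-mean AR(p) series driven by standard normal noise."""
    p = len(beta)
    y = np.zeros(n + burn_in + p)
    eps = rng.standard_normal(len(y))
    for t in range(p, len(y)):
        # y[t-p:t][::-1] is (y_{t-1}, ..., y_{t-p}).
        y[t] = np.dot(beta, y[t - p : t][::-1]) + eps[t]
    return y[-n:]


rng = np.random.default_rng(0)
true_beta = np.array([1.0, -0.5])  # hypothetical stationary AR(2) coefficients
series = simulate_ar(true_beta, n=5000, rng=rng)
beta_hat = fit_ar(series, p=2)
print("estimated coefficients:", np.round(beta_hat, 3))
```

Replacing the least-squares step by a nonparametric smoother of the target on the same lagged design is, in spirit, what the FAR, additive, MARS and neural network models above do.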
The preceding time series models are autonomous, relating the dynamics of $y_t$ to
the past states. In econometrics and engineering, the outputs $y_t$ are related not only
to the past outputs but also to the past inputs $u_{t-d}, \ldots, u_{t-k}$. Therefore the AR
model has been extended to the ARX model (where X stands for exogenous inputs)
$y_t = \beta^T x_t + \epsilon_t$ with $x_t = (y_{t-1}, \ldots, y_{t-p}, u_{t-d}, \ldots, u_{t-k})^T$. Instead of assuming a
linear or nonlinear parametric regression model, one can use nonparametric
regression to estimate $E(y_t \mid x_t)$, as in the following financial application.
Example 1. As noted by Ross [40], option pricing theory is “the most successful
theory not only in finance, but in all of economics.” A call (put) option gives the
holder the right to buy (sell) the underlying asset (e.g. stock) by a certain date
$T$ (known as the “expiration date” or “maturity”) at a certain price (known as
the “strike price” and denoted by $K$). European options can be exercised only on
the expiration date, whereas American options can be exercised at any time up to
the expiration date. The celebrated Black-Scholes theory, which will be reviewed in
Section 3, yields the following pricing formulas for the prices $c_t$ and $p_t$ of European
call and put options at time $t \in [0, T)$:

$$c_t = S_t e^{-d(T-t)} \Phi(d_1(S_t, K, T-t)) - K e^{-r(T-t)} \Phi(d_2(S_t, K, T-t)), \tag{2.1}$$
$$p_t = K e^{-r(T-t)} \Phi(-d_2(S_t, K, T-t)) - S_t e^{-d(T-t)} \Phi(-d_1(S_t, K, T-t)), \tag{2.2}$$

where $\Phi$ is the cumulative distribution function of the standard normal random
variable, $S_t$ is the price of the underlying asset at time $t$, $d$ is the dividend rate of the
underlying asset, $r$ is the risk-free interest rate, $\sigma$ is the volatility of the asset price,
$d_1(x, y, v) = \{\log(x/y) + (r - d + \sigma^2/2)v\}/(\sigma\sqrt{v})$ and
$d_2(x, y, v) = d_1(x, y, v) - \sigma\sqrt{v}$. Hutchinson, Lo and Poggio [22] pointed out that the success of
the formulas (2.1) and (2.2) depends heavily on the specification of the dynamics
of $S_t$. Instead of using any particular model of $S_t$, they proposed a data-driven way
for pricing and hedging with a minimal assumption: independent increments of the
underlying asset price. Noting that $y_t$ ($= c_t$ or $p_t$) is a function of $S_t/K$ and $T - t$
with $r$ and $\sigma$ being constant, they assume $y_t = K f(S_t/K, T - t)$ and approximate
$f$ by taking $x_t = (S_t/K, T - t)^T$ in the following models:
(i) radial basis function (RBF) networks
$f(x) = \beta_0 + \alpha^T x + \sum_{i=1}^{I} \beta_i h_i(\|A(x - \gamma_i)\|)$,
where $A$ is a positive definite matrix and $h_i$ is of the RBF type $e^{-u^2/\sigma_i^2}$
or $(u^2 + \sigma_i^2)^{1/2}$;
(ii) neural networks $f(x) = \psi(\beta_0 + \sum_{i=1}^{I} \beta_i h(\gamma_i + \alpha_i^T x))$,
where $h(u) = 1/(1 + e^{-u})$ is the logistic function and $\psi$ is either the identity
function or the logistic function;
(iii) projection pursuit regression (PPR) networks
$f(x) = \beta_0 + \sum_{i=1}^{I} \beta_i h_i(\alpha_i^T x)$,
where $h_i$ is an unspecified function that is estimated from the data by PPR.
The $\alpha_i$, $\beta_i$ and $\gamma_i$ above are unknown parameters of the network that are to be
estimated from the data. As pointed out in [22], all three classes of networks have
some form of “universal approximation property”, which means that their approximation
bounds do not depend on the dimensionality of the predictor variable $x$; see [2]. It
should be noted that the above transformation of $S_t$ to $S_t/K$ can be motivated not
only from the assumption on $S_t$ but also from the special feature of options data.
Although the strike price $K$ could be any positive number theoretically, the options
exchange only sets strike prices at a multiple of a fundamental unit. For example,
the Chicago Board Options Exchange (CBOE) sets strike prices at multiples of \$5 for
stock prices in the \$25 to \$200 range. Also, only those options with strike prices
closest to the current stock price are traded and thus their prices are observed. Since
$S_t$ is non-stationary in general, the observed $K$ is also non-stationary. Such features
create sparsity of data in the space of $(S_t, K, T - t)$. Training the options pricing
formula in the form of $f(S_t, K, T - t)$ can only interpolate the data and can hardly
produce any good prediction because $(S_t, K)$ in the future can be very different
from the data used in estimating $f$. The proposed transformation makes use of
the fact that all observed and future $S_t/K$ are close to 1. Therefore, the proposed
transformation captures the stationary structure of the data and enables the
nonparametric models to predict well. Another point that Hutchinson, Lo and Poggio
[22] highlighted is the measure of performance of the estimated pricing formula.
According to their simulation study, even a linear $f(S_t/K, T - t)$ can give $R^2 \approx 90\%$
(Table I of Hutchinson, Lo and Poggio [22]). However, such a linear $f$ implies a
constant delta hedging scheme which would provide poor hedging results. Since the
primary function of options is hedging the risk created by changes in the price of the
underlying asset, Hutchinson, Lo and Poggio [22] suggested using, instead of $R^2$, the
hedging error measures $\xi = e^{-rT} E[|V(T)|]$ and $\eta = e^{-rT} [E V^2(T)]^{1/2}$, where $V(T)$
is the value of the hedged portfolio at time $T$. In a perfect Black-Scholes world,
$V(T)$ should be 0 if the Black-Scholes formula is used. However, in the simulation
study, the Black-Scholes formulas still give $\xi > 0$ and $\eta > 0$ because time is discrete.
Hutchinson, Lo and Poggio [22] reported that RBF, NN and PPR all give hedging
measures comparable to those of the Black-Scholes formula in the simulation study. For
real data analysis of futures options, RBF, NN and PPR performed better than the
Black-Scholes formula in terms of hedging.
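
As a sketch of formulas (2.1) and (2.2), the following code evaluates the European call and put prices and checks two properties numerically: put-call parity, and the homogeneity in $(S_t, K)$ that underlies the normalization $y_t = K f(S_t/K, T - t)$. The function names and the numerical inputs are illustrative assumptions.

```python
from math import exp, log, sqrt
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution function


def d1(x, y, v, r, d, sigma):
    return (log(x / y) + (r - d + 0.5 * sigma**2) * v) / (sigma * sqrt(v))


def d2(x, y, v, r, d, sigma):
    return d1(x, y, v, r, d, sigma) - sigma * sqrt(v)


def european_call(S, K, tau, r, d, sigma):
    """Formula (2.1), with tau = T - t the time to expiration."""
    return (S * exp(-d * tau) * PHI(d1(S, K, tau, r, d, sigma))
            - K * exp(-r * tau) * PHI(d2(S, K, tau, r, d, sigma)))


def european_put(S, K, tau, r, d, sigma):
    """Formula (2.2), with tau = T - t the time to expiration."""
    return (K * exp(-r * tau) * PHI(-d2(S, K, tau, r, d, sigma))
            - S * exp(-d * tau) * PHI(-d1(S, K, tau, r, d, sigma)))


# Hypothetical inputs chosen only for illustration.
S, K, tau, r, d, sigma = 100.0, 95.0, 0.5, 0.05, 0.02, 0.2
call = european_call(S, K, tau, r, d, sigma)
put = european_put(S, K, tau, r, d, sigma)

# Put-call parity: c - p = S e^{-d tau} - K e^{-r tau}.
parity_gap = (call - put) - (S * exp(-d * tau) - K * exp(-r * tau))
# Homogeneity: c(S, K) = K * c(S/K, 1), so the price depends on S only through S/K.
scaling_gap = call - K * european_call(S / K, 1.0, tau, r, d, sigma)
print(f"call={call:.4f} put={put:.4f} parity_gap={parity_gap:.2e} scaling_gap={scaling_gap:.2e}")
```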
For American options, instead of using these learning networks to approximate
the option price, Broadie et al. [5] used kernel smoothers to estimate the option
pricing formula of an American option. Using a training sample of daily closing
prices of American calls on the S&P100 Index that were traded on the Chicago
Board Options Exchange from 3 January 1984 to 30 March 1990, they compared the
nonparametric estimates of American call option prices at a set of $(S/K, t^*)$ values
with corresponding parametric estimates obtained by using the approximations to
American option prices due to Broadie and Detemple [4], and found significant
differences between the parametric and nonparametric estimates.

3. Substantive (mechanistic) models

In control engineering, the dynamics of linear input-output systems are often given
by ordinary differential equations, whose discrete-time approximations in the
presence of noise have led to the ARX models (for white noise), and ARMAX models
(for colored noise) in the preceding section. The problem of choosing the inputs
sequentially so that the outputs are as close as possible to some target values when
the model parameters are unknown and have to be estimated on-line has a large
literature under the rubric of stochastic adaptive control; see Goodwin, Ramadge
and Caines [16], Lai and Wei [27], Lai and Ying [30] and Guo and Chen [17]. More
general dynamics in the presence of additive noise have led to stochastic differential
equations (SDEs), whose discrete-time approximations are related to nonlinear
time series models described in the preceding section. One such SDE is geometric
Brownian motion (GBM) for the asset price process in the Black-Scholes option
pricing theory. In view of Ito’s formula, the GBM dynamics for the asset price $S_t$
translate into SDE dynamics for the option price $f(t, S_t)$. Such implied dynamics
from the mechanistic model can be combined with subject-matter theory to derive
the functional form or differential equation for f and other important corollaries of
the theory, as illustrated in the following.
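Example 2. In the Black-Scholes model, the asset price $S_t$ is assumed to be GBM
defined by the SDE

$$dS_t / S_t = \mu\,dt + \sigma\,dw_t, \tag{3.1}$$

where $w_t$, $t \ge 0$, is Brownian motion. Letting $f(t, S)$ be the price of the option at
time $t$ when $S_t = S$, it follows from (3.1) and Ito’s formula that

$$
\begin{aligned}
df(t, S_t) &= \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial S}\,dS_t + \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2 S_t^2\,dt \\
&= \Big( \frac{\partial f}{\partial t} + \mu S_t \frac{\partial f}{\partial S} + \frac{1}{2}\sigma^2 S_t^2 \frac{\partial^2 f}{\partial S^2} \Big) dt + \sigma S_t \frac{\partial f}{\partial S}\,dw_t .
\end{aligned}
$$

For simplicity assume that the asset does not pay dividends, i.e., $d = 0$. Consider
an option writer’s portfolio at time $t$, consisting of $-1$ option and $y_t$ units of the
asset. The value of the portfolio $\pi_t$ is $-f(t, S_t) + y_t S_t$ and therefore

$$
d\pi_t = -\Big( \frac{\partial f}{\partial t} + \mu S_t \frac{\partial f}{\partial S} + \frac{1}{2}\sigma^2 S_t^2 \frac{\partial^2 f}{\partial S^2} - \mu y_t S_t \Big) dt + \sigma S_t \Big( y_t - \frac{\partial f}{\partial S} \Big) dw_t .
$$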
Hence setting $y_t = \partial f/\partial S$ yields a risk-free portfolio. This is the basis of delta
hedging in the options theory of Black and Scholes [3], who denote $\partial f/\partial S$ by $\Delta$.
Besides GBM dynamics for the asset price, the Black-Scholes theory also assumes
that there are no transaction costs and no limits on short selling and that trading
can take place continuously so that delta hedging is feasible. Since economic theory
prescribes absence of arbitrage opportunities in equilibrium, the portfolio $\pi_t$ that
consists of $-1$ option and $\Delta$ units of the asset should earn the risk-free return, i.e.,
$d\pi_t = r\pi_t\,dt = r(-f + S_t\Delta)\,dt$, yielding the Black-Scholes PDE for $f$:

$$\frac{\partial f}{\partial t} + rS\frac{\partial f}{\partial S} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 f}{\partial S^2} = rf, \quad 0 \le t < T, \tag{3.2}$$

with the boundary condition $f(T, S) = g(S)$, where $g(S) = (K - S)_+$ for a put
option, and $g(S) = (S - K)_+$ for a call option, where $x_+ = \max(x, 0)$. This PDE
has the explicit solution (2.1) or (2.2) with $d = 0$. If the asset pays dividend at rate
$d$, then a modification of the preceding argument yields (3.2) in which $rS(\partial f/\partial S)$
is replaced by $(r - d)S(\partial f/\partial S)$.
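
The statement that the Black-Scholes call price solves the PDE (3.2) when $d = 0$ can be checked numerically. The sketch below approximates the partial derivatives by central finite differences and evaluates the residual of (3.2); it also approximates the hedge ratio $\Delta = \partial f/\partial S$. The step sizes, function names and parameter values are assumptions made for illustration.

```python
from math import exp, log, sqrt
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution function


def call_price(t, S, K, T, r, sigma):
    """Black-Scholes call price f(t, S) for a non-dividend-paying asset (d = 0)."""
    tau = T - t
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return S * PHI(d1) - K * exp(-r * tau) * PHI(d2)


def delta_fd(t, S, K, T, r, sigma, h_s=1e-2):
    """Central-difference approximation of Delta = df/dS."""
    up = call_price(t, S + h_s, K, T, r, sigma)
    down = call_price(t, S - h_s, K, T, r, sigma)
    return (up - down) / (2 * h_s)


def pde_residual(t, S, K, T, r, sigma, h_t=1e-4, h_s=1e-2):
    """Left side minus right side of (3.2), using central differences."""
    f = lambda tt, ss: call_price(tt, ss, K, T, r, sigma)
    f_t = (f(t + h_t, S) - f(t - h_t, S)) / (2 * h_t)
    f_s = (f(t, S + h_s) - f(t, S - h_s)) / (2 * h_s)
    f_ss = (f(t, S + h_s) - 2 * f(t, S) + f(t, S - h_s)) / h_s**2
    return f_t + r * S * f_s + 0.5 * sigma**2 * S**2 * f_ss - r * f(t, S)


# Hypothetical evaluation point.
residual = pde_residual(t=0.25, S=105.0, K=100.0, T=1.0, r=0.05, sigma=0.2)
print("PDE residual:", residual)
print("finite-difference delta:", delta_fd(0.25, 105.0, 100.0, 1.0, 0.05, 0.2))
```

The residual is zero only up to discretization and rounding error, so it should be read as a consistency check rather than a proof.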
Merton [37] extended the Black-Scholes theory for pricing European options to
American options that can be exercised at any time prior to the expiration date.
Optimal exercise of the option is shown to occur when the asset price exceeds (or
falls below) an exercise boundary $\partial C$ for a call (or put) option. The Black-Scholes
PDE still holds in the continuation region $C$ of $(t, S_t)$ before exercise, and $\partial C$ is
determined by the free boundary condition $\partial f/\partial S = 1$ (or $-1$) for a call (or put)
option. Unlike the explicit formula (2.1) or (2.2) for European options, there is
no closed-form solution of the free-boundary PDE and numerical methods such as
finite differences are needed to compute American option prices under this theory.
By the Feynman-Kac formula, the PDE (3.2) has a probabilistic representation
$f(t, S) = E[e^{-r(T-t)} g(S_T) \mid S_t = S]$, and the expectation $E$ is with respect to the
“equivalent martingale measure” under which $dS_t/S_t = (r - d)\,dt + \sigma\,dw_t$. This
representation generalizes to American options as the value function of the optimal
stopping problem

$$f(t, S) = \sup_{\tau \in \mathcal{T}_{t,T}} E[e^{-r(\tau - t)} g(S_\tau) \mid S_t = S] \tag{3.3}$$
where $\mathcal{T}_{t,T}$ denotes the set of stopping times $\tau$ taking values between $t$ and $T$.
Cox, Ross and Rubinstein [12] proposed to approximate GBM by a binomial tree,
with root node $S_0$ at time 0, so that (3.3) can be approximated by a discrete-time
and discrete-state optimal stopping problem that can be solved by backward
induction. Denote $f(t, S)$ by $C(t, S)$ for an American call option, and by $P(t, S)$
for an American put option. Jacka [23] and Carr, Jarrow and Myneni [7] derived
the decomposition formula

$$
P(t, S) = p(t, S) + K\rho e^{\rho u} \int_u^0 \Big\{ e^{-\rho s}\, \Phi\Big( \frac{\bar z(s) - z}{\sqrt{s - u}} \Big) - \theta e^{-(\theta\rho s + u/2) + z}\, \Phi\Big( \frac{\bar z(s) - z}{\sqrt{s - u}} - \sqrt{s - u} \Big) \Big\}\, ds, \tag{3.4}
$$

and a similar formula relating $C(t, S)$ to $c(t, S)$, where $\bar z(u)$ is the early exercise
boundary $\partial C$ under the transformation

$$\rho = r/\sigma^2, \quad \theta = d/r; \quad u = \sigma^2(t - T), \quad z = \log(S/K) - (\rho - \theta\rho - 1/2)u. \tag{3.5}$$

Ju [24] found that the early exercise premium can be computed in closed form if
$\partial C$ is a piecewise exponential function, which corresponds to a piecewise linear $\bar z(u)$.
Under this assumption, Ju [24] reported numerical studies showing that his method
with 3 equally spaced pieces substantially improves previous approximations to
option prices in both accuracy and speed. AitSahlia and Lai [1] introduced the
transformation (3.5) to reduce GBM to Brownian motion and showed that $\bar z(u)$
is indeed well approximated by a piecewise linear function with a few pieces. The
integral obtained by differentiating that in (3.4) with respect to $S$ also has a
closed-form expression when $\bar z(\cdot)$ is piecewise linear, and approximating $\bar z(\cdot)$ by a linear
spline that uses a few unevenly spaced knots gives a fast and reasonably accurate
method for computing $\Delta = \partial P/\partial S$.
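
The binomial-tree approximation to the optimal stopping problem (3.3) mentioned above can be sketched as follows for an American put. The parametrization $u = e^{\sigma\sqrt{\Delta t}}$, down factor $1/u$ and risk-neutral up-probability is the usual textbook form of the Cox-Ross-Rubinstein tree and is assumed here rather than taken from the paper; the function name and inputs are likewise illustrative.

```python
import numpy as np


def crr_american_put(S0, K, T, r, d, sigma, n_steps):
    """American put price by backward induction on a recombining binomial tree."""
    dt = T / n_steps
    up = np.exp(sigma * np.sqrt(dt))
    down = 1.0 / up
    # Risk-neutral probability of an up move under drift r - d.
    q = (np.exp((r - d) * dt) - down) / (up - down)
    discount = np.exp(-r * dt)

    # Terminal payoffs; entry j corresponds to j up moves out of n_steps.
    j = np.arange(n_steps + 1)
    values = np.maximum(K - S0 * up**j * down ** (n_steps - j), 0.0)

    for step in range(n_steps - 1, -1, -1):
        j = np.arange(step + 1)
        prices = S0 * up**j * down ** (step - j)
        continuation = discount * (q * values[1 : step + 2] + (1 - q) * values[: step + 1])
        # Exercise early whenever the immediate payoff beats continuation.
        values = np.maximum(continuation, K - prices)
    return float(values[0])


# Hypothetical inputs for illustration.
price = crr_american_put(S0=100.0, K=100.0, T=1.0, r=0.05, d=0.0, sigma=0.2, n_steps=500)
print("American put (binomial tree):", round(price, 4))
```

The difference between this price and the European put price (2.2) is the early exercise premium that the integral in (3.4) represents.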
The Black-Scholes price involves the parameters r and σ, which need to be
estimated. The yield of a short-maturity Treasury bill is usually used for r. Although
in the GBM model for asset prices which are observed at fixed intervals of time
(e.g. daily), one can estimate σ by the standard deviation of historical (daily)
asset returns, which are i.i.d. normal under the GBM model for asset prices, there
are issues due to departures from this model (e.g., σ can change over time and
asset returns are markedly non-normal) and due to violations of the Black-Scholes
assumptions in the financial market (e.g., there are actually transaction costs and
limits on short selling). Section 13.4 and Chapter 16 of Hull [21] discuss how the
parameter σ in the Black-Scholes option price is treated in current practice. In the
next section we describe an alternative approach that addresses the discrepancy
between the Black-Scholes-Merton theory and time series data on American options
and the underlying stock prices.
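
A minimal sketch of the historical estimate of $\sigma$ described above: the sample standard deviation of daily log returns, annualized. Using log returns, the annualization factor of 253 trading days per year, the optional window length and the simulated GBM path are all assumptions made for illustration.

```python
import numpy as np


def historical_volatility(prices, window=None, periods_per_year=253):
    """Annualized standard deviation of log returns over the most recent window."""
    prices = np.asarray(prices, dtype=float)
    log_returns = np.diff(np.log(prices))
    if window is not None:
        log_returns = log_returns[-window:]
    return log_returns.std(ddof=1) * np.sqrt(periods_per_year)


def simulate_gbm(s0, mu, sigma, n_days, rng, periods_per_year=253):
    """Daily GBM path; log returns are i.i.d. normal under this model."""
    dt = 1.0 / periods_per_year
    increments = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_days)
    return s0 * np.exp(np.concatenate([[0.0], np.cumsum(increments)]))


rng = np.random.default_rng(2)
path = simulate_gbm(s0=100.0, mu=0.08, sigma=0.2, n_days=20000, rng=rng)
sigma_full = historical_volatility(path)               # long sample
sigma_recent = historical_volatility(path, window=60)  # short rolling window
print("full-sample estimate:", round(sigma_full, 4))
print("60-day estimate:", round(sigma_recent, 4))
```

A short window tracks changes in $\sigma$ over time at the cost of a noisier estimate, which is one face of the departures from the GBM model noted in the text.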

4. A combined substantive-empirical approach

In this section we describe an approach to time series modeling that contains both
substantive and empirical components. We first came up with this approach when
we studied valuation of American options. Its basic idea is to use empirical modeling
to address the gap between the actual prices in the American options market and the
option prices given by the Black-Scholes-Merton theory in Example 2, as explained
below.
Example 3. For European options, instead of using the basis functions of Hutchinson,
Lo and Poggio [22], an alternative approach is to express the option price as
$c + K e^{-rt^*} f^*(S/K, t^*)$, where $c$ is the Black-Scholes price (2.1), because the
Black-Scholes formula has proved to be quite successful in explaining empirical data. This
is tantamount to including $c(t, S)$ as one of the basis functions (with prescribed
weight 1) to come up with a more parsimonious approximation to the actual option
price.
The usefulness of this idea is even more apparent in the case of American options.
Focusing on puts for definiteness, the decomposition formula (3.4) expresses an
American put option price as the sum of a European put price $p$ and the early
exercise premium which is typically small relative to $p$. This suggests that $p$ should
be included as one of the basis functions (with prescribed weight 1). Lai and Wong
[29] propose to use additive regression splines after the change of variables
$u = -\sigma^2(T - t)$ and $z = \log(S/K)$. Specifically, for small $T - t$ (say within 5 trading days
prior to expiration, i.e. $T - t \le 5/253$ under the assumption of 253 trading days per
year), we approximate $P$ by $p$. For $T - t > 5/253$ (or equivalently, $u < -5\sigma^2/253$),
we approximate $P$ by

$$
\begin{aligned}
P = p + K e^{\rho u} \Big\{ & \alpha + \alpha_1 u + \sum_{j=1}^{J_u} \alpha_{1+j} (u - u^{(j)})_+ \\
& + \beta_1 z + \beta_2 z^2 + \sum_{j=1}^{J_z} \beta_{2+j} (z - z^{(j)})_+^2 + \gamma_1 w + \gamma_2 w^2 \\
& + \sum_{j=1}^{J_w} \gamma_{2+j} (w - w^{(j)})_+^2 \Big\},
\end{aligned} \tag{4.1}
$$

where $\rho = r/\sigma^2$ as in (3.5), $\alpha$, $\alpha_j$, $\beta_j$ and $\gamma_j$ are regression parameters to be
estimated by least squares from the training sample and

$$w = |u|^{-1/2} \{ z - (\rho - \theta\rho - 1/2)u \} \quad (\theta = d/r) \tag{4.2}$$

is an “interaction” variable derived from $z$ and $u$. The motivation behind the
centering term $(\rho - \theta\rho - 1/2)u$ comes from (3.5) that transforms GBM into
Brownian motion, whereas that behind the normalization $|u|^{-1/2}$ comes from (3.4) and
the closely related $d_1(x, y, v)$ in (2.2). The knots $u^{(j)}$ (respectively $z^{(j)}$ or $w^{(j)}$)
of the linear (respectively quadratic) spline in (4.1) are the $100j/J_u$ (respectively
$100j/J_z$ and $100j/J_w$)-th percentiles of $\{u_1, \ldots, u_n\}$ (respectively $\{z_1, \ldots, z_n\}$ or
$\{w_1, \ldots, w_n\}$). The choice of $J_u$, $J_z$ and $J_w$ is over all possible integers between 1
and 10 to minimize the generalized cross validation (GCV) criterion, which can be
expressed in the following form (cf. [19, 46]):

$$\mathrm{GCV}(J_u, J_z, J_w) = \sum_{i=1}^{n} (P_i - \hat P_i)^2 \bigg/ \bigg\{ n \Big( 1 - \frac{J_u + J_z + J_w + 6}{n} \Big)^2 \bigg\},$$

where the $P_i$ are the observed American option prices in the past $n$ periods, and
the $\hat P_i$ are the corresponding fitted values given by (4.1) in which the regression
coefficients are estimated by least squares.
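
The fitting procedure for (4.1) can be sketched as a least-squares regression of $P - p$ on the spline basis scaled by $K e^{\rho u}$, with the knot counts chosen by GCV. This is only an illustration under stated assumptions: the function names are hypothetical, the data are synthetic (the European put prices are replaced by zeros), and the grid search in the demonstration is restricted to 1 to 3 knots to keep it fast.

```python
from itertools import product

import numpy as np


def interaction(u, z, rho, theta):
    """Interaction variable w of (4.2)."""
    return np.abs(u) ** -0.5 * (z - (rho - theta * rho - 0.5) * u)


def knots(x, J):
    """Sample percentiles 100j/J, j = 1, ..., J, as stated in the text.

    The last knot is the sample maximum, so its basis column is zero in-sample;
    np.linalg.lstsq tolerates the resulting rank deficiency.
    """
    return np.percentile(x, 100.0 * np.arange(1, J + 1) / J)


def truncated_power(x, J, degree):
    """Columns (x - knot_j)_+^degree, one per knot."""
    return np.maximum(x[:, None] - knots(x, J)[None, :], 0.0) ** degree


def basis(u, z, w, Ju, Jz, Jw):
    """Design matrix for the bracketed expansion in (4.1)."""
    return np.column_stack([
        np.ones_like(u), u, truncated_power(u, Ju, 1),
        z, z**2, truncated_power(z, Jz, 2),
        w, w**2, truncated_power(w, Jw, 2),
    ])


def fit_and_gcv(P, p, K, rho, u, z, w, Ju, Jz, Jw):
    """Least-squares fit of P - p; returns (coefficients, fitted prices, GCV)."""
    X = (K * np.exp(rho * u))[:, None] * basis(u, z, w, Ju, Jz, Jw)
    coef, *_ = np.linalg.lstsq(X, P - p, rcond=None)
    fitted = p + X @ coef
    n, k = len(P), Ju + Jz + Jw + 6
    gcv = np.sum((P - fitted) ** 2) / (n * (1.0 - k / n) ** 2)
    return coef, fitted, gcv


def select_knot_counts(P, p, K, rho, u, z, w, max_knots=10):
    """Grid search over (Ju, Jz, Jw) in {1, ..., max_knots}^3 minimizing GCV."""
    grid = product(range(1, max_knots + 1), repeat=3)
    return min(grid, key=lambda J: fit_and_gcv(P, p, K, rho, u, z, w, *J)[2])


# Synthetic illustration; every number below is hypothetical.
rng = np.random.default_rng(1)
n, K, rho, theta = 300, 1.0, 1.25, 0.0
u = -rng.uniform(0.001, 0.02, n)   # u = -sigma^2 (T - t) < 0
z = rng.uniform(-0.2, 0.2, n)      # z = log(S/K)
w = interaction(u, z, rho, theta)
p = np.zeros(n)                    # stand-in for European put prices
coef_true = rng.standard_normal(9)
P_exact = p + (K * np.exp(rho * u)) * (basis(u, z, w, 1, 1, 1) @ coef_true)
_, _, gcv_exact = fit_and_gcv(P_exact, p, K, rho, u, z, w, 1, 1, 1)
P_noisy = P_exact + 0.05 * rng.standard_normal(n)
best = select_knot_counts(P_noisy, p, K, rho, u, z, w, max_knots=3)
print("GCV on noise-free data:", gcv_exact)
print("selected (Ju, Jz, Jw):", best)
```

With real data, `p` would hold Black-Scholes put prices (2.2) at each observation and `P` the observed American put prices.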
In the preceding we have assumed prescribed constants $r$ and $\sigma$ as in the
Black-Scholes model; these parameters appear in (4.1) via the change of variables (3.5).
In practice $\sigma$ is unknown and may also vary with time. We can replace it in (4.1)
by the standard deviation $\hat\sigma_t$ of the most recent asset returns, say during the past
60 trading days prior to $t$, as in [22], p. 881. Moreover, the risk-free rate $r$ may also
change with time, and can be replaced by the yield $\hat r_t$ of a short-maturity Treasury
bill on the close of the month before $t$. The same remark also applies to the dividend
rate.
The simulation study in Lai and Wong [29] shows the advantages of this combined
substantive-empirical approach. Not only is $P$ well approximated by $\hat P$, especially
over intervals of $S/K$ values that occur frequently in the sample, but $\hat\Delta - \Delta$ also reveals
a pattern similar to that of $\hat P - P$. Besides $\xi_{\hat P} = E\{e^{-r\tau} |V_{\hat P}(\tau)|\}$, where $\tau$ is the
time of exercise and $V_{\hat P}(t)$ is the value of the replicating portfolio at time $t$ that
rebalances (according to the pricing formula $\hat P$) between the risky and riskless assets
([22], p. 868-869), Lai and Wong [29] also consider the measure

$$\kappa_{\hat P} = E\Big[ \int_0^\tau (S_t/K)^2 (\Delta(t) - \hat\Delta(t))^2\, dt \Big], \tag{4.3}$$

where $\hat\Delta = \partial\hat P/\partial S$. In practice, continuous rebalancing is not possible. If rebalancing
is done only daily, then $(S/K)^2 (\Delta_A - \hat\Delta)^2$ in (4.3) is replaced by a step function
that stays constant on intervals of width 1/253. Because of the adaptive nature of
the methodology, the proposed approach of Lai and Wong [29] is much more robust
to the misspecification error than the Black-Scholes formula in terms of both
measures. Lai and Lim [26] carried out an empirical study of this approach and
made use of its semiparametric pricing formula and (4.3) to come up with a modified
Black-Scholes theory and optimal delta hedging in the presence of transaction
costs.

5. Application to the 1821-1934 Canadian lynx data

The Canadian lynx data set consists of the annual record of the numbers of
Canadian lynx trapped in the Mackenzie River district of north-west Canada
for the period 1821-1934 inclusive. Let $X_t$ be $\log_{10}$(number recorded as trapped in
year $1820 + t$) ($t = 1, \ldots, 114$). Figure 1 shows the time series plot of $X_t$. According
to Tong [44], Moran [39] performed the first time series analysis on these data by
fitting an AR(2) model to $X_t$; moreover, the log transformation is used because it
(i) makes the marginal distribution of $X_t$ more symmetric about its mean and (ii)
reduces the approximation error in assuming the number of lynx to be proportional
to the population. In view of the substantial non-linearity of $E[X_t \mid X_{t-3}]$ found in
the scatterplot of $X_t$ versus $X_{t-3}$, Tong ([44], p. 361) critiques Moran’s analysis and
its enhancements by Campbell and Walker [6], who added a harmonic component to
the AR(2) model, and by Tong [43], who used the AIC to select the order $p = 11$ for
AR($p$) models, as “uncritical acceptance of linearity” in $X_t$. He uses a self-excited
threshold autoregressive (SETAR) model of the form

$$
X_t - X_{t-1} = \begin{cases}
0.62 + 0.25 X_{t-1} - 0.43 X_{t-2} + \varepsilon_t & \text{if } X_{t-2} \le 3.25, \\
-(1.24 X_{t-2} - 2.25) + 0.52 X_{t-1} + \varepsilon_t & \text{if } X_{t-2} > 3.25
\end{cases} \tag{5.1}
$$

to fit these data, similar to Tong and Lim ([45], Section 9). The growth rate
$X_t - X_{t-1}$ in the first regime (i.e., $X_{t-2} \le 3.25$) tends to be positive but small, which
corresponds to a slow population growth. In the second regime (i.e., $X_{t-2} > 3.25$),
$X_t - X_{t-1}$ tends to be negative, corresponding to a decrease in population size.

[Figure 1: X(t) plotted against Time.]
Fig 1. Time series plot of log10 of the Canadian lynx series.

Tong ([44], p. 377) interprets the fitted model as an “energy balance” between the
population expansion and the population contraction, yielding a stable limit cycle
with a 9-year period which is in good agreement with the observed asymmetric
cycles. Motivated by Van der Pol’s equation, Haggan and Ozaki [18] proposed to fit
another nonlinear time series model, namely, the exponential autoregressive model

$$X_t - \mu = \sum_{j=1}^{11} \big( \phi_j + \pi_j e^{-\gamma (X_{t-j} - \mu)^2} \big) (X_{t-j} - \mu) + \varepsilon_t, \tag{5.2}$$

which gives a limit cycle of period 9.45 years. Lim [35] compares the prediction
performance of these and other parametric models and concludes that Tong’s SETAR
model ranks the best among them.
Taking a more nonparametric approach, Fan and Yao [14] use a functional-coefficient
autoregressive model to fit the observed $X_t$ series and compare its
prediction with that of threshold autoregression. Specifically, they fit the FAR(2,2)
model

$$X_t = a_1(X_{t-2}) X_{t-1} + a_2(X_{t-2}) X_{t-2} + \sigma\varepsilon_t \tag{5.3}$$

to the first 102 observations, reserving the last 12 observations to evaluate the
prediction. The $a_1(\cdot)$ and $a_2(\cdot)$ in (5.3) are unknown functions which are estimated
by using locally linear smoothers. Fan and Yao ([14], p. 327) plot the estimates $\hat a_1(\cdot)$
and $\hat a_2(\cdot)$, which are approximately constant for $X_{t-2} < 2.7$ with $\hat a_1(X_{t-2}) \approx 1.3$
and $\hat a_2(X_{t-2}) \approx -0.2$, and which are approximately linear for $X_{t-2} \ge 2.7$. For
comparison, Fan and Yao [14] also fit the following SETAR(2) model to the same
set of data:

$$
\hat X_t = \begin{cases}
0.424 + 1.255 X_{t-1} - 0.348 X_{t-2}, & X_{t-2} \le 2.981, \\
1.882 + 1.516 X_{t-1} - 1.126 X_{t-2}, & X_{t-2} > 2.981.
\end{cases} \tag{5.4}
$$
Because of the close resemblance of the fitted SETAR(2) and FAR(2,2), they share
certain ecological interpretations. In particular, the difference of the fitted
coefficients in each regime can be explained by using the phase dependence and the
density dependence in the predator-prey structure. The phase dependence refers to the
difference in the behavior of prey (snowshoe hare) and predators (lynx) in hunting
and escaping at the decreasing and increasing phases of population dynamics, while
the density dependence is the relationship between the reproduction rates of the
animals and their abundance. More discussion on these ecological interpretations
can be found in [42].
To evaluate the predictions of FAR(2,2), Fan and Yao ([14], p. 324) use the
one-step-ahead forecast (denoted by $\hat X_t$) and the iterative two-step-ahead forecast
(denoted by $\tilde X_t$), which are defined by

$$\hat X_t := \hat a_1(X_{t-2}) X_{t-1} + \hat a_2(X_{t-2}) X_{t-2}, \qquad \tilde X_t := \hat a_1(X_{t-2}) \hat X_{t-1} + \hat a_2(X_{t-2}) X_{t-2}.$$

The predictions of SETAR(2) are similarly defined. The out-of-sample absolute
prediction errors ($|\hat X_t - X_t|$ and $|\tilde X_t - X_t|$) of the last 12 observations are reported
in Table 1. Based on the average of these 12 absolute prediction errors (AAPE),
FAR(2,2) performs slightly better than SETAR(2). Other nonparametric time
series models for the Canadian lynx data include the projection pursuit regression
(PPR) model fitted by Lin and Pourahmadi [36] who found that SETAR
outperforms PPR in terms of one-step-ahead forecasts, and neural network models which
Kajitani, Hipel and McLeod [25] found to be “just as good or better than SETAR
models for one-step out-of-sample forecasting of the lynx data.”
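
The two forecast definitions can be illustrated with the fitted SETAR(2) model (5.4). In the sketch below, the iterative two-step forecast is read as follows (an interpretation of “similarly defined”): the regime is chosen by the observed $X_{t-2}$, and the unobserved $X_{t-1}$ is replaced by its own one-step forecast. The twelve values are the $X_t$ column of Table 1; because only these values are used, forecasts start at the fourth one and the averages need not match the AAPE row of the table. Function names are hypothetical.

```python
def setar_mean(x_lag1, x_lag2):
    """Conditional mean of the fitted SETAR(2) model (5.4)."""
    if x_lag2 <= 2.981:
        return 0.424 + 1.255 * x_lag1 - 0.348 * x_lag2
    return 1.882 + 1.516 * x_lag1 - 1.126 * x_lag2


def one_step(x, t):
    """One-step-ahead forecast of x[t] from the observed x[t-1] and x[t-2]."""
    return setar_mean(x[t - 1], x[t - 2])


def iterative_two_step(x, t):
    """Two-step forecast of x[t]: x[t-1] is replaced by its one-step forecast,
    while the regime is still selected by the observed x[t-2]."""
    return setar_mean(one_step(x, t - 1), x[t - 2])


def average_absolute_error(forecasts, actuals):
    return sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(actuals)


# log10 lynx values for 1923-1934, copied from the X_t column of Table 1.
x = [3.054, 3.386, 3.553, 3.468, 3.187, 2.723,
     2.686, 2.821, 3.000, 3.201, 3.424, 3.531]

targets = range(3, len(x))  # indices with enough history for both forecasts
actual = [x[t] for t in targets]
one = [one_step(x, t) for t in targets]
two = [iterative_two_step(x, t) for t in targets]

print("average absolute error, 1 yr:", round(average_absolute_error(one, actual), 3))
print("average absolute error, 2 yr:", round(average_absolute_error(two, actual), 3))
```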
A substantive approach is adopted by Royama ([41], Chapter 5). Instead of
building the statistical model first and using ecology to interpret the fitted model
later, Royama starts with ecological mechanisms and population dynamics. Letting
Rt = Xt+1 − Xt denote the log reproductive rate from year t to t + 1, he consid-
ers nonlinear dynamics of the form Rt = f (Xt , . . . , Xt−h+1 ) + ut , where ut is a
zero-mean random disturbance, and emphasizes that “our ultimate goal is to deter-
mine the reproduction surface f and to find an appropriate model which reasonably
approximates to it,” with f satisfying the following two conditions in view of eco-
logical considerations: There exists X ∗ such that f (X ∗ , . . . , X ∗ ) = 0, and Rt has
to be bounded above because “no animal can produce infinite number of offspring”
(see [41], pp. 50, 154, 178). In Chapter 4 of [41], Royama introduces the (first-order)
logistic model f(Xt−1) = rm − exp{−a0 − a1 Xt−1} to incorporate competition over
an available resource. Here rm is the maximum biologically realizable reproduction
rate, i.e. Rt ≤ rm for all t; see [41], Section 4.2.5. An implicit assumption of the
model is that the resource being depleted during a time step will be recovered to
the same level by the onset of the next time step. This assumption can be relaxed
if a linear combination of Xt−j (j = 1, . . . , h) with h > 1 is used in the exponential
term of f, yielding a higher-order logistic model; see [41], p. 153.

Table 1
Absolute prediction errors of one-step-ahead (1 yr) and iterative two-step-ahead (2 yr) forecasts
and their 12-year average (AAPE).

               Model (5.3)      Model (5.4)      Model (5.6)      Model (5.8a)
               FAR(2,2)         SETAR(2)         Logistic         Logistic-MARS
Year   Xt      1 yr    2 yr     1 yr    2 yr     1 yr    2 yr     1 yr    2 yr
1923   3.054   0.157   0.156    0.187   0.090    0.178   0.075    0.188   0.082
1924   3.386   0.012   0.227    0.035   0.269    0.077   0.281    0.057   0.286
1925   3.553   0.021   0.035    0.014   0.038    0.057   0.153    0.073   0.120
1926   3.468   0.008   0.037    0.022   0.000    0.012   0.077    0.023   0.140
1927   3.187   0.085   0.101    0.059   0.092    0.020   0.018    0.122   0.168
1928   2.723   0.055   0.086    0.075   0.015    0.128   0.098    0.002   0.159
1929   2.686   0.135   0.061    0.273   0.160    0.179   0.004    0.009   0.012
1930   2.821   0.016   0.150    0.026   0.316    0.004   0.216    0.010   0.001
1931   3.000   0.017   0.037    0.030   0.062    0.005   0.010    0.013   0.025
1932   3.201   0.007   0.014    0.060   0.043    0.048   0.042    0.021   0.005
1933   3.424   0.089   0.098    0.076   0.067    0.124   0.184    0.066   0.091
1934   3.531   0.053   0.175    0.072   0.187    0.083   0.245    0.011   0.087
AAPE           0.055   0.095    0.073   0.112    0.075   0.117    0.050   0.098
Chapter 5 of Royama [41] examines the autocorrelation function and the partial
autocorrelation function of the Canadian lynx series and concludes that h should
be set to 2, which corresponds to the model

\[ X_t - X_{t-1} = r_m - \exp\{-a_0 - a_1 X_{t-1} - a_2 X_{t-2}\} + u_{t-1}, \tag{5.5} \]

where rm, a0, a1 and a2 are unknown parameters that need to be estimated; see
[41], pp. 190-191. From the scatterplot of Rt−1 = Xt − Xt−1 versus Xt−2, Royama
guesses rm ≈ 0.6 and X∗ ≈ 3. He uses this together with trial and error to obtain
the estimate (r̂m, â0, â1, â2) = (0.597, 2.526, 0.838, −1.508), but finds that the
asymmetric cycle of the fitted model does not match the observed cycle from the data
well. Moreover, the fitted autocorrelation function decays too fast in comparison
with the sample autocorrelation function.
Instead of his ad hoc estimates, we can use nonlinear least squares, initialized at
his estimates, to estimate the parameters of (5.5), yielding

\[ X_t - X_{t-1} = 0.460 - \exp\{-3.887 - 0.662 X_{t-1} + 1.663 X_{t-2}\} + u_{t-1}, \tag{5.6} \]

which implies that the maximum logarithmic reproduction rate is 0.460, i.e., the
population can grow at most 10^{0.46} = 2.884 times per year. Figure 2, top left
corner, shows a negative contour of the response surface of the fitted model (5.6).
This implies that the population size can drop sharply in the region Xt−2 > 3.5
and Xt−1 < 2.5, leading to extinction in the upper left part of this region. Whereas
(5.6) does not rule out the possibility of Xt diverging to −∞, extinction occurs as
soon as Xt falls below 0 (or equivalently, the population size 10^{Xt} falls below 1).

Fig 2. Contour plot of R̂t−1, the fitted value of Xt − Xt−1, for the logistic model (5.6). The
observations are marked by ∗. The dotted line is Xt−2 = Xt−1. The intersection of this line and
the contour numbered 0 gives the equilibrium X∗.
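
The behavior of the fitted surface (5.6) described above can be checked numerically. The following Python sketch is ours, not the authors' code: it evaluates the noise-free part of (5.6) and looks at one arbitrarily chosen point, (Xt−1, Xt−2) = (2.0, 4.0), in the top left region of Figure 2. The function and variable names are hypothetical.

```python
import math

# Coefficients of the fitted logistic model (5.6).
R_MAX, B0, B1, B2 = 0.460, -3.887, -0.662, 1.663


def log_reproduction_rate(x_tm1: float, x_tm2: float) -> float:
    """Noise-free part of (5.6): fitted X_t - X_{t-1} on the log10 scale."""
    return R_MAX - math.exp(B0 + B1 * x_tm1 + B2 * x_tm2)


# The exponential term is positive, so the rate can never exceed R_MAX.
max_growth_factor = 10 ** R_MAX          # about 2.884 times per year

# An illustrative point with X_{t-2} > 3.5 and X_{t-1} < 2.5.
crash = log_reproduction_rate(2.0, 4.0)  # strongly negative rate
next_value = 2.0 + crash                 # below 0, i.e. extinction
```

At this point the fitted rate is far below zero, so the skeleton of (5.6) sends Xt below 0 in a single step, which is the extinction scenario described in the text.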
Note that one can also derive bounds on the logarithmic reproduction rates
from the empirical approach. Figure 3 is the plot of the limit cycle generated by the
skeleton of the fitted model (5.4). The limit cycle is of period 8 years. The maximum
and the minimum logarithmic reproduction rates, attained at years 1 and 5 in
Figure 3, are 0.212 and −0.269, respectively. That is, the population grows at most
10^{0.212} = 1.629 times per year and diminishes by at most a factor of 10^{−0.269} = 0.538
per year. Moreover, the limit cycle of (5.4) implies an infinite loop of expansion and
contraction and rules out the possibility of extinction. These are consequences of
adopting an empirical approach because the data are distributed along the main
diagonal of Figure 2 and are absent from its top left and lower right corners. In order to
deduce the behavior of the reproduction rates in these regions, mechanistic modeling
is essential. On the other hand, the empirical approach uses the observed data better
and gives more accurate forecasts. Table 1 compares the prediction performance of
FAR(2,2) and SETAR(2) with that of the logistic model (5.5), fitted as (5.6). The fitted
logistic model gives the largest AAPE of the three for both one-step-ahead and iterative
two-step-ahead forecasts. Moreover, instead of characterizing the equilibrium with limit cycles,
the logistic model only gives two equilibrium points, with one corresponding to
extinction and the other equal to X∗ = −{a0 + log(rm)}/(a1 + a2) = 3.107 (the
intersection of the line Xt−1 = Xt−2 and the contour of f = 0 in Figure 2).
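
The equilibrium and the growth factors quoted in this paragraph can be reproduced with a few lines of Python. The sketch below is ours: setting Xt = Xt−1 = Xt−2 = X∗ and ut−1 = 0 in (5.5) gives rm = exp{−a0 − (a1 + a2)X∗}, which is solved for X∗ with the natural logarithm. The function name equilibrium is hypothetical, and (5.6) is rewritten in the parametrization of (5.5), so that a0 = 3.887, a1 = 0.662 and a2 = −1.663.

```python
import math


def equilibrium(r_max: float, a0: float, a1: float, a2: float) -> float:
    """Solution X* of r_max = exp(-a0 - (a1 + a2) * X*)."""
    return -(a0 + math.log(r_max)) / (a1 + a2)


# Nonlinear least squares fit (5.6) in the parametrization of (5.5).
x_star = equilibrium(0.460, 3.887, 0.662, -1.663)

# Royama's trial-and-error estimates, for comparison with his guess X* ≈ 3.
x_star_royama = equilibrium(0.597, 2.526, 0.838, -1.508)

# log10 rates of the SETAR(2) limit cycle converted to yearly growth factors.
max_factor = 10 ** 0.212
min_factor = 10 ** -0.269
```

Both sets of estimates place the nontrivial equilibrium close to 3 on the log10 scale, that is, at a population size of roughly 1000 animals.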
Fig 3. Limit cycle of the skeleton of the SETAR(2) model (5.4). The dotted line is Xt = Xt−1.

We next apply the combined substantive-empirical approach of Section 4 to these
data, using the substantive model (5.5) to provide one of the basis functions in the
semiparametric model

\[ X_t - X_{t-1} = r_m - \exp\{-a_0 - a_1 X_{t-1} - a_2 X_{t-2}\} + g(X_{t-1}, X_{t-2}) I\{(X_{t-1}, X_{t-2}) \in S\} + u_{t-1}, \tag{5.7} \]

where g is an unknown function and S is a region containing the observations that
will be specified later. Moreover, the difference equation (5.7) has the boundary
constraint Xt−1 + rm − exp{−a0 − a1 Xt−1 − a2 Xt−2} + g(Xt−1, Xt−2) I{(Xt−1, Xt−2) ∈
S} ≥ 0. The lynx population becomes extinct as soon as this boundary condition
is violated. Model (5.7) can be fitted by using the backfitting algorithm. Specifically,
model (5.5) is estimated first and then the residuals are used as the response
variable in a nonparametric regression on the predictor variable (Xt−1, Xt−2). The
difference between the observed Xt − Xt−1 and the fitted g is then used as the
response variable in (5.5), whose parameters are re-estimated by nonlinear least
squares, and the two steps are iterated. The algorithm of multivariate adaptive regression
splines (MARS) developed by Friedman [15] is used for estimating g in the first step of each
iteration of the above backfitting procedure (the function mars in the R package mda
can be used). This kind of iteration scheme has been used in fitting partly linear
models, where the parametric component is a linear regression model and the
nonparametric component is often fitted by using kernel regression; see [8, 13, 20].
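
The alternation between the two components can be illustrated with a deliberately simplified Python sketch. It is not the procedure used to fit (5.7): the nonlinear least squares step is replaced by a one-parameter linear fit, MARS is replaced by a single hinge function with a fixed knot, and the data are synthetic and noise-free. All names (hinge, fit_coef, backfit) and numerical values are hypothetical.

```python
def hinge(x: float, knot: float) -> float:
    """Truncated linear basis function (x - knot)_+ of the kind MARS uses."""
    return max(x - knot, 0.0)


def fit_coef(basis, response):
    """Least-squares coefficient of a single basis vector (no intercept)."""
    return sum(b * r for b, r in zip(basis, response)) / sum(b * b for b in basis)


def backfit(x, y, knot, n_iter=200):
    """Alternate between the parametric part beta*x and the hinge part gamma*(x-knot)_+."""
    p = list(x)
    h = [hinge(v, knot) for v in x]
    beta = fit_coef(p, y)  # the parametric model is estimated first
    gamma = 0.0
    for _ in range(n_iter):
        # Step 1: fit the nonparametric stand-in to the parametric residuals.
        gamma = fit_coef(h, [yi - beta * pi for yi, pi in zip(y, p)])
        # Step 2: refit the parametric part to the response minus the fitted g.
        beta = fit_coef(p, [yi - gamma * hi for yi, hi in zip(y, h)])
    return beta, gamma


# Synthetic noise-free data with known coefficients (not the lynx data).
xs = [1.0 + 0.1 * i for i in range(41)]
ys = [0.5 * v - 1.2 * hinge(v, 3.0) for v in xs]
beta_hat, gamma_hat = backfit(xs, ys, knot=3.0)
```

In this toy setting each step is a closed-form least squares fit, so the iteration converges to the joint least squares solution and recovers the coefficients 0.5 and −1.2. In the actual procedure the two steps are a nonlinear least squares fit of (5.5) and a MARS fit of g.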
The fitted response surface is

\[ X_t - X_{t-1} = 1.319 - \exp\{-0.224 - 0.205 X_{t-1} + 0.343 X_{t-2}\} + \hat{g}(X_{t-1}, X_{t-2}) I\{(X_{t-1}, X_{t-2}) \in S\} + u_{t-1}, \tag{5.8a} \]

\[ \hat{g}(X_{t-1}, X_{t-2}) = 2.294 (X_{t-1} - 3.224)_+ (X_{t-2} - 2.864)_+ - 1.572 (X_{t-1} - 3.202)_+ - 0.851 (X_{t-2} - 3.202)_+. \tag{5.8b} \]
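
To make the notation concrete, the Python sketch below evaluates the MARS correction (5.8b) and the noise-free part of (5.8a), with (z)+ denoting max(z, 0). The code is ours and only illustrative: the function names are hypothetical, and membership of (Xt−1, Xt−2) in the region S is passed in as a flag rather than computed.

```python
import math


def pos(z: float) -> float:
    """Positive part (z)_+ used by the MARS basis functions."""
    return max(z, 0.0)


def g_hat(x_tm1: float, x_tm2: float) -> float:
    """MARS correction (5.8b): one tensor-product term and two univariate terms."""
    return (2.294 * pos(x_tm1 - 3.224) * pos(x_tm2 - 2.864)
            - 1.572 * pos(x_tm1 - 3.202)
            - 0.851 * pos(x_tm2 - 3.202))


def skeleton_step(x_tm1: float, x_tm2: float, in_region: bool) -> float:
    """Noise-free X_t from (5.8a); a value below 0 is read as extinction."""
    logistic = 1.319 - math.exp(-0.224 - 0.205 * x_tm1 + 0.343 * x_tm2)
    correction = g_hat(x_tm1, x_tm2) if in_region else 0.0
    return x_tm1 + logistic + correction


no_correction = g_hat(3.0, 2.5)              # all hinges inactive
next_value = skeleton_step(3.0, 3.0, True)   # one step from (3.0, 3.0)
```

Below all knots the correction vanishes and (5.8a) reduces to its logistic part. Outside S the correction is switched off by the indicator, which is what keeps the tensor-product spline term from dominating in regions without data.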

We evaluate this fitted model by using the out-sample prediction criterion. Table 1
shows that (5.8a) gives the smallest AAPE for one-step-ahead forecasts among all
models considered, and that the AAPE for iterative two-step-ahead forecasts of
(5.8a) is comparable to the smallest one provided by FAR(2,2). The region S in
(5.8a) is chosen to be the oblique rectangle whose edges are defined by the sample
means ±3 standard deviations of the principal components of the bivariate sample
of (Xt−1 , Xt−2 ); see Figure 4 which shows that this region contains not only the
in-sample data but also the out-sample data. Figure 5 gives the contour plot of
the fitted model (5.8a). The logarithmic growth rate at its top left corner is about
−2, which shows a strong possibility of extinction even though the magnitude is
less drastic than that in Figure 2 for (5.6). The inclusion of tensor products of
univariate splines in (5.8a) would have produced positive probability limits of Xt
diverging to ∞ or to −∞ if (Xt−1 , Xt−2 ) had not been confined to a compact
region. On the other hand, with an absorbing barrier at 0 and with (5.8b) only
applicable inside the compact set S, Markov chains of the type (5.8a) not only have
stationary distributions but are also geometrically ergodic under mild assumptions
on the random disturbances ut (e.g., to ensure irreducibility); see [38].
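
The construction of the oblique rectangle S can be sketched in Python as follows. This is our own illustration under stated assumptions: the principal axes are obtained from the closed-form eigen-decomposition of the 2×2 sample covariance matrix, the function names are hypothetical, and the sample consists only of consecutive pairs formed from the 12 hold-out values of Xt in Table 1, not the full bivariate sample used for Figure 4.

```python
import math


def principal_axes(points):
    """Mean, unit principal axes and standard deviations of a bivariate sample."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Closed-form eigen-decomposition of the 2x2 covariance matrix.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    axes = [(math.cos(angle), math.sin(angle)), (-math.sin(angle), math.cos(angle))]
    half = 0.5 * (sxx + syy)
    spread = math.hypot(0.5 * (sxx - syy), sxy)
    sds = [math.sqrt(half + spread), math.sqrt(max(half - spread, 0.0))]
    return (mx, my), axes, sds


def in_region(point, mean, axes, sds, width=3.0):
    """True if the point lies within `width` standard deviations along both axes."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    for (ux, uy), sd in zip(axes, sds):
        score = dx * ux + dy * uy  # principal component score, centred at the mean
        if abs(score) > width * sd:
            return False
    return True


# Hold-out values of X_t for 1923-1934 from Table 1 (illustrative sample only).
x = [3.054, 3.386, 3.553, 3.468, 3.187, 2.723,
     2.686, 2.821, 3.000, 3.201, 3.424, 3.531]
pairs = [(x[i], x[i - 1]) for i in range(1, len(x))]  # (X_{t-1}, X_{t-2})
mean, axes, sds = principal_axes(pairs)
far_point_inside = in_region((1.0, 5.0), mean, axes, sds)
```

A point such as (1.0, 5.0), far off the main diagonal, falls outside the rectangle, so the spline correction would not be applied there.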

Fig 4. The oblique rectangle S formed by ±3 standard deviations away from the sample means
of the principal components of (Xt−1, Xt−2). The in-sample and out-sample observations are
marked by ∗ and o, respectively.

Fig 5. Contour plot of R̂t−1, the fitted value of Xt − Xt−1, for the logistic-MARS model (5.7).
The observations are marked by ∗. The shaded region corresponds to extinction.

6. Conclusion

In his concluding remarks, Cox [11] noted that for successful use of statistical models
in particular applications, “large elements of subject-matter judgment and technical
statistical expertise are usually essential. Indeed, it is precisely the need for this
combination that makes our subject such an interesting and demanding one.” We
have followed up on his remarks here with a combined subject-matter and statistical
modeling approach to time series analysis, which we illustrate for the “particular
applications” of option pricing and population dynamics of the Canadian lynx. In
particular, for the Canadian lynx data, we have shown how statistical modeling for
data-rich regions of (Xt−1, Xt−2) can be combined effectively with “subject-matter
judgment” which is the only reliable guide for sparse-data regions.

Acknowledgments

Lai’s research was supported by the National Science Foundation grant DMS-
0305749. Wong’s research was supported by the Research Grants Council of Hong
Kong under grant CUHK6158/02E.

References

[1] AitSahlia, F. and Lai, T. L. (2001). Exercise boundaries and efficient ap-
proximations to American option prices and hedge parameters. J. Comput.
Finance 4 85–103.
[2] Barron, A. R. (1993). Universal approximation bounds for superpositions of
a sigmoid function. IEEE Trans. Information Theory 39 930–945. MR1237720
[3] Black, F. and Scholes, M. (1973). The pricing of options and corporate
liabilities. J. Political Economy 81 637–659.
[4] Broadie, M. and Detemple, J. (1996). American option valuation: New
bounds, approximations, and a comparison of existing methods. Rev. Finan-
cial Studies 9 1121–1250.
[5] Broadie, M., Detemple, J., Ghysels, E. and Torres, O. (2000). Non-
parametric estimation of American options’ exercise boundaries and call prices.
J. Econ. Dynamics & Control 24 1829–1857. MR1784575
[6] Campbell, M. J. and Walker, A.M. (1977). A survey of statistical work
on the Mackenzie River series of annual Canadian lynx trappings for the years
1821-1934, and a new analysis. J. Roy. Statist. Soc. Ser. A 140 411–431.
[7] Carr, P., Jarrow, R. and Myneni, R. (1992). Alternative characteriza-
tions of American put options. Math. Finance 2 87–106. MR1143390
[8] Chen, H. (1988). Convergence rates for parametric components in a partly
linear model. Ann. Statist. 16 136–146. MR0924861
[9] Chen, R. and Tsay, R. S. (1993). Functional-coefficient autoregressive mod-
els. J. Amer. Statist. Assoc. 88 298–308. MR1212492
[10] Chen, R. and Tsay, R. S. (1993). Nonlinear additive ARX models. J. Amer.
Statist. Assoc. 88 955–967.
[11] Cox, D. R. (1990). Role of models in statistical analysis. Statist. Sci. 5 169–
174. MR1062575
[12] Cox, J., Ross, S. and Rubinstein, M. (1979). Option pricing: A simplified
approach. J. Financial Econ. 7 229–263.
[13] Engle, R. F., Granger, C. W. J., Rice, J. and Weiss, A. (1986). Semi-
parametric estimates of the relation between weather and electricity sales. J.
Amer. Statist. Assoc. 81 310–320.
[14] Fan, J. and Yao, Q. (2003). Nonlinear Time Series. Springer-Verlag, New
York. MR1964455
[15] Friedman, J. H. (1991). Multivariate adaptive regression splines. Ann.
Statist. 19 1–142. MR1091842
[16] Goodwin, G. C., Ramadge, P. J. and Caines, P. E. (1981). Discrete time
stochastic adaptive control. SIAM J. Control Optim. 19 829–853. MR0634955
[17] Guo, L. and Chen, H. F. (1991). The Åström-Wittenmark self-tuning regu-
lator revisited and ELS-based adaptive trackers. IEEE Trans. Automat. Contr.
36 802–812. MR1109818
[18] Haggan, V. and Ozaki, T. (1981). Modelling non-linear random vibrations
using an amplitude-dependent autoregressive time series model. Biometrika
68 189–196. MR0614955
[19] Hastie, T. J. and Tibshirani, R. J. (1990). Generalized Additive Models.
Chapman & Hall, London. MR1082147
[20] Heckman, N. E. (1988). Minimax estimates in a semiparametric model. J.
Amer. Statist. Assoc. 83 1090–1096. MR0997587
[21] Hull, J. C. (2006). Options, Futures and Other Derivatives, 6th edn. Pearson
Prentice Hall, Upper Saddle River, NJ.
[22] Hutchinson, J. M., Lo, A. W. and Poggio, T. (1994). A nonparametric
approach to pricing and hedging derivative securities via learning networks. J.
Finance 49 851–889.
[23] Jacka, S. D. (1991). Optimal stopping and the American put. Math. Finance
1 1–14.
[24] Ju, N. (1998). Pricing an American option by approximating its early exercise
boundary as a multipiece exponential function. Rev. Financial Studies 11 627–
646.
[25] Kajitani, Y., Hipel, K. W. and McLeod, A. I. (2005). Forecasting nonlinear
time series with feed-forward neural networks: A case study of Canadian
Lynx data. J. Forecasting 24 105–117. MR2148983
[26] Lai, T. L. and Lim, T. W. (2006). A new approach to pricing and hedging
options with transaction costs. Tech. Report, Dept. Statistics, Stanford Univ.
[27] Lai, T. L. and Wei, C. Z. (1987). Asymptotically efficient self-tuning regu-
lators. SIAM J. Control Optim. 25 466–481. MR0877072
[28] Lai, T. L. and Wong, S. P. (2001). Stochastic neural networks with applica-
tions to nonlinear time series. J. Amer. Statist. Assoc. 96 968–981. MR1946365
[29] Lai, T. L. and Wong, S. P. (2004). Valuation of American options via basis
functions. IEEE Trans. Automat. Contr. 49 374–385. MR2062250
[30] Lai, T. L. and Ying, Z. (1991). Parallel recursive algorithms in asymptot-
ically efficient adaptive control of linear stochastic systems. SIAM J. Control
Optim. 29 1091–1127. MR1110088
[31] Lehmann, E. L. (1990). Model specification: The views of Fisher and Neyman,
and later developments. Statist. Sci. 5 160–168. MR1062574
[32] Lewis, P. A. W. and Ray, B. K. (1993). Nonlinear modeling of multivariate
and categorical time series using multivariate adaptive regression splines. In
Dimension Estimation and Models (H. Tong, ed). World Sci. Publishing, River
Edge, NJ, pp. 136–169. MR1307658
[33] Lewis, P. A. W. and Ray, B. K. (2002). Nonlinear modelling of periodic
threshold autoregressions using TSMARS. J. Time Ser. Anal. 23 459–471.
MR1910892
[34] Lewis, P. A. W. and Stevens, J. G. (1991). Nonlinear modeling of time
series using multivariate adaptive regression splines (MARS). J. Amer. Statist.
Assoc. 86 864–877.
[35] Lim, K. S. (1987). A comparative study of various univariate time series mod-
els for Canadian lynx data. J. Time Ser. Anal. 8 161–176.
[36] Lin, T. C. and Pourahmadi, M. (1998). Nonparametric and nonlinear models
and data mining in time series: a case-study on the Canadian lynx data.
Appl. Statist. 47 187–201.
[37] Merton, R. C. (1973). Theory of rational option pricing. Bell J. Econ. &
Management Sci. 4 141–181. MR0496534
[38] Meyn, S. P. and Tweedie, R. L. (1993). Markov Chains and Stochastic
Stability. Springer-Verlag, New York. MR1287609
[39] Moran, P. A. P. (1953). The statistical analysis of the Canadian lynx cycle,
I: Structure and prediction. Austral. J. Zoology 1 163–173.
[40] Ross, S. A. (1987). Finance. In The New Palgrave: A Dictionary of Economics
(J. Eatwell, M. Milgate and P. Newman, eds.), Vol. 2. Stockton Press, New
York, pp. 322–336.
[41] Royama, T. (1992). Analytical Population Dynamics. Chapman & Hall, Lon-
don.
[42] Stenseth, N. C., Chan, K. S., Tong, H., Boonstra, R., Boutin, S.,
Krebs, C. J., Post, E., O’Donoghue, M., Yoccoz, N. G., Forchhammer,
M. C. and Hurrell, J. W. (1998). From patterns to processes: Phase
and density dependencies in the Canadian lynx cycle. Proc. Natl. Acad. Sci.
USA 95 15430–15435.
[43] Tong, H. (1977). Some comments on the Canadian lynx data. J. Roy. Statist.
Soc. Ser. A 140 432–435.
[44] Tong, H. (1990). Nonlinear Time Series. Oxford University Press, Oxford.
MR1079320
[45] Tong, H. and Lim, K. S. (1980). Threshold autoregression, limit cycles and
cyclical data (with Discussion). J. Roy. Statist. Soc. Ser. B 42 245–292.
[46] Wahba, G. (1990). Spline Models for Observational Data. SIAM Press,
Philadelphia. MR1045442
[47] Weigend, A. and Gershenfeld, N. (1993). Time Series Prediction: Fore-
casting the Future and Understanding the Past. Addison-Wesley, Reading, MA.
[48] Weigend, A., Rumelhart, D. and Huberman, B. (1991). Predicting
Sunspots and Exchange Rates with Connectionist Networks. In Nonlinear Mod-
eling and Forecasting (Casdagli, M. and Eubank, S., eds.). Addison Wesley,
Redwood City, CA, 395–432.
