0% found this document useful (0 votes)
42 views85 pages

Thesis 1973D H723b

This document is Donald Holbert's 1973 PhD thesis from Oklahoma State University titled "A Bayesian Analysis of Shifting Sequences with Applications to Two-Phase Regression". The thesis analyzes shifting sequences where the first m observations come from one distribution and the remaining observations come from a different distribution. It develops posterior distributions for parameters related to these shifting sequences and two-phase regression models. The thesis also discusses inference procedures that can be applied to the posterior distributions such as point estimation, hypothesis testing, and constructing regions of highest posterior density.

Uploaded by

Ahmed HAMIMES
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
42 views85 pages

Thesis 1973D H723b

This document is Donald Holbert's 1973 PhD thesis from Oklahoma State University titled "A Bayesian Analysis of Shifting Sequences with Applications to Two-Phase Regression". The thesis analyzes shifting sequences where the first m observations come from one distribution and the remaining observations come from a different distribution. It develops posterior distributions for parameters related to these shifting sequences and two-phase regression models. The thesis also discusses inference procedures that can be applied to the posterior distributions such as point estimation, hypothesis testing, and constructing regions of highest posterior density.

Uploaded by

Ahmed HAMIMES
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 85

A BAYESIAN ANALYSIS OF SHIFTING

SEQUENCES WITH APPLICATIONS

TO TWO-PHASE REGRESSION

By

.. DONALD HOLBERT
.......

. Bachelor of Science
. '(Jniversity of Oregon
Eugene, Oregon
1967

'Master of Arts
Washington State University
· P\1ll:ma.n, Washingten
196'9

Submitted to .the Jfac,ulty of the Gradu.ate College


·of the Oklahoma State University
·in partial fulfillment of the reei.uirements
· for the Degree of
· DOCTOR OF PHILOSOPHY
Maya 1973
.- •• f' .

.
j),~
\<l\1'& Q
\~ 1 'l.~ b
t-.1.

.'
.:.. .
OKLAHOMA
STATE UNIVERSITY
UBRARY

FEB 15 1974

A BAYESIAN ANALYSIS OF SHIFTING

SEQUENCES WITH APPLICATIONS

TO TWO -PHASE REGRESSION

Thesis Approved:

873300
ACKNOWLEDGMENTS

I wish to express my thanks and appree·lation to Professor Lyle

Broemeling for suggesting the topic and for his guidance and encourage.

ment in the preparation of tMs the sis.

I wish to thank Professol;'s Leroy Folks, Odell Walker and David

Weeks for serving on my aQ.vlsory c;:ommittee, and also to thank all of

the teachers that I have had during the course of n;iy graduate study.

Dr. Robert Morrison has taught me a gJ;"eat deal about what

might be called 11 hard~co+e 11 data analysis, and I want to thank him for

that,

For the friendship and help they have given me during the course

of my graduate study, ~hanks go to my fellow students and especially

to Ceal.

I would also like to thank Mrs. Mary Bonner for the first rate

job she has done in typing this thesis.

There are many other people to whom I owe a vote of thanks for

various contributions they have made, and since space limitations

make this prohibitive I would like to thank collectively all of those who

I have not mentioned individually.

Finally, for the sacrifices they made to provide me with a high

school education, a sincere thank you to my mother and late father, to

whom I owe muqh more than I can put into words.


TABLE OF CONTENTS

Chapter Page

I. INTRODUCTION AND LITERATURE REVIEW 1

Introduct~on . 1
Review Of Some Related Literature , 3
Organization Of Thesis 6

II. POSTERIOR DISTRIBUTIONS RELATED TO THE


NORMAL SEQUENCE 8

c!>o And c)> 1 Both Known 10


Only One Of c!>o Or cp 1 Known 13
Neither cp 0 Nor cp 1 Known. 21

III. POSTERIOR DISTRIBUTIONS RELATED TO TWO-


PHASE REGRESSION FOR THE DlSCRET E CASE 30

Error Varian<;;e Known, First Regression


Known 31
Error Variance Known, Neither Regression
Known , , 36
Error Varianc;e Unknown, First Regression
Known , 38
Error Variance Unknown, Neither Regression
Known 39
IV. POSTERIOR DISTRIBUTIONS RELATED TO TWO-
PHASE REGRESSION FOR THE CONTINUOUS
CASE 42
2
CT Known, First Regression Known, m Known 43
CT2 Known, First Regression Known, m
2 Unknown . . 45
CT Known, Neither Regression Known, m
2 Known . . . 47
CT Known, Neither Regression Known, m
2 Unknown . 51
CT Unknown, First Regression Knownf m
2 Known , , , 51
CT Unknown, First Regression Known, m
Unknown 55
Chapter Page
2
o- Unknown, Neither Regression Known,
2 m Known . , , 56
o- Unknown, Neither Regression Known,
m Unknown 59

V. INFERENCE PROCEDURES AND SOME EXAMPLES 60

Plot Of The Posteriol;' Density 60


Point Estimators For Parameters 63
Regions Of Highest Posterior Density 64
Hypothesis TestLng 64
Other Techniques 66
An Ad Hoc Technique 67
Some Gen~ral Comments 70

VI. SUMMARY AND POSSIBLE EXTENSIONS 72

Summary Of The Study 72


Some Possible Extensions 73

A SELECTED BIBLIOGRAPHY . 75
LIST OF TABLES

Table Page

I. Data and Resulting Posterior Densities 61

LIST OF FIGURES

Figure Page

1. Plot of Posterior Density 1T


u
(m) 62
2. Posterior Density (4. 22): Second Regression Known 68

3, Posterior Densi,ty (4, 28) : Neither Reg:re s sion Known . 69


CHAPTER I

INTRODUCTION AND LITERATURE REVIEW

Introduction

Whenever observations are taken in an ordered sequence it can

happen that the complete data set <;:<;tn be divided into subsets in a well

defined way. Each observation :may be regard,ed as coming from a

parameterized fq.mily of dLstributions; those observations in a specified

subset correspond to some parti<;;ular value of the parameter, the

parameter value changing from subset to subset.

One of the more widely known occ:;urrences of this situation is in

industry where one is interested in the qµaUty of a prodµct from a

continuous production proc;:ess. For a specific:; example, let us suppose

that the 11 Sure strike Matc;:h Company 11 produces boxes of kitchen

matches, and that the average content is advertised as fifty matches.

It seems c;:lear enough that the company would like to know if a

change oc::<;:urs in IIhe average contenlI of a box of matches. If this

average increases the c;:ompany is glving away free matches, while if

it decreases customers may well suspect the company of false adver-

tising and take their business elsewhere, It also seems clear that if a

c;hange in the average content has occurred,, interest would center on

the time point in the sequence at whic;h the change occurred. This

knowledge may allow the company to recover the faulty product and

correct the fault before di13tribution oc::curs.


2

Many other examples come to mind, ln biological studies, the

onset of a disease at some point in time may resutt in a redu~ed growth

rate; the application of a treatment may inhibit the response to some

stLmulus; or repeated condit'ioning in a psychological experiment may

cause a change in the proportion of c;orrect answers given by a subject.

There is a qons·Lde:rable body qf literature on the various problems

of estimation and inference associated with parameters changing over

tLme, and the references given in the bibliography are by no means

exhaustive. Most of the recent studies have been from the classical

frequency theory viewpoint. The present study is set in a Bayesian

framework.

We shall now formulate the problem in mathematical terms and

then review some of the recent papers on its va.rious aspec;ts,

We assume that we have observations on a finite sequence of

random variables x 1 , . , . , :x;T ~ and that for some m, 1 < m ~ T,

x 1 , ... , Xm are independently distr~buted with dens Hy f(x; e1 ) while

Xm+ 1 , ... , XT are independently distributed f(x ; e2 ) . We assume of

course that e 1 # e2 and that (X 1 ,, , , , Xm) is independent of

(Xm+ 1 , ... , XT), We refer to this point m Ln the sequence as the

"shift point" or 11 switch point 11 , sinc;e the random variables up to the


ili . ili
m correspond to one parameter value whi.le those after the m

correspond to an.other parameter value. The problem may be

generalized to the case where el and ez a;re p-component vectors.

The shift is then from one point p-space to another, rather than a

shift along the real line.

It is clear that if m =T then no shift has occurred. This

enables us to identify two main problems of interest with suc;h a


3

sequence, The first ~s a detecticm problem, That is, has there been

a para~eter change in the sequence of random variables? The sec;:ond


""'"'"""'"'-,.,..,._,_,_

problem is an e st'imation problem, Assuming a shift has occurred, at

what point in the sequence did it occur? Along with these are the

problems of estimating current and previous parameter values, and

perhaps testing hypotheses about them. The present study addresses

itself only to the second of these two main problems.

Review Of Some Related Literature

In a series of papers Page (1, 2, 3) discusses the problem of

detec;:ting a parameter change, and proposes a number of tests to this

end, These are based primarlly on the cumulative i;ums


r
S where a is the known initial mean. If there is no
= I: (X,, - 8),
ri= 1 L
change the mean path of the cumulative sums is a horizontal path;

E(S )
r
= 0.
Quandt (4) dlscusse S a maxb:X),Uffi likelihood technique for

estimatLng the switch point and the regression parameters in a two-

phase regression. This is a generalization of the one parameter case

discussed earlier. We assume here that the observations Y.'


!

i = 1,.,,, m are distributed N(a 1 +13 1 Xi, er 2 ) while the Yj,


j = m+l,,,., T are distributed N(a 2 +13 2 Xj, er 2 ) , Quandt's tec;hnique
involves evaluating the likelihood, function at each of the possible switch

points. Be also disc:\lsses in this article and a later paper (5) several

tests of the hypothesis of no switch against the alternative of a single

swHch,

Sprent (6) outlines a hierarchy of possible hypotheses of interest

related to two-phase regression, and suggests that the result of an


4

initial investigation should indicate the nex~ hypothesis to ~e considered,

His tests, however, are based on the assumption that one knows

between wh'ich two of the independent variables the switch occurs, The

switch point is then the abscissa of the point at which the intersection

of the two regression Hnes oc;;c;urs 1 namely y = (a 1 - a 2 )/(!3 2 - 13 1 ).

Chernoff and Zacks (7) study Bayesian procedures for estimating

the current mean (Le., the mean of XT) in an observed sequence

X 1 , ... , XT of normally distributed random variables which has been

subjected to occasional changes in the mean, Their est'lmator requires

many complex computations except when the assumption of at most one

change in the mean ls made. A test is also given for the null hypothesis

of no shift against the alternative of exactly one sMft, and its power

for certain alternatives is compared to that of the test proposed by

Page (2). This ·Ls general,ized in a later paper by Kander and Zacks

(8) to the case where the distributions of the X.l 's belong to the one

parameter exponential famUy :rather than the normal family in

particular. The paper by Bhattacharyya and Johnson (9) derives

certain optimal tests of the hypothesis just stated, their optimality

criterion being the maximization of loc;:al average power,

Brown and Our bin ( l 0) discuss methods for invesHgating whether

a regression relationi;hip is cons~aq~ over time. Most of their tec;h-

niques are graphical in nature, along the lines suggested by Tukey (11)~

These indude plotting the residuals from a single regression fitted to

the entire data set, as weH as plotting the cumulative sums of

residuals, in line with the cu sum technique of Page ( 1) . A further

technique they discuss is that of plotting the recursive residuals


/\ 6
t = 3, ... , T, . where Q:' t-1 and !3t -1 are
5

the least squares estimates of the regress~on paramete:u based on the

first t -1 observations. These quantities ~an easi~y be normed in

suc:h a way that under they hypothes·Ls of no shift t4ey are independently

distributed N(O, 0" 2 ). Other useful plots are the cumulatlve sums of

the recursive residuals and the cumuLati.ve sums of squares of recur-

sive residuals, each plotted against the time points t.

Finally we come to some of the more recent papers on shifting

sequenc;es of random variables. With regard to two-phase regression

in particular, D. V. Hinkley has been a major contributor to. the recent

literature,

In a paper published ln 1969, Hinkley (12) d.escrlbes a method

for finding a max;imum likelihood e stim!=J,te of the abse'Ls sa of the inter -

section point of the two regresi;ilon Hnes, '( = (a 1 - a 2 )/(13 2 - 13 1 ).


This involves finding T -3 ~ondiHpn.al likeli:P,ood functlon·s, each of

which ha e to be maximized, and then m~ximi~ ing aver all T .... 3

functions. This estimate is difficult to work with in that H: has no

explicit definition,, He also proposes a likelihood ratio test for a null

hypothesis of the form H : '( = '( . In his 1971 article, Hinkley (13)
0 0
parameterizes the problem a little differently. Be ass\,l.mes that Yi,

i = 1,,,., m are independenHy distributed N(0 t 13 1 (Xi - '(), 0"


2 ) while

the Y. ~ j = m+l,,.,, T are independently distributed


J
N(0 + !32 (Xj-y), CT
2 ), with xm ~" ~ xmtl' " being the abscissa of
the interseqtion point 0£ the two regression lines, He then centers

. interest on estimation and inference procedures related to 0 and '(.

Maximum likelihood estimation of e and '( is studied under the

assumption that 132 is unknown and also under the assumption that

13z = O. Likelihood ratio tests for H 0 : 13 1 = 132 and H 0: 132 = O are


6

. discussed, He also derives a confidence region for 'I and describes

a technique for constructing a joint confidence region for e and 'I.

Hinkley' s remaining papers are concerned with a change in the

mean of an observed sequence of random variables. In a 1970 paper

(14) he discusses maximum likelihood estimation of the shift point m

in a normal sequence whose mean has been subjected to one change.

He also derives the asymptotic distribution of the likelihood ratio test

statistic for tests of hypotheses about m. Similar problems related

to a binomial sequence are studied in a paper by Hinkley and Hinkley

( 15) .

A Bayesian approach to the problem of estimating the shift point

in an observed sequence of random variables has been given by

Broemeling (16). He derives posterior densi~ies for the shift point

parameter in the case of a Bernoulli sequence, a sequence of

exponentially distributed random variables, and a normal sequence

wHh known variance. In another paper Broemeling (17) discusses

Ba ye s~an procedures for fir st dete<;;ting the presence or absence of a

parameter change, and then ma king inferences after this initial

decision has been made, The detection problem is handled in terms of

a "posterior odds, ratio" in favor of the null hypothesis, while inference

procedures are as usual based on an appropdate posterior distribution.

Organization Of The sis

We shall now describe briefly the c::;ontent of this thesis. in rela-

tion to the literature we have just discussed.

In Chapter II we derive a number of posterior distributions which


~·r '°'l

n;±tt:y,-be used for inference about a normal sequence with unknown


/"'.. );~,''"
,·,t·"···
7

variance. This is an extension of the paper by Broemeling (16).

Chapters III and IV are related to two-phase regression, which

may be considered as a generaLization of the shift'Lng normal sequence.

In particular, Chapter III addresses itself to the problem of estimating

the shift index m, while in Chapter IV we derive posterior distribu-

tions related to estimation of the abscissa of the point of intersection

of the two regression lines. These chapters present an alternative to

the analyses given by Quandt (4, 18), Sprent (6), Hinkley (12, 13) and

others.

A brief survey of Bayesian estimation and inference techniques· is

presented in Chapter V, which includes examples and applications of

some of our earlier resuLts.

Chapter VI summar·Lzes the results of this report and discusses

some possibilities for future investtgations.


CHAPT;ER II

POSTERIOR DISTRIBUTIONS RELATED TO

THE NORMAL SEQUENCE

In this chapter we shall c;ierive some pos~e:l;"ior dishributions

related to the problem of est"Lmating the time point at which a shift in

the mean occurs ~n a finite sequenQe of observations on normally

distributed random var·lables,

More specifically, we assume that we h~ve observed a sequence

x 1, ... , Xn, n ?:.. 3, of independent random var"lables, and that for

some unknown m (m = 1, •.. , n - 1) ~he dlf!tributions of the X.'s are


l

g·Lven by:

and

xm+l' ' .. 'xn are ~. ·i, d.

2
w'lth er > O and <Po f. cp 1 •
The case in which cr 2 is assumed known has been studied by

Broemeling (16), We consider here only the c;ase where cr 2 is

There are a number of subcases fol;' the iitbove problem which we

shall consider:
(i) <Po , <P 1 , both known

(ii) Only one of <Po or cp 1 known


(iii) Ne'lther <Po nor cp 1 known ,
9

A further eon side ration is whether or not ~he direction of the

shift is known, and this will be disqus sed· in the following seetions.

While the main emphasis "in this paper is on the estimation of the

shVt point m, posterior distributions will be derived for some of the

other unknown parameters.

In order to attac;:k the problem from a Bayesian viewpoint we shall

consider m to be a dlscrete random variable with state space

I
n- 1
· = {l, ...• n -1} . The parameters fo~ our problem are now m,

cr2 <Po, and cp 1 , and pr~or distributions will have to be as signed to


'
those that are unknown in each of the cases conslc;lered 1

In th,e development glven in this and later chapters we shall

assign independent prior distributions whi<;h may be considered

appropriate for situations wher19 prior knowledge is vague, More


2
precisely, we shai1 as sign in every 'case for m and CT the prior

dens Hies:

= { 1/(n-l), m=l,,,.,n-1
1To (m) 0
elsewhere, and

1 /cr 2 , 0' 2 > o


{0
elsewhere ,

2
The pric:n• on er ·is of course an improper density, It has been

widely used to indicate vague prior know Ledge of the var·~ance. Its

form is suggested by a number of approa~he s, and the reader is

referred to Lindley (19) and Je£freyc;; (20) for a discussion of these,

We now cop.sider in turn the three cases mentioned above.


10

<1> 0 And <1> 1 Both Known

The likelihood function for m and 0' 2 is

2
L(m, O" )

2
where 0 < CT < ro and m belongs to In accorda.nc;e with
2
Bayes theorem the joint posterior d'lstribut'lon of m (;l.nd CT is

2 2 2
ir 1 (m, CT ) cc L(m, CT ) ir 0 (m) ir 0 (0' )

cc (0'2)-(n/2+1) exp{(-1/2(!"2)[.m.I;
1=1
(Xi-~0)2+. ~ 1=mtl
(X1-<l>1)2J}

(2. 2)
2
where 0 < er < oo and m belongs to I ,
n- 1

Infe:ren~e About m

This may be based on the marginal posterior density of m,

which is glven by
11

Letting w ; : : 1 /rr 2 , the integrand i~ seen to have the form of a

gamma density, and is easily eva~uat~d to give

We qan thus write

m 2 n 2]-n/2
ir 1 (m)cx: [ ~(X.-cp) +.~ (Xi-cp 1 )· m=l,, .. ,n-1.
i=l 1 O i=m+l
(2. 3)

The norming constant may be found by summing on rn.

Inference About rr 2

2
This may be based on the marginal posterior distribution of O' '

which is given by

2 n-1 2
ir 1 (cr ) iii: ~ ir 1 ( m, O' . )
m=l

n-1 2 -(n/2+1) 2 2
ex: ~ (er exp {(-1/20- )K(m)}, 0 <er <co \2,4)
m=l

where

If we now let w = 1 /cr 2 we can write the posterior density of w as

n-l n/2 -1
ir 1 (w) ex: ~ w exp{-wK;(m)/2}, O<w<oo, (2.5)
m=l ·

Apart from the norming consfuan~~ this is the sum of n - 1

gamma densities with parameters a


m
= n,/2 and 13m = 2 /K(m) . The
12

poster'Lor density of w = l/CJ 2 is thus a r:pixh:i.re of gamma densities,

We may also make the equivalent shatement that the posterior density

of CJ 2 is a mixture of inverted gamma densities.


2
An alternative Q.istribution on which inference about CJ can be

based is the c;onditional poster~or dish:i;-ibution, ;rl (CJ 2 jm). One could

center inte re sh on this distribution evaluated at, say, a modal value of

the posherior density of m r In hhis case we have

;r 1 (CJ 2 jm) = ;r 1 (CJ 2 ,m),/7r 1 (m)

ex: (CJz)-(n/ 2 +l)exp{(-l/2CJ 2 )K(m)}/[K(m)rn/ 2 . (2.6)

It ls now clear that, for each ftxed m, this conditional density is

inverted gamma with parameters a


'm
= n/ 2 and A
t-'m
= 2 / K (m) .

Before proceeding to the next case we should perhaps remark on

the mixing density which O(!;curs in hhe posterior distribution of

w = 1 I CJ 2 , If we le t

f m (w; n/2, K(m)) = wn/ 2 - 1 exp {. -w K(m) /2}

then we can writ~ the posterior density of w (see (2. 5)) as

n -1
;r 1 (w) a: ~ f (w; n/2, K(m)) (2. 7)
m=l m

The norming constant K is now given by

K !
=
co n-1
L: r(n/2) [2/K(m)r 12/ r(n/2) [2/K(m)r 12 fm(w; n/2, K(m))dw
0 m=l

n-1
= r(n/2) 2n 12 L: [K(m)rn/ 2 .
m=l
13

Referring to (2. 7) and inserting the norming c:onstant K we see

that the mth mixing constant for the posterior dens'lty of w is

K(m)-n1 2/ nil [K(m)rn/ 2 • By referring also to (2. 3) we see that


.m=l
this is precii;iely the posterior density of the shift point at its mth mass

point. Thus the gamma densities oc:;;c;urring in the posterior density of


2
w = 1 /r:r are mixed according ta the posterior density of the shift

index m.

Only One Of <Po Or cp 1 Known

The theory for the cqtse cp 0 unknown and cp 1 known parallels

that for the case <Pa known and cpl unknown, and for that reason we

shall study the latter case only in this sectLon. The results for the

former case will be evic;lent,

We now have an additional parameter, cp 1 , whose prior density

must be assigned. We shall derive the appropriate posterior densities

corresponding to two different vague prior ciistributions, Firstly we

shall assume that the direc;tion of the shUt is now known and assign the

improper prior density

This shall be referl;"ed to as the 11 unconstrained" prior density.

Next we shall assume that it is known that q, 0 < cp 1 , and in this case

we shall assign the "constrained" prior density


14

Unconstrained Prior, Inference About m

The joint posterior ~etis·~ty of the th:ree unknown parameters is

easily seen to be

The marginal posterior density of m is given by

where
-n
xm+l = ~ xi/(n-m), Making use of ~he Tonelli theorem
i=m+l
(2 1) we can write

[/
_
co 2 - n 2
exp{(-,(n-m)/2cr )(cp 1 -Xm+l).}dcp~dcr
2 l
00

er; (n-m)
-1/2 !co 2 ~(n+l)/2
(er )
2
exp{(-l/2cr ) [K(m, cp 0 )]}dcr
2
a
where
m 2 n -n 2
= Z::(X"-cpO) + 2: (X.-X +l)
i= 1 1 i= mt 1 1 m
15

Making use of the inverted gamma integral as before we obtain for the

posterior density of m

(2. 9)

A comparison with (2. 3) shows that, for each m, cp 1 has been

replaced by its estimate and an additional multiplier has been

introduced,

2
Unconstrained Prior, Inferenc:e About er

2
This may be based on the marginal posterior distribution of er
'
which ls given by

2
ir 1 (cr) o:i
n-1
~
/co 2 -(n/2+1)
(er)
m= 1 -co

{
exp (-l/2er) 2 [m
.~(Xi - cp 0 ) 2+. n
~ (Xi- 2J} dcp 1
<1> 1 )
i= 1 i=m-li 1

n-1 _ 112 2 -(n+l)/2 2


ex:: ~ (n-m) (er ) exp{(-1/2er ) K[m, cp 0 ]},
m=l
2
0 < er < co , (2. 10)

Again it is clear that the postel;'ior dens~ty of w = 1 I er 2 is a

mixture of gamma densities with parameters a


m
= (n-1)/2 and

[3m = 2/K(m, cp 0 ). As be{ore it is a straightforward matter to show

that the mixing density is precisely the marginaL posterior density of m.

Unconstrained Prior, Infe:i:-ence About p1

Inference about cp 1 can be based on the marginal posterior

density of cp 1 , which is given by


16

n-1
2:
Jai (o- 2 ) -(n/2+1)
m=l 0

exp { (-1/20-) m (Xi - cp ) 2 +. n


2 [ .2: 2: (Xi .. c!> 1 )zJ} dO" 2 .
0
1=1 1=m+l

Making use of the inverted gamma integral we obtain

n-l[m 2 n zJ-n/2
Trl(cpl)o: 2: 2:(X.-cp0) + 2: (X.-cpl)
m=l i=l 1 i=m+l 1

where T(m 1 cp 0 ) = (n -1 )(n-m) I K(m, cp 0 ), Thus we can write

where

gm (<j>l; n-1, f'm• Tm) : [ T ~ /Z r(n/2)J/r ((n-1 )/2)((n-1)7Tl /~


-«n-1)+1)
[1 + (Tm I (n - 1 ) ) ( cp 1 - µm) ~ 2

is the t density with location parameter µ , precision parameter


m
T , and n-1 degrees of freedom (See (22)). We see from (2. 11)
m
that the marginal postedor density of cp 1 is a mixture of t densities,

and it is easy to verify that the mixing density has the value

a t t. t s m th mass po1n
. t, m= 1, . , . , n-1 .
17

Constrained Prior, InferenGe About m

2
The joint posterior density of m, rr and <1> 1 is now

2
1T 1 ( m, CT ' cp 1 ) ex:

2
where me I , O<CT <1;D and cp 0 < <I> 1 < co , The marginal of m
n- 1
is thus

2
Integration with respec;:t to CT proc;eeds as before to give

1T 1(m)

where K(m, cp 0 ) and T{m, q, 0 ) are as defined in an earHe:r,- section.

This expression may now be wr'Ltten as

The integral is seen to be, apart from the norming constant, the
-n
upper tall of the t dens~ty with loc;ation parameter xm+ 1 • precision

parameter T(m, cp 0 ) and n-1 degrees of freedom. Inserting this

norming constant we may write


18

(2. 13)

where Tk b(x) is the cumulative distribution function of a t


' a,
random variable with location a 1 precision b, and k degrees of

freedom. Now the general t distribution referred to above may be

transformed to a Student's t by a translation and change of scale.

That is, if Y is such a t random variable, then Z = b 1 /Z[Y - a] has

a Student's t distribution with k degrees of freedom. This enables

us to write the cumulative distribution function used in (2. 13) in terms

of the distribution func;tion of a Student's t with n-1 degrees of

freedom, according to the formula

where ~k(x) is the distl;'ibution function of a Stud13nt's t random

variable with k degrees of freedom, We can then wrlte for the

posterior density of m

rrl(m) a: [n-mrl/Z[K(m,cj>O)r(n-1)/2 ~ -~n-l(T(m,cj>O)l/2(cj>o-X:i+1~]


(2. 14)

where m= I, .. , , n-1 . One advantage of this formula is its extreme

computational ease ..

2
Constrained Prior, Inference About CT

2
The marginal posterior density of CT is given by
19

2
Trl (o- ) 0:: 2:
n-1 Joo (o- 2 ) -(n/2+1)
m= 1 <Po

n-1 2 -(n/2+1) 2
ai 2: (o-) exp{(-1/20- )K(m 1 cp 0 )}
m=l

f cp
cxi 2 - n 2
exp{(-(n-m)/2 o- ) (cp 1 -Xm+l)} dcp 1 .
0

Inserting the norming constant for the normal density qn the

right and integrating with respect to qi 1 we mc:i.y write

2 n-1 2 -(n/2+1) { 2 }~ 2 11;2


Tr 1 (o- ) o:: m:l (o- ) exp ( ... l/2o- )K(m, cp 0 ) rTr o- /(n-m)j

~ ·N(t<f>o · x;;,+ll/(~/ /n·m ~]


where N(x) is the cumulative distribution function of a standard

normal variate, Sb;nplifying, we obtain

2 n-1 -1/2 2 -(n+l)/2 { 2 }


1T 1 ( o- ) ex: ~ (n - m) (o- ) exp ( - 1 I 2 o- ) K (m, cp 0 )
1

~ - N (( cj>0 - X~+1 ) J
(o-/ / n -m )) (
2 , 1$ )

2
for 0 < o- < cxi •

Constrained Prior, Inference About g, 1

From (2. 12) we see that the marginal posterior density of cp 1

is
20

Proceeding exactly as in the un<;onstrained case we obtain

(2. 16)

where K(m, cp 0 ) ~ T(m, cp 0 ) and g(x; k, a, b) are as previously

defined. The posterior distribution. in this case is thus a mixture of

truncated t distributions, In order to find the mixing density we need

to compute the norming constant K, ~ntegratin$ (2. 16) we get

Referdng again to (2. 16) we see that the vaLue of the mixing density at
. th . .
its rn mass poi,nt ls

(2. 17)

for m=l., .. , n-1. As before, W 1 (x) is the distribution function of


n~

Stµdent's t with n-1 degrees of freedom~


21

Neither cp 0 Nor <1> 1 :Known

·As in the previm,1s case we shall study the pre sent .situation for

two vague prior densities. Firstly we shall assume that nothing is

known about th.e order relation between cp 0 and cp 1 and as sign the
improper prior density

-co < cpi < co ' i = 0' 1 • (2. 18)

Next we shall assume it ii:! known that <l>o < cp 1 and C1.ssign the prior

-ai < <l>o < <1>1 < co (2. 19)

The theory for the qase in whl<;h the order relation on the

is reversed parallels that being presented here and will not be ·

presented separately. As before we shall use the terms ''unconstra'lned 11

prior and "constrained" prior for (2. 18) and (2. 19) respectively.

Unconstrained Prior, Inference About m

2
The joint posterior density of m CT ' is
'

2
where me I , 0 < er < co, and < "'·
'+'1 <
for i = 0, 1 . The
n- 1 -co 00

marginal posterior densHy of m is given by


22

Integrating first with respect to <T 2 we obtain

where

= ( ·~
i= 1
x.)fm
l.
and = (. ~
i=m+l
xJ)(n-m).

Let

m -m2 n -n 2
C(m) = . ~ (Xi - XI ) + ~ (Xi - Xm+ 1)
t= I i=m+ 1

Then we may write

where

(n -2 )m
0
-m C(m)
XI
! =
G: l. ~(m) =
-n
xm+l
'J' (m) =

0
(n-2)(n~m)
C(m)
23

The dou,ble integral above is, apart from the norming consta,nt,

that of a bivar·~ate t density and is easUy evalu,ated to give

[rr · r((n-2)/2)C(m)] I[r(n/2) (m(n-m))


1/2
] ,

Substitution in (2,21) gives

rr 1 (m) ex: [m(n .. m) ] -1/2[ C(m) ]"'(n .. 2)/2 ,

A comparison of (2. 3), (2, 9), and (2. 22) displays an interesting

pattern as more parameters are assumed Utlrknown.

Unconstrained Prior,
. .,,
Inference About ,.
o-~

The marginal poster'lor denslty of o- 2 is

2 n-ljc:o
1Tl (o- ) oc :E
2 -(n/2+1)
(o- ·)
Jco
m= l -co -co

exp {(-l/2q-) .:E (Xi-<Po) 2 +. :E.


2 [m n (X.i·<P 1 )2J} d<P 0 d<P 1
i=l i=m+l

n-1 2 -(n/?+l) { 2 }
a:: ~ (cr) exp (~l/2cr )C(m)
m=l

where <P and µ(m) are as before and


"' ,...,
24

Using the form of the bivariate normal density to evaluate the

double integral we obtain for the postertor density of cr 2

2
O<cr <co,
(2. 23)

As in the previous case considered, the posterior density of

w= 1 I cr 2 is a mixture of gamma densities, the mixing density being

the posterior density of the shift point.

Unconstrained Prior• Infe re nee About (cj> 0 , cp 1 )

The joint posterior density of <Po and cp 1 is

Making use of the notation introduced earlier we may write

It is now clear that we hav~ a mixture of bivariate t densities.

Letting h (cj>; n-2, µ(m), T(m)) denote the bivariate t density with
m"" ,.._,
n-2 degrees of freedom, location parameter !:(m). and precision

matrix T(m) we may write (2. 24) as


25

.. n .. z
n-1 1/2 -Z-
'IT1(cp0, cJ> 1 ) a; ~ [m(n-m)J"" [C(m)] h (cj>; n-2, µ(m), T(m)),
m ,...., "'
m= 1
- co < ,!,..
'l"l < co '. i ::: 1, 2. (2. 2 5 )_

Some straightforward algebra shows that in this case also the

mixing density for the bivariate t distributions is the posterior density

of the shift point m,

Constrained Prior, Inference About m

The joint posterior density for this last case to be considered is

2 . 2 - (n I 2 + 1 )
TI"l(m,o- ,cj>O,cj>l)c:c(o-)
{ 2 2 n
exp (-1/20-) .!:(Xi .. cj>O) + ~ (Xi-cj>l)
~m
2 J}
·=1 l=m+ 1
(2. 26)
2
where me In-l' 0 <a- <co, and -co < cp 0 < cp 1 < co, The marginal

posterior density of m is th1,ls

Using the general t density to integrate on cp 1 we obtain

'IT 1 (m) oc !
-co
co
(n-m)
_ 112 [m
.~
t= I
(X 1 -cp 0 )
2
+. ~
n
i::;mt 1
(X 1 -Xm+l)
~ n 2]-(n-1)/2
26

where ~ (x) is the distribution funetion of Student's t with n-1


n- 1

degrees of freedom and T(m, cj> 0 ) is defined by

Using a well known identity it is possible to write (2. 27) as

-n-2
irl(m) ex: [m(n-m)]l/2 [C(m)]---Z-[ooro [1 - i!rn-l(T(m, <l>o)l/2(<1>0-X~+1))]
, gm(c1> 0 ; n-Z,.xf1, w(m~d cl>o (2. 28)

whe+e w(m) = m(n-2)/C(m) and gm(cj> 0 ; k, a, b) is the general t

density defined earlier, We ean thus write (2. 28) as

-n-Z
ir I (m) m [m(n -m)r l /Z [C (m)) -Z E <l>o ~ - Vn- l (T(ttl, <i>ol l /Z ($0 - x;+ I~
(2. 29)

where m = 1, • , . , n-1 and Eel> is the expectation of the indicated


. 0
fun~tion of cl>o taken with respect to a general t density with n -2
-m
degrees of freedom, location pa:r:ameter Xl I and precision w(m),

Using the transformation

we know that y is clistrlbuted as Student's t with n-2 degrees of

freedom, This enables us to express the e~pectation in (2, 29) with

respect to a Student's t density. Straightforward computation shows

that this expectation then becomes


27

where

H(m, y) = [<n-1 )(n-Z)(n-m/(c(m)(/+ n-zVJ I /Z ~m(n-Z) r 1/Z y


C(m)

+ rxf1-x~+i ~
The expectation of the indicated func;:Hon of y is now taken with

re i;;pect to a Student's t distribution with n -2 degrees of freedom.

We can now write the marginal posterior density of m as

m=l,, .. ,n-1. (2.30)

It is clear that this formula pre sen ts considerable, though not

insurmountable, computational difficulty, and that a numerical integra-

tion technique of some kind would be needed to evaluate this density for

a given set of data.

Constrained Prior, Inference About o- 2

2
The marginal posterior density of o- for this case is

( 2)
irlcr a:~
n-1
" Joojoo (""2)-{n/2+1)
v

m= I -oo <Po

2 [m
exp {(-1/20-) .L: (Xi-cpO) 2 +. L:
n (Xi.- cpl) 2J} dcp 1 dcp 0
i=l i=m+l

a:
n-1
L:
2 -(n/2+1)/co
(er)
{ 2 [m 2 n _ n
exp (-l/2cr) L:(Xi-cpO) +. L: (XCXm+l)
~~
m=< 1 -a:i 1= 1 i=m+l

2) 1I 2
. (~~:;,
rL-N (:;~
x )il
<P -
Jd~o n
28

a::
m-l

n=l
1 2 2 -n/ 2
L: [m(n-m)] I (o-)
2
exp{(-1/20- )C(m)}Ecp [ 1 -N (
0
cl>o-X +l
-n
m
/n-m a-/
)~
(2 .. 3 1)

where N(x) is the distdbut·Lon funct·Lon of a standard normal variate

and the expected value of the indicated function of cp 0 is taken with

respect to a normal dens;ity with mean x:n and variance o- 2 /m.

Letting

we may take the expeetati9n with :respect to a standard normal and

write

2 n-l 1/2 2
rr 1 (o-) a:: L: [m(n-m)r exp{(-1/Zo- )C(m)}
m=l

(2.32)

where 0 < o- 2 < oo and the expectation of the indicated function of z is

taken with respect to a standard normal variable .


Constrained Prior, Inference About (cp0 , cpl)

The joint posterior density of cp 0 and ¢1 is

[m n
2 .L: (Xi-cpO) 2·+. L: (Xi-cpl)
exp { (-1/Zo-) zJ} do- 2
1=1 · 1=m+l
29

where -m < <Po < <1> 1 < oo. Evaluation of the integral and sLmp!ification

of the subsequent e:x;pre Ejston proceeds exactly as in the case of the

unconstra·~ned prior density, keeping in mind the a<;lded l'estriction on

<Po and <1> 1 , The joint posterior density of <Po and <1> 1 may then be
written as

n 1 ·
1T
1
(<!> 0 .<!> 1 ) o:: i: [m(n-m)r 112 [C(m)r<n-Z)/Zhm {cj>;n-2,
m=l ,.._,
µ(m),
,.._,
T(m)),

-co < <l>o < <I> 1 < m• (2. 3 3)

As before, h m.....,
(cj>; n ... 2, ......,
µ(m), T(m)) is the bivariate t density

with n-2 degrees of freedom, location parameter

J:t(m) : (X :U• x;,t 1) I I anq precrision matdX

(n-2)m
C(m)' 0
T(m) =
(n~2)(n~m)
0 . C{m)
CBAPTER III

POSTERIOR DISTRIS UT IONS RELATED TO

TWO .. PHASE REGRESSION FOR THE

DISCRETE CASE

As a generalization of the normal sequence studied in Chapter II

we shall in this chapter derive certain posterior distributions related

to making inferenc;es about a two-phase regression.

We assume that we have observations on a sequence Y 1 , , , , , YT'

T > 5, of independent random variables which follow two separate

linear regression regimes. As b~fore we shall introduce a discrete

random variable m for the unknown switch point, and we shall further

assume that the state space of m is the set IT~ 2 = {2, 3,.,., T-2}.

We thereby assume that we have at least two observations on each

regression, We thus have

2
Yi, i= 1, ... , m, independently distributed N(a 1+13 1 x 1, a- ) ,

and
2
Yj, j =m+l, , .. , T, independently distributed N(a 2 +13 2 Xj' a- ) ,

where a-~ > 0 X 1 < ... < XT are non ~stochastic regre s sor
'
variables, and m is the unknown switch point,

In sueh a situation, interest centers on estimating the switch

point m as weil as any unknown regression parameters and the


2
possibly unknown variance a- •
31

We shall study four cases in this chapter:

(i) Error variance known, first regression known

(ii) Error variance known, neither regression known

(Ui) Error variance unknown, first regression known

(iv) Error variance unknown, neither regression known.

Our interest shall center primarily on the shift point m, but in

some cases posterior distributions for the regression parameters also

will be derived. In a,ll cases we shall assume that prior knowledge is

such that independent, diffuse prior densities for the unknown para -

meters will be adequate.

Error Variance Known, First Regression Known

Without loss of generality we shall assume that <r 2 =1. a 2 , 13 2

and m are assumed a priori independent with prior densities

and

= { 1/(T-3), m=2, . , . , T-2


0
elsewhere .

The likelihood function is

L{m, q 2, 13 2 ) = (2rr)- T /2 exp { (-1/2) [ i~l


m [Y - (a +13 Xi)] 2
1 1 1

+ . i
1=m+l
[Yi - (a 2 + 132 Xi ]2J}

resulting in a joint posterior density of


32

J
expl(-1/2) [m
.~ [Yi-(a 1+13 1 Xi)] 2 + ·. ~
T [Yi-(a +13 Xi)]
2 2
~}
l i=l i=m+l
(3. 1)

for m = 2, . , . , T -2 , -co < a2 < co and -co < 13 2 < co ,

Inference About m

The marginal posne rior density of m is

m e IT _2 . (3. 2)

In order to evaluate this integral we shall use the identity

T [ ]2 T [ /\ m]2 m 1 -1 m
. ~ Yi- (a 2 +13 2 Xi) = ~ Y. - Y. + (a-µ ) ~ (a - µ ) (3. 3)
i=m+l i=m+l 1 L ,....., ,....., m"" ,.....,

where

/\ m
a2
m
!: =
/\ m
!32
-
T
~
-T. -T
(X.-X +l){Y.-Y +l)
,i=m+l L m + rn
IT i=m+l
Z: (X. - X
1
-T
m
+l)
2

T
(T -m) ~ X.
1
i=m+l
~ -1
m =
T T
~ X. ~ X.2
1 1
i=m+l i=m+l
33

= (.
1=m+l
Xi \ ~
Y"j T - m\ ,
') = (. , ~
1=m+l
Yi)\ A(T - m) ,
V\
and

Substituting identity (3. 3) in (3. 1) , the posterior density (3. 2)

may be written

,,. 1 (m) ex: exp { (-1/2) [ m~ [Y. - (a 1 + [3 1 X.)]


. 1 1 1
2+ T [Y. -Y,m] 2J}
.
~
+l 1
I\
1
i= i=m

/ CO/CO I
exp{(-1/2)(~ _J;:,m) ~~ (~ _l:m)} da.
1
- co -co

The integral may now be easily evaluated using the bivariate

normal density to give as the posterior qensity of m,

~(T -m) i=m


.
T
+l
-T
(X. - X +1)
m
~
2]-1/2
i

. exp { (-1/2) [ .~[Yi-


m (a
i=l
1+ [3 1Xi)) 2+.i=m+l
T [Yi- Y~) 2]}
~ 11 (3. 4)

for m = 2, .. , , T -2 ,

Inference About [3 2

The marginal posterior density of [3 2 is given by

T-2
J
co
'IT1(f32) ex: ~ 'ITl(m, a2, l3z)da2' -oo < 132 <co. (3. 5)
m=2 -co
34

To evaluate this integral we shall use the identity

Substituting (3. 6) in (3. 1) and integrating with respech to a 2 ,

(3, 5) becomes

~I (~2) :~:
oo (T-m)-1 /2 exp {(-1 /2) i~I [Yi - (a I+ ~I Xi))2}

exp {(-1 /2) i=l+ J(Yi - Y ~+!) ~2 x~+ 1~


- (Xi - 2}. (3. 7)

Substituting

T Am 2
+ ~ (Y. - Y. )
i=m+l 1 1

in (3. 7) we may write the marginal posterior density of 13 2 as

where

K (m) = ~ T -m).1=m+l
T -T
~ (Xi - Xm+ 1)
2]-1/2

exp { (-1/2) [ .~
m [Y - (a +13 Xi)]
1 1 1
2+. ~·[Yi-Yim]
T /\ 2J} •
1=1 1=m+l
35

=i
fI ~
i=m+l
(xi_ x~+ 1 )2
and g(y; m, v) is the normal densLty with mean m and variance v.

We thus obtain a mixture of normal densities, the mixing density being

precisely the posterior densi!ly of the shift point m.

Inferenc;e About a 2

The derivation of the marginal posterior density of a 2 proceeds

along the same lines as that of ~2 and results in the density

(3. 9)

where now

.l'\m
var (a 2 ) = ( T 2 )/~
. L: Xi
i=m+l
T - T
(T-m). L: (Xi-Xm+l)
i=m+l
2] .

Before proceeding to the next case we shall make an obsel,'vation

on the normal distributions involved in (3. 8) and (3. 9). For each m,

the mean and variance of the m th dens Hy in the mixture are the least

squares estimate of the pa:i,-ameter and its varianc::;e respectively, based

on the last T -m data points in the sequence.


36

Error Variance Known, Neither

Regression Known

2
Assuming as before that (J" =1 the likelihood function is now

-T /2
L(m, a 1' f31, a2, f32) = (2 7r)

exp{(-1/2)[~ . 1
1=
[Y.-(a 1 +[3 1 X.)] 2 +
1 1
1=mt l
.
~ [Y.-(a 2 +[3 2 X.)] 2
l 1
J}·
As signing indep~ndent, improper uniform prior densities to the

regression parameters and a discrete uniform prior to the switch

parameter m we obtain the joint posterior density

u 1 (m, a 1 , p1 , "z• p2 I oc exp{(-112{i~I [Yi - (a 1 + p1 xilJ2


+ .
i=m+l
~
[Yi- (a2 + [32 Xi)J 2 J}· (3. 10)

for m = 2, ... , T -2, -ro < a. <


l
ro , i = 1, 2 and -ro < P..
1-'l
< ro, i= 1,2.

Applying to the first m data points an identity analogous to (3. 3)

we may write

exp{(-1/2)[~ (Y,-~.m,L)2+ ~ (Y.-~.m,U)2 1 1 1 1


l=l i=m+l

+ (~ -i:m1' z;;;(~ -i:m~} (3.11)

where

m /\ m /\m /\ m Am 1

Qi =( Qi 1' f31 ' Qi 2' f32) 1 ' ~ = ( Qi 1 , f31 ' Qi 2 ' f32 )

/ \.m, L
Y = 0m + ~mX L\.m U
Yi '
Am
= a2 + f32
"'-m
x1 •
1 '-' 1 1-'l i '
37

m
m ~x
1 i

m m
~x. ~x.2
1 l 1 l

T
T -m ~ x.l

CD
m+l
T T
~ x.1 ~ X.2
l
m+l m+l

/\ m
\'/\m
Of course a2 and ~2 a:re as defined in the prevLous case

and and ~~ are ~heir counterparts for the first m data points.

The regression parameters may now be integrated out quite

easLly using the four variate normal integral to obtain as the posterior

density of m

exp(-1/2)
{ ~·mI:(Y.-~.m,L) 2+ T . /\
I: (Y.-Y.m'U) 2J} (3. 12)
. 1= l l
i=m +l l
. 1

for m = 21 ••• , T ::.2 .

A comparison of this result with (3, 4) shows that the known first

regression has been replac;:ed by its estimate for each m and. the

weighting factor has been altered ac;:cordingly. This densit·y will be

computed for a particular set of data in Chapter V.


38

Error Variance Unknown, First

Regression Known

The likelihood function for the present case is

2
To the new unknown parameter er we shall assign the improper

prior density

2
O<cr <co

while assigning to the remaining parameters the same prior densities

as before. Applying Bayes' s theorem we obtain the joint posterior

density

2 - (T I 2 +1 ) J 2 [ m 2
(er) expl(-l/2cr) i:: 1 [Yi-(a 1 +13 1 Xi)]

+
. +1
F:::m
i [Y.-(a 2 +13 2 x.)J 2
l l
j1j (3.13)

2
for .m;::: 2, . , . , T -2 , 0 < (J" < co ' -co < O!z < co and -co < l3z < co •

Integration on cr 2 can easily be performed using the inverted

gamma density to obtain as the joint posterior density of m, a2 and

Y.-(a 1+13 1 x.)


z+ T.
.:2:: [Y.-(a 2 +13 2 x.)]
zJ-T/2
1 l . 1 l
Fm+l (3.14)
39

for m=2, .,., T-2, -cxi < a2 < cp and -cxi < 13 2 < cxi. Using identity

(3. 3) and the same notation ai;; used in c~se one we can write

where

K(m)

We may now ·integrate out a 2 and 13 2 using the bivariate t

integral to obtain as the posterior deneity of m

(3. 16)

for m=2, ... , T-2.

Error Varianc;e Unknown, Neithe:i;-

Regression l<nown

The likelihood function for this the last case of this chapter is
40

Assigning the same prior distributions as in the previous cases

we obtain the joint posterior density

2
for m =2, .. , , T -2 , 0 < CT < co, -co a.
l
< co, -ro < P..
~l
< co , i = I, 2 .
2
Integration on CT proceeds as in case three to give for the joint

posterior density of m and the regress·~on parameters

Making use of the identities given in c;ase two we c<;1.n write, using

the notation of that section,

where the value of K(m) is now given by

m /\ L2 T " u2
K(m) = ~ (Y.-Y.m' ) + ~ (Y.-Y.m' )
i= 1 1 1 i= m+ 1 + 1

We may now integrate out the regression parameters using the

four va;riate t integral to obtain as the marginal posterior density of

m
41

m /\ 2 T /\ 2]-(T-4)/2
[ I: (Y.-Y.m,L) + I: (Y.-Y.m,U) (3. 17)
L L 1 1
i=l l=m+l

for m =2, , . , , T -2 .

In closing this chapter we point out the similarity between (3, 16)

and (3. 17), and remind the reader that examples of (3, 12) and (3.17)

will be preseneed in Chapter V.


C!-IAPTER IV

POSTERIOR DISTRIBU'l'JONS RELATED TO

TWO "PHASE REGRESSION FOR THE

CONTINUOUS CASE

In the previous chapter our intere at was centered on the index m

at which the switch from one regre13sion regime to another occurred.

In some cases interest may center more on the absciesa of the point of

intersection of the two regression lines. An easy calculaf!lon shows

that this is given by '{ = (a 2 - a 1 )/(13 1 -132 ) •


As in chapter three there are many cases one might consider.

Those discussed in this paper are

(i) Error variance known, first regression known, m known

(ii) Error varianqe known, first regression known, m unknown

(iii) Error variance known, neHher regression known, m known

(iv) Error variance known, neither regression known, m unknown

(v)-(viii) As above, but with the error variance unknown.

The approach taken will be to find the joint posterior density of

the unknown regression parameters under the assignment of diffuse

prior densities, and then to obtain from this the distribution of that

function of these parameters which g~ves the intersection. As

mentioned above, the function whose distribution we require is


43

Before proceeding to study the cases in turn we remark that the

cases corresponding to "second regress·ion kp.own" are completely

analogous to our "first regression known", and for that reason are not

being studied separately here.

2
er · Known, First Regression Known, m Known

The likelihood function for this case is

+
T
~ [Y.-(a 2 +(3 2 X.)]
.i=m +1 1 1
zl
, I
l
I,.
_!)

2
As before we are as sur;ning for convenience that er =1.
Combining this likelihood function with the same diffuse prior densities

used in Chapter III we obtain, in accordance with Bayes' s theorem

TT 1 (a 2 ,(3 2 ) ex: exp{(-1/2).


1=m+l
~
[Yi-(a 2 +(3 2 Xi)] 2 } (4. 1)

for -a:i < a 2 < co and -ro < 13 2 < a:i , Using identity (3. 3) of Chapter

III we can write

(4. 2)

where
T
(T-m) ~ X.
m+l 1
I
(/\m ~m) -1
!::.m = a2 1 t"'z ~
m
=
T T zI
~
m+l
X.
1
~
m+l
X.1 J
44

Am 6m .
Of course a2 and ~2 are the usual ieast squares estimates

of a 2 and ~2 based on the last T-m data points of the sequence.


We now transform to

The joint dens Hy of w1 and w2 is easily seen to be bivariate

normal with the mean vector relocated at

m 1',ffi 6m I
v
....., = ( Q:' 2 - Q:' 1 I ~2 - ~2 )

Finally we make the transformation to

and obtain as the joint density of 'Yi and 'Y 2

for -co < '\1.


I 1
< co, i = 1, 2 where
45

/ \.m, U
Y = ,;;; m + ~ m X
l ""2 1"2 i

Integration on 'Yz now yields the posterior density of the inter,.

sec;:tion

where

It may be shown that if X is a random variable notmally

distributed with mean µ and variance cr 2 then

2 -1/2 2 . 2
E[IXIJ = µ[2cp(µ/cr)- l] + (Ti/2cr) exp{-µ /2cr} (4, 4)

Using (4.4) to obtain E[lvlJ in (4.3) we obtain

(4. 5)

for -oo < -y 1 < oo.

2
er· Known, Fir~t Regression Known, m Unknown

Proceeding as ln the previous case and introducing a disc;rete

uniform prior density on m we obtain as the joint posterior density

of m, a 2 and 13 2

tr I (m, az, ~z) ~ exp{(-1 /Z{~l [Yi - +~I Xi)]z + i=l+/y i- (az+~z Xi)]~}
(al
(4. 6)

for m =2, , •. , T -2 , -ro < a 2 < co and -ro < 13 2 < ro .


46

Using identity (3. 3) we can wrUe (4, 6) as

where

and the remaining notation agrees with that introduced earlier. We

again transform first to w1 and w2 as before and then to

and obtain the joint den51lty

~ 1 (m, y 1, Yz) ~ h(m) IYz I exp {(l /2) A(m, y 1) B 2 (m, y 1)}
exp{(.J/Z)A(m,y 1 )(y 2 • B(m,y 1 )) 2}

where

h(m) = expJ(-l/2)[~[Y.-(al+f31X.)]2+
l i=l 1
~.T [~.m,U_(a1+f31X.)]2
i=m+l 1 1 l

+ ~..
i=mtl
[Y. -
1
~.m'
1
U fJ} . (4, 7)

Integrating with respect to 'iz and summing on m we obtq.in

the posterior distribution of the intersection


47

where -co < -y 1 < co . It is seen that the density in this case is a

mixture of densities of the type (4 1 5) with the mixing density being

given by equation (4. 7).

o- 2 Known, Neither Regression Known, m Known

In this case the joint posterior density of the four regression

parameters is easily seen to be

nI ( a I • ~I ' a 2' ~2 ) exp{ (-1/ Z t~l [


m y i - (a I + ~I Xi) ]2
+.
i=m+l
i [Yi - (a 2 + 132 Xi)J2J}

for -co< a 1 <co and -co< 131 <co, i = 1 1 2.


Applying identity (3. 3) and its counterpart fo+ the first m data

points we may write

(4. 9)

where

E("m)

I
m /\ffi ~m /\ID ;;:;,.m 1
a = (a I' 131' a 2' 132) !::, = {a I ' 131 ' a2 ' 132 )

!Jm L /\m 1-.\m


.):'. i ' = a I + 131 Xi '

and
48

m
m L: x.
1 1

m m
L: x. L: x2
1 1 i
1
L:-1
m
=
T
(T-m) L: X.
m+l 1

T T
L: x.1 L: x2
m+l m+l i

('@{11". ~r) and (0 ;i, ~;1) are the usual least squares estimates of
the regression parameters based on the fl:rst m and last T-m data

points respectively. We now transform to

wl = a 1 - a2

w2 = a2

W3 = '31 - !32

w4 = !32

Substitution in (4, 9) yields quadratic expressions in w2 and

w4 in the exponent and these may be integrated out using the form of

the normal density to obtain

(4. 10)
49

where

Finally we make the transformahion to

'{l = -w 1 /w 3 ·

and obtain as the joint d,ensity of '{ 1 and 'tz

~1(Y1•Y2l ~ IYzl exp{(-l/2)~(m,yl)yi- 2Q(m,y1lYz+ R(m,yl~}


(4. 11)

where

R(m, '{ l)

~i is the predil;!!ted Y at Xi when the regression parameter::; are

estimated by least squq,res over all data points,


50

To integrate out "¥ 2 we can complete the square in the .exponent

and WTite

~l (yl' Y2) ~I Y2 f exp{(-! /2) P(m, Y l i[Y2 - Q(m, Yl )/P(m, Y l ~ 2

+ R(m,y 1 ) - o 2 (m,y 1 )/P(m,y 1 )}

Provided that P(m, 'Y 1 ) is positive it is now straightforward to

use the normal density form to integrate with respect to -y 2 . Of

course P(m, -y 1 ) is a function not only of m and 'Yi bµt also of the

x 1' s from the data. The author has been unable to show ~hat P(m 1 'Y 1 )

is always positive, but is yet to en<;:ounter a situq.tLo'Q. where this

condition is not satisfied,

We shall thus assume that P(m,-y 1 ) ls positive for all 'Yi and

proceed to obtain

where

Applying formula (4. 4) we c;:an wr~te the posterlor dens Uy of the

inter sec;tion as

1Tl (-y 1) a: f J[ (
p -1/2 (m, 'Y1) {" (m, 'Yr )/P(m, 'Y1 ~ 2 cp Q(m, 'Y1) p -1/2. (m, 'Y1 )) - ~ J~
expt-l/2)Q 2 (m,y 1 )/P(m,y 1 ~ + bP(m,y 1 )/~ -l/ 2 },

~co < 'Y 1 < a:i • (4. 12)


51

2
er Known, Neither Regression

Known, m Unknown

The derivation of the posterior distribution of the Lntersection for

this case proceeds exactly as in c;ase three, the additional step being

summation on the range of the shift index m, One obtains

T ... 2 I
1T1h'1) ex: L: p~l 2(m,-yl)
m=2

{ ~(rn, y l )/P(rn, y l ~ [z <i>(Q(m, y l) p-1 /2(m, y l )) • ~


exp~! /2) 0 2 (m, y )/P(m, y
1 1 ~ + tP(m, y 1 )/~ .J/Z}
-co < '{ 1 < CX'.l • (4. 13)

We remind the reader that ~or this case the c;ondition discussed

iq case three must be sat~sfied for each m over the range of the

summation.

2
er Unknown, First Regression

Known, m Known

The likelihood function for the pre sent case is

2
We assign to er the usual improper prior density

2 2 2
1T 0 (er) ex:: l/er, O<er <co,
and retain i:r:pproper uniform prior densities for the :regres13ion para-

meters. The joint posterior 0.ens4ty is then

Tr 1 (cr
2 2 -(T/2+1)
,a 2 ,13 2 )q: (er)
{ 2
exp(-1/2cr)
[m1 1
~ [Yi ... (a 1 +13 1 X)J
2

+.
L.:;mfl
~.. (Yi - (a2 + !32 Xl)]zl}
J (4. 14)

2
for 0 < CT < co , .,.co < a2 < co, and -i;o < 132 < co. In.tegrat'lng fir st

with respect to o- 2 we obtain

(4, 15)

We now derive from (4. 15) the jolp.t density of w 1 and w2 ,

where

and obtain

where

E(m)

and
53

m
m ~x.
l
1
-1
I: .. .::;
m
m m
I: x1 I: x.'?l .
1 1

';['he final transformation is made to

This leads us to the joint density

where

and

We are again faced w~th a problem similar to that discussed in

the third case considered in this ehapter, As surning that G(m, '\' 1 ) is

positive for eaGh 'Yi we can write (4. 17) as


54

and make use of the general t density (22) to integrate with respect

to 'Yz. Although this has not; been proved in general it has been

checked for a number of data sets and was not violated in those cases

checked. We shalL therefore assume that G(m, 'I 1 ) is positive for

i;iach 'I 1 and obtain for the marginal di;insity of 'I 1

where

Tedious but straightforward computation shows that Lf t 1 /Z (X - u)

is distributed like Student's t with n degrees of fret:ldom, then

+ µ ( 2 vn (µ t l /2 ) - 1) , (4.19)

where B(x, y) is the beta funotlon and vn (x) is the cumulative

distribution function of a Student's t random variable with n degrees

of freedom. Applying this result to (4, 18) we may write the density

of the interse~Hon as
55

rrl('(l) cc G-(T-1)/2(m,'(l)A-l/2(m,'(1)
(

{ 2 r(m, y 1 )/G(m, y 1 ~ -l / 2 [(T-2) B((T-1)/2, l/~J-l


I, 2 l-(T-2)/2 (4.20)
L+A(m, y 1 )B (m, y 1 )/G(m, y 1 )J
+ B(m, y 1 ) ~VT-! (B(m, y 1 ) ((T-1) A(m, y 1 )/G(m, y l )) 112)- ~}
for -co < y 1 < co .

2
CJ Unknown, First Regression

Known, m Unknown

The joint posterior density of m, 2 a2 and 13 2 is now


(j '

2 2 -(T/2+1) { 2rm .2
TTl (m, (j 'av 132) cc (CJ ) exp ( .. 1 /2 (j )L~1 [Yi - (a 1+131 Xi)]

+.
1=m+ 1
~
[Yi-(q2tj32Xi)]21}.
J
(4.21)
2
for m =2, ,,. 1 T -2 , O < CJ < co , -co < a 2 < co , and -co < 13 2 < co .

Del'ivation of the posterior density of the intersection proceeds

exactly as in the previous case with the additional step of summation

over the range of the shift index m 1 We thus obtain a mixture of

densities of the type shown in (4. 20), namely

T...,.,-2 G-(T-1) I 2( )A-1 I 2(· )


TTl ("{ 1) cc kJ m, '( 1 m, Y1
m=2
56

r, 2 l·(t.,z)n
~ + A(m, 'i 1 ) B (m, 'i 1 ) I G(m, "'{ l )j
(4. 22 >

+ B(m, y I {2 *-:r-l ( B(m, y 1 )((T-1 I A(m, y 1 )/G(m, y I )) 1 /Z)- ~}


for -co < "'{ 1 < co .

cr 2 Unknown, Neithlllr Regresi;lon

Known, m Known

The joint posterior density of the varianc:e and the regression

parameters is

for 0 < 0" 2 < co . .co < a.L < co ap.d -co < A. < co , i = 1, 2 .
~l

Integrating first on <rz we obtain

(4. 23)

The notation corresponds to that used ln case three, and is

e;xplalned immediately below equation (4. 9). We ne;xt make the trans -

formatLon ho
57

wl = Q!l - Q! 2

W2 = Ql2

w3 = !31 - !32

w4 = !32

as in ~ase three, integrate out w2 and w4 using the general t

density shown in De Groot (22), and then obtain the j oin!l dlstr·ibution of

and

We obtain

where

P(m, 'Ii)

T /\ 2
= l: (Y. ~ Y.)
1= I l i
58

Eq\lation (4. 24) c:;;an be written

where

We are again faced with the problem of the sign of G(m,'Y 1 ), If

G(m, "Yi) is posit~ve we can write

and integrat"lon with respec:t to 'Yz proc;:e~ds eai;Uy using the general

t density referred to earlier, Although we hav~ been unable to prove

that G(m, 'Yi) is a poeitive function, 0£ m anQ. 'Y 1 we shall assume

that this coadition is i;;atisfied and proceed. One thE)n obtains for the

posterior density of 'Y 1

(4, 26)

for .-co <'Yi< co.


59

2
er Unknown, Neither Regression

Known., m Unknown.

The joint posterior density of the unknown parameters in this the

final case is

2 2.-(T/2+1) { 2[m 2
,,. 1 (m,cr , a 1 , [3 1, a 2 , [3 2 ) ex: (cr ) exp (-l/2cr ) i!}Y 1 - (a 1 + [3 1 Xi)]

+. ~
L=.mtl
~
[Y. (a2 + f32
~
X)Jil}.
J
(4,27)

The posterior density of the intetsec:t·Lon is a mi:,ctu.re of densities

of the same type as (4. 26), Its derivation proceeds as in the previous

qase with the adqit·Lona,l step being summation on the shift incl.ex m.

The density ·Ls

{2E'(m,y 1 )/G(m,y 1 l 1 /Z [T-4)B((T-3)/2, l/2)] ·l

~ + Qz(m, y I)/ (P(m, y I) G(m, y I ))J(T-4)/2 +[Q(m, y I )/P(m, YJ ~

~ VT-3 ((T-3)1 /2Q(m, Y1 )/(P(m, Y1) G(m, YJ))l/2)- ~} (4, 28)

for -oo < 'Yi < oo ~

An illustration of densities (4, 22) ;:i.nd (4. 28) will be presented

in the following chapter.


CHAPTER V

INFERENCE PROCEDURES AND

SOME EXAMPLES

In Chapters II, III and IV we determined the postel;'ior distribu ..

Hons, under a variety of assumptions, for an unknown parameter, say

e. The parameter a was a scalar in most cases, although sometimes

e was a vec::tor as we saw in Chapter II with the posterior distribution

of (<!> 0 , <1> 1 ), equation (Z~ 25).

We shall in this chapter describe some ways in which posterior

dietributions are used to make inference, and illustrate some of these

techniques on postedor distributions selected from Chapters II, III and

IV.

All of the computations neces1:1ary to present the examples in this

chapter we re done with programs wriHen by the author ar:i.d run on the

IBM ~ 360, Mod 65 c;omputer at the Oldahoma State Un;lversity

Computer Center,

Plot 0£ The Posterior Density

In cases where the qirpension of the parameter vector is at most

two, a plot of the posterior density (in the univariate case) or contours

of constant posterior density (ln the b·Lvariate case) <;:an be useful.


61

ExamI? le 5. 1

Hinkley ( 13) used some data from an article by Pool and

Borchgrevink (23) to illustrate the techniques which he suggested. In

order to compare our tec;hniques with his 1 we shall use the same set of

data, which is given in Table I below, The independent variable X

represents the logarithm of warfarin c;oneent+ation and the dependent

variable Y i,s blood factor VII production. rrk(m) and rru(m) are

explained after the table,

TABLE I

DATA AND RESULTING POSTERIOR DENSITIES

 m      X_m        Y_m          π_k(m)      π_u(m)
 1      2.00000    0.370483
 2      2.52288    0.537970     0.000057    0.002121
 3      3.00000    0.607684     0.005115    0.011831
 4      3.52288    0.723323     0.031579    0.034446
 5      4.00000    0.761856     0.297597    0.281910
 6      4.52288    0.892063     0.276329    0.266278
 7      5.00000    0.956707     0.351680    0.365501
 8      5.52288    0.940549     0.037518    0.035949
 9      6.00000    0.898609     0.000117    0.001142
10      6.52288    0.953850     0.000006    0.000376
11      7.00000    0.990834     0.000002    0.000322
12      7.52288    0.890291     0.000000    0.000049
13      8.00000    0.990779     0.000000    0.000075
14      8.52288    1.050865
15      9.00000    0.982785

A plot of the data given in Table I suggests that two-phase regression may be an appropriate analysis.

We have computed the posterior density of the shift index m for two cases, namely variance known (equation (3.12)) and variance unknown (equation (3.17)). Both cases assume that neither regression is known. Our results for equation (3.12) are denoted in Table I above by π_k(m), while those for equation (3.17) are denoted by π_u(m). In computing π_k(m) we have taken as the "known" variance the estimate obtained by Hinkley (13), namely σ̂² = 0.00166.

The posterior density π_u(m) is plotted in Figure 1 below.

[Figure 1. Plot of Posterior Density π_u(m)]

The density π_k(m) is not plotted since it very nearly coincides with π_u(m).
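For a present-day reader the figure is easy to reproduce from Table I. The following is a minimal sketch in Python (NumPy and matplotlib assumed available; these tools are, of course, not part of the original computations):

import numpy as np
import matplotlib.pyplot as plt

# pi_u(m) for m = 2, ..., 13, copied from Table I.
m = np.arange(2, 14)
pi_u = [0.002121, 0.011831, 0.034446, 0.281910, 0.266278, 0.365501,
        0.035949, 0.001142, 0.000376, 0.000322, 0.000049, 0.000075]

plt.stem(m, pi_u)                      # discrete posterior of the shift index
plt.xlabel("m")
plt.ylabel("posterior probability")
plt.title("Posterior density of the shift index m")
plt.show()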

There is strong evidence, using either posterior density, that the switch in regimes occurs at m = 5, m = 6, or m = 7. We remark in passing that Hinkley's 99% confidence interval on the intersection of the regression lines is (3.641, 5.441), which contains X₅, X₆, and X₇.

Point Estimates For Parameters

As a point estimate of a parameter θ one may use the mean, median or mode of the posterior distribution. Each estimate can be "justified" in at least one way, depending in some cases on the particular loss function assigned. These estimates are discussed in many modern inference texts, and the reader is referred to Ferguson (24) for a survey of this subject. We shall present here the numerical values of the three point estimates for our data of Example 5.1:

                                      π_k(m)     π_u(m)
Mode of Posterior Distribution:      7.00000    7.00000
Median of Posterior Distribution:    6.00000    6.00000
Mean of Posterior Distribution:      6.05077    6.04998

The variance of the posterior distribution is sometimes given along with measures of central tendency to help describe the distribution. For our example,

Var_k(m) = 0.97123    and    Var_u(m) = 1.09669 .
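As an illustrative sketch (again in Python with NumPy assumed, not part of the original programs), all of the summaries above can be verified directly from the tabled masses; substituting the π_k(m) column gives the known-variance figures:

import numpy as np

# Posterior pi_u(m) from Table I (m = 2, ..., 13).
m = np.arange(2, 14)
pi_u = np.array([0.002121, 0.011831, 0.034446, 0.281910, 0.266278,
                 0.365501, 0.035949, 0.001142, 0.000376, 0.000322,
                 0.000049, 0.000075])

mode = m[np.argmax(pi_u)]                           # 7
median = m[np.searchsorted(np.cumsum(pi_u), 0.5)]   # 6
mean = np.sum(m * pi_u)                             # 6.04998
variance = np.sum(m**2 * pi_u) - mean**2            # 1.09669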

Regions Of Highest Posterior Density

Let π(θ; y) denote the posterior density function for a parameter θ. A region R in the parameter space of θ is called a Highest Posterior Density (H.P.D.) region of content (1 - α) if

(i) Pr(θ ∈ R; y) = 1 - α, and
(ii) for θ₁ ∈ R and θ₂ ∉ R, π(θ₁; y) ≥ π(θ₂; y).

It is immediately clear that if θ is a discrete random variable then an H.P.D. region will not exist for all values of α. In fact, if θ has, say, n distinct mass points of positive density, then H.P.D. regions exist for at most n distinct values of α. For a more complete discussion of H.P.D. regions the reader is referred to a paper by Box and Tiao (25).

For the data of Example 5.1, the set R = {5, 6, 7} is an H.P.D. region for m, based on π_k(m), of content 0.925606, while its content based on π_u(m) is 0.913689.
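A sketch of how such a discrete H.P.D. region may be constructed (mass points are collected in decreasing order of posterior probability until the target content is reached; the function name is illustrative only):

import numpy as np

def discrete_hpd(m, pi, content):
    # Collect mass points in decreasing-density order until the
    # accumulated posterior probability reaches the requested content.
    order = np.argsort(pi)[::-1]
    region, total = [], 0.0
    for i in order:
        region.append(int(m[i]))
        total += pi[i]
        if total >= content:
            break
    return sorted(region), total

# With the pi_k column of Table I and content = 0.90 this returns
# ([5, 6, 7], 0.925606), the region quoted above.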
H.P.D. regions for symmetrical, unimodal distributions are easily obtained by numerical integration. A further illustration will be given later in this chapter in Example 5.2.

Hypothesis Testing

Suppose we have observed data y from an experiment and wish to judge whether or not the data support some specified hypothesis, say H: θ ∈ S₀, about a parameter θ on which the assumed distribution of the observation vector depends. Of course we must have S₀ a subset of S, the entire parameter space of θ. We shall discuss briefly three possible approaches to this problem.

First Approach

If one wishes to accept or reject H at some predetermined significance level, say α, then

(i) Construct a (1 - α)·100% H.P.D. region R for θ.
(ii) Reject H if and only if S₀ ∩ R is empty.

This implies that the posterior probability of H is at most α when H is rejected.

Second Approach

Compute the posterior odds ratio

r_1 = \Pr(\theta \in S_0;\, y) \, / \, \Pr(\theta \notin S_0;\, y),

or alternatively just the posterior probability of S₀, namely

r_1^* = \int_{S_0} \pi_1(\theta;\, y)\, d\theta .

The quantity r₁* is normed in the sense that 0 ≤ r₁* ≤ 1. Clearly large values of either r₁ or r₁* lend credence to H. This procedure seems to be inappropriate if π₁(θ; y) is continuous and S₀ is a single point of S. A possible alternative in such a situation is given in the third approach.



Third Approach

This is based on the ratio of ordinates of the posterior density. The ratio is defined by

r_2 = \sup_{\theta \in S_0} \pi_1(\theta;\, y) \, / \, \sup_{\theta \in S} \pi_1(\theta;\, y) .

Here again 0 ≤ r₂ ≤ 1, and values of r₂ near one lend credence to H. This approach appears to overcome the difficulty posed by r₁ and r₁* when H is a simple hypothesis of the form H: θ = θ₀.
For the data of our Example 5.1, we compute r₁, r₁* and r₂, based on π_u(m), for a test of the hypothesis H: m ∈ S₀, where S₀ = {5, 6, 7}. Here the parameter space is S = {2, ..., 13}. We have

r₁ = 10.586
r₁* = 0.914
r₂ = 1.000 .
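All three quantities follow directly from the tabled posterior; a sketch (π_u(m) as in Table I, the array as before):

import numpy as np

m = np.arange(2, 14)
pi_u = np.array([0.002121, 0.011831, 0.034446, 0.281910, 0.266278,
                 0.365501, 0.035949, 0.001142, 0.000376, 0.000322,
                 0.000049, 0.000075])
in_S0 = np.isin(m, [5, 6, 7])

r1_star = pi_u[in_S0].sum()             # posterior probability of S0: 0.914
r1 = r1_star / (1.0 - r1_star)          # posterior odds ratio: 10.586
r2 = pi_u[in_S0].max() / pi_u.max()     # ratio of ordinates: 1.000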

Other Techniques

There is a variety of other techniques available for applying the posterior density function to particular statistical problems.

Where θ is a vector, marginal posterior distributions of certain parameters may be of interest. In fact, the net result in most sections of this report is a marginal posterior distribution of some kind. Another tool used in certain applications is the predictive density, used to make inferences about data which are to be observed at some future time. Corresponding to H.P.D. regions one may define "prediction regions", or regions of "highest predictive density".



For a more complete survey of uses and applications of the posterior probability density function, the reader is referred to texts by Zellner (26), Lindley (19), La Valle (27) and De Groot (22).

An Ad Hoc Technique

None of the distributions derived in Chapter IV is based on any prior constraints on the point of intersection of the two population regression lines.

In many cases it may be known that the intersection, namely γ = (α₁ - α₂)/(β₂ - β₁), is constrained such that X_m ≤ γ ≤ X_{m+1}, where m is the shift index. If this is too restrictive, then X₁ ≤ γ ≤ X_T would not seem unreasonable for most practical purposes. That is, we require that the regression lines intersect at some point over the observed range of the independent variable X.

When this latter assumption can be made, this author suggests truncating the appropriate distribution over the interval [X₁, X_T] rather than working with it over the entire real line.
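A sketch of the suggested truncation (the callable density stands in for whichever unnormalized posterior ordinate, such as (4.22) or (4.28), is in use; the function name and grid size are illustrative only):

import numpy as np

def truncate_density(density, x1, xT, n=2001):
    # Evaluate the unnormalized ordinate on a grid over [x1, xT] and
    # renormalize by the trapezoid rule so that it integrates to one
    # over the truncation interval.
    grid = np.linspace(x1, xT, n)
    vals = np.array([density(g) for g in grid])
    area = np.sum((vals[1:] + vals[:-1]) / 2.0 * np.diff(grid))
    return grid, vals / area

H.P.D. intervals of any content can then be read off the grid by lowering a horizontal cut on the normalized ordinates until the enclosed probability reaches the desired 1 - α.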

We shall now illustrate this technique on some of the distributions derived in Chapter IV, using the data of Example 5.1. Hinkley (13) gave confidence intervals for the cases "second regression known" (with slope equal to zero) and "neither regression known". We shall follow his lead in Examples 5.2 and 5.3 respectively.

Example 5.2 (Second Regression Known)

We assume here that β₂ = 0 and that α₂ = 0.961674. This value of α₂ is the average of the last nine Yᵢ's in the sequence, since the median of our density π_u(m) was six and the mean was near six. It also agrees to four decimal places with the estimate from Hinkley's paper. In Figure 2 below we show a plot of the posterior density (4.22) for the case "second regression known". The density has been truncated on the interval [X₁, X_T]. Since the ordinates of the density are relatively small outside of the interval [3, 6], the density is shown only over that interval.

[Figure 2. Posterior Density (4.22): Second Regression Known]

From this density we calculated the mode as well as H.P.D. regions of content 0.90, 0.95 and 0.99. For comparative purposes we now present the results from our posterior density as well as those from Hinkley's paper:

                 M.L.E./Mode   90% Interval    95% Interval    99% Interval
Hinkley          4.88          (4.55, 5.29)    (4.45, 5.39)    (4.25, 5.66)
Density (4.22)   4.89          (4.54, 5.31)    (4.46, 5.43)    (4.28, 5.79)

Example 5.3 (Neither Regression Known)

For the same data set as we have used in the previous examples, we plot in Figure 3 below the posterior density (4.28) truncated on [X₁, X_T]. Again it is plotted only over the interval [3, 6].

[Figure 3. Posterior Density (4.28): Neither Regression Known]

A comparison of our H.P.D. regions from density (4.28) with the confidence intervals given by Hinkley is again presented:

                 M.L.E./Mode   90% Interval    95% Interval    99% Interval
Hinkley          4.65          (4.16, 5.10)    (4.09, 5.20)    (3.64, 5.44)
Density (4.28)   4.60          (4.04, 5.17)    (3.85, 5.33)    (3.32, 5.78)

Some General Comments

The author feels that the H.P.D. regions obtained from densities (4.22) and (4.28) compare quite favorably with the confidence intervals given by Hinkley. In making such a comparison, one must keep in mind that Hinkley's confidence intervals are based on the restriction that X_m ≤ γ ≤ X_{m+1}, whereas the H.P.D. regions derived from our posterior distribution are subject to no such restrictions.

A further consideration is that the confidence intervals derived by Hinkley are based on the asymptotic chi-square distributions of the likelihood ratio statistics, with the result that the approximation for small to moderate sample sizes is somewhat questionable. On the other hand, the posterior distributions given in this report are exact, allowing computation of H.P.D. regions to any desired degree of accuracy. We must caution, though, that we have not established whether or not the posterior distributions derived in Chapter IV are unimodal. A bimodal distribution may lead to an H.P.D. region which is not a single interval. Hinkley (13) also points out that the confidence interval algorithm suggested by him does not necessarily lead to a single interval.

With regard to computational considerations, the procedures given here present little problem. The discrete distributions given in Chapter III for the shift point obviously present no difficulty, while computation of the distributions of the intersection given in Chapter IV requires only referral to the standard normal distribution function (variance known) and Student's t distribution function (variance unknown).

We close this chapter with a comment on the incompleteness of the present study. A complete analysis of a two-phase regression situation would of course include estimates not only of the shift point and intersection, but also of the regression parameters and the error variance. We have directed our attention almost exclusively to the switch point and the intersection.


CHAPTER VI

SUMMARY AND POSSIBLE EXTENSIONS

Summary Of The Study

This research was undertaken with the intention of making some contribution to the general problem of estimating the time point at which a parameter change occurs in an observed sequence of random variables. No attempt has been made to study the related problem of detecting whether or not a parameter shift has occurred in the sequence.

A Bayesian approach was employed for each of the cases studied, and vague type prior densities were assigned, resulting in posterior distributions appropriate to situations where prior knowledge is imprecise. We remark here that even if prior knowledge does not fit this description, it may still be informative to look at an analysis under the assignment of diffuse prior densities.

In Chapter II we directed attention to the special case of a normal sequence with unknown variance, and derived posterior densities corresponding to a variety of assumptions on the parameters of the problem.

Chapters III and IV focused on the more general setting known as two-phase regression. In particular, in Chapter III we derived posterior densities for the shift index itself, while in Chapter IV we studied the problem of estimating the abscissa of the point of intersection of the two regression lines.

In Chapter V we surveyed some of the possible uses to which the posterior distributions may be put, and gave examples based on a set of data used by Hinkley (13) to illustrate his solution to the same general problem.

Some Possible Extensions

Little attempt has been made in this study to investigate the properties of the posterior distributions derived in Chapters II, III and IV. One of the reasons for this is the apparent algebraic complexity of the form of the distributions. This point is well illustrated by equation (4.28), for example. A considerable amount of numerical work remains to be done with regard to means, modes and variances of the distributions.

Again, we have not directed any attention to the problem of estimating the error variance in the two-phase regression setting. Its posterior distribution could, hopefully, be computed, and the estimators resulting from it could be compared to those given by Hinkley (13). The same is true for the regression parameters of the two regressions under study.

Another obvious and natural extension would be to the multivariate case, where at each time point one obtains a vector of observations rather than a single observation. In this case, of course, the φᵢ's of Chapter II would be replaced by a vector of means, while σ² would be replaced by a variance-covariance or dispersion matrix.

Finally we point out that the study by Hinkley (13) assumes that γ, the abscissa of the intersection point of the two regression lines, is constrained by X_m ≤ γ < X_{m+1}, where m is the (usually unknown) switch point of the sequence. The author has investigated this situation in the Bayesian framework for the first case, namely where m, σ² and the first regression are all known. The resulting posterior distribution for γ is simply our distribution (4.5) truncated on the interval [X_m, X_{m+1}). One could proceed to study some of the more complex cases under this added restriction to see how the resulting posterior distributions compare to those arrived at in this dissertation.


A SELECTED BIBLIOGRAPHY

(1) Page, E. S. "Continuous Inspection Schemes." Biometrika, Vol. 41, 1954, 100-115.

(2) Page, E. S. "A Test For A Change In A Parameter Occurring At An Unknown Time Point." Biometrika, Vol. 42, 1955, 523-527.

(3) Page, E. S. "On Problems In Which A Change In A Parameter Occurs At An Unknown Time Point." Biometrika, Vol. 44, 1957, 248-252.

(4) Quandt, R. E. "The Estimation Of The Parameters Of A Linear Regression System Obeying Two Separate Regimes." Journal of the American Statistical Association, Vol. 53, 1958, 873-880.

(5) Quandt, R. E. "Tests Of The Hypothesis That A Linear Regression Obeys Two Separate Regimes." Journal of the American Statistical Association, Vol. 55, 1960, 324-330.

(6) Sprent, P. "Some Hypotheses Concerning Two-Phase Regression Lines." Biometrics, Vol. 17, 1961, 634-645.

(7) Chernoff, H. and Zacks, S. "Estimating The Current Mean Of A Normal Distribution Which Is Subjected To Changes In Time." Annals of Mathematical Statistics, Vol. 35, 1964, 999-1018.

(8) Kander, Z. and Zacks, S. "Test Procedures For Possible Changes In Parameters Of Statistical Distributions Occurring At Unknown Time Points." Annals of Mathematical Statistics, Vol. 37, 1966, 1196-1210.

(9) Bhattacharyya, G. K. and Johnson, R. A. "Nonparametric Tests For Shift At An Unknown Time Point." Annals of Mathematical Statistics, Vol. 39, 1968, 1731-1743.

(10) Brown, R. L. and Durbin, J. "Methods Of Investigating Whether A Regression Relationship Is Constant Over Time." Selected Statistical Papers I, Amsterdam: Mathematisch Centrum, European Meeting, 1968, 37-45.

(11) Tukey, J. "The Future Of Data Analysis." Annals of Mathematical Statistics, Vol. 33, 1962, 1-67.

(12) Hinkley, D. V. "Inference About The Intersection In Two-Phase Regression." Biometrika, Vol. 56, 1969, 495-504.

(13) Hinkley, D. V. "Inference In Two-Phase Regression." Journal of the American Statistical Association, Vol. 66, 1971, 736-743.

(14) Hinkley, D. V. "Inference About The Change Point In A Sequence Of Random Variables." Biometrika, Vol. 57, 1970, 1-17.

(15) Hinkley, D. V. and Hinkley, E. A. "Inference About The Change Point In A Sequence Of Binomial Random Variables." Biometrika, Vol. 57, 1970, 477-488.

(16) Broemeling, L. D. "Some Bayesian Inferences About A Changing Sequence Of Random Variables." Submitted to Utilitas Mathematica, 1972.

(17) Broemeling, L. D. "Bayesian Procedures For Detecting A Change In A Sequence Of Random Variables." Submitted to Metron, 1972.

(18) Quandt, R. E. "A New Approach To Estimating Switching Regressions." Journal of the American Statistical Association, Vol. 67, 1972, 306-310.

(19) Lindley, D. V. Introduction to Probability and Statistics from a Bayesian Viewpoint. Cambridge University Press, Cambridge, 1965.

(20) Jeffreys, H. Theory of Probability. Clarendon Press, Oxford, 1961.

(21) Royden, H. L. Real Analysis. The Macmillan Company, New York, 1968.

(22) De Groot, M. H. Optimal Statistical Decisions. McGraw-Hill Book Company, New York, 1970.

(23) Pool, J. and Borchgrevink, C. F. "Comparison Of Rat Liver Response To Coumarin Administered In Vivo Versus In Vitro." American Journal of Physiology, Vol. 206, 1964, 229-238.

(24) Ferguson, Thomas S. Mathematical Statistics: A Decision Theoretic Approach. Academic Press, New York, 1967.

(25) Box, G. E. P. and Tiao, G. C. "Multi-parameter Problems From A Bayesian Point Of View." Annals of Mathematical Statistics, Vol. 36, 1965, 1468-1482.

(26) Zellner, A. An Introduction to Bayesian Inference in Econometrics. John Wiley and Sons, New York, 1971.

(27) La Valle, I. H. An Introduction to Probability, Decision and Inference. Holt, Rinehart and Winston, New York, 1970.
VITA

Donald Holbert

Candidate for the Degree of

Doctor of Philosophy

Thesis: A BAYESIAN ANALYSIS OF SHIFTING SEQUENCES WITH APPLICATIONS TO TWO-PHASE REGRESSION

Major Field: Statistics

Biographical:

Personal Data: Born in Tea Gardens, New South Wales, Australia, August 27, 1941, the son of Mr. and Mrs. Frederick George Holbert.

Education: Received secondary school education at Maitland High School, Maitland, N.S.W., Australia; graduated from Newcastle Teachers' College, Newcastle, N.S.W., Australia, in 1960; received Bachelor of Science degree in mathematics from the University of Oregon, Eugene, Oregon, in May, 1967; received Master of Arts degree in mathematics from Washington State University, Pullman, Washington, in May, 1969; completed requirements for the Degree of Doctor of Philosophy at Oklahoma State University in May, 1973.

Professional Experience: Mathematics Instructor at Cronulla High School, Cronulla, N.S.W., Australia, 1961-1962; Mathematics Instructor at Farrer Memorial Agricultural High School, Tamworth, N.S.W., Australia, 1963-1965; graduate teaching assistant at Washington State University, Pullman, Washington, 1967-1969; graduate research assistant at Oklahoma State University, Stillwater, Oklahoma, 1969-1973; member of the American Statistical Association and the Biometric Society.
