Lecture 10 Spring 2017
Counts data
Example: tortoise species data
The Galapagos Islands, off the coast of Ecuador, are excellent
locations for studying the factors that influence the development
and survival of species. For each island, the data set gives the
total number of tortoise species and the number of species that
occur only on that island (the endemics)
(Johnson and Raven, 1973).
Poisson distribution for counts data
\[ Y_i \sim \mathrm{Poisson}(\mu_i), \quad i = 1, \dots, n. \]
Link function
\[ \log(\mu_i) = X_i^T \beta, \]
\[ \mu_i = \exp(X_i^T \beta). \]
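To make the log link concrete, here is a minimal Python sketch that maps a linear predictor to a Poisson mean. The coefficients and covariate rows are hypothetical values chosen for illustration, and the helper name `poisson_mean` is not from the lecture.

```python
import math

def poisson_mean(x_row, beta):
    """Return mu_i = exp(x_i^T beta) for one covariate row."""
    eta = sum(x * b for x, b in zip(x_row, beta))  # linear predictor x_i^T beta
    return math.exp(eta)

# Hypothetical coefficients and covariate rows (first entry is the intercept).
beta = [0.5, 0.2]
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]

mu = [poisson_mean(row, beta) for row in X]

# Under the log link, a one-unit increase in the covariate multiplies
# the mean by exp(beta_1), whatever the starting value.
ratios = [mu[i + 1] / mu[i] for i in range(len(mu) - 1)]
```

Because the mean is an exponential, it is positive for any value of the linear predictor, and coefficients act multiplicatively on the mean rather than additively.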
Maximum likelihood estimator
Score function and Hessian matrix
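The slide's own derivation is not reproduced in this text. As a sketch, the standard result for the Poisson model with log link is given below. The symbols $U$ and $H$ and the definition $V = \mathrm{diag}(\mu_1, \dots, \mu_n)$ are assumptions introduced here, with $X$ the matrix whose rows are $X_i^T$.

```latex
\begin{align*}
\ell(\beta) &= \sum_{i=1}^{n} \{ Y_i X_i^T \beta - \exp(X_i^T \beta) \} + \text{Const.}, \\
U(\beta) = \frac{\partial \ell}{\partial \beta}
  &= \sum_{i=1}^{n} (Y_i - \mu_i) X_i = X^T (Y - \mu), \\
H(\beta) = \frac{\partial^2 \ell}{\partial \beta \, \partial \beta^T}
  &= -\sum_{i=1}^{n} \mu_i X_i X_i^T = -X^T V X,
  \qquad V = \mathrm{diag}(\mu_1, \dots, \mu_n).
\end{align*}
```

Since $H(\beta)$ is negative semidefinite, the log-likelihood is concave in $\beta$, so a solution of $U(\beta) = 0$ is a maximum likelihood estimate.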
Asymptotic normality of β̂
\[ \hat{\beta} - \beta \sim N\big(0, (X^T V X)^{-1}\big). \]
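The following sketch shows how this approximation might be used to get standard errors. It fits a two-parameter model log(mu_i) = b0 + b1 * x_i by Newton-Raphson and inverts X^T V X in closed form. It assumes V = diag(mu_1, ..., mu_n), the usual Poisson weight matrix, since the slide defining V is not in this text. The data and the function names are hypothetical.

```python
import math

def score_and_info(x, y, b0, b1):
    """Score X^T (y - mu) and the entries of X^T V X, with V = diag(mu)."""
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    u0 = sum(yi - mi for yi, mi in zip(y, mu))
    u1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
    a = sum(mu)                                     # entry (1, 1)
    b = sum(xi * mi for xi, mi in zip(x, mu))       # entries (1, 2) and (2, 1)
    c = sum(xi * xi * mi for xi, mi in zip(x, mu))  # entry (2, 2)
    return (u0, u1), (a, b, c)

def fit_poisson(x, y, n_iter=25):
    """Newton-Raphson for log(mu_i) = b0 + b1 * x_i; returns (beta, cov)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start at the intercept-only fit
    for _ in range(n_iter):
        (u0, u1), (a, b, c) = score_and_info(x, y, b0, b1)
        det = a * c - b * b
        # Newton step: beta += (X^T V X)^{-1} * score, using the 2x2 inverse.
        b0 += (c * u0 - b * u1) / det
        b1 += (a * u1 - b * u0) / det
    # Approximate covariance of beta-hat: (X^T V X)^{-1} at the final estimate.
    _, (a, b, c) = score_and_info(x, y, b0, b1)
    det = a * c - b * b
    cov = [[c / det, -b / det], [-b / det, a / det]]
    return (b0, b1), cov

# Hypothetical toy data, for illustration only.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1, 3, 4, 9, 14]

(b0_hat, b1_hat), cov = fit_poisson(x, y)
se = [math.sqrt(cov[0][0]), math.sqrt(cov[1][1])]
# Approximate 95% Wald interval for the slope.
wald_ci_b1 = (b1_hat - 1.96 * se[1], b1_hat + 1.96 * se[1])
```

The standard errors are the square roots of the diagonal of the inverse information matrix, and the Wald interval relies on the normal approximation, so it is only trustworthy for reasonably large samples.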
Deviance
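The log-likelihood for $\mu_i$ in a saturated model, where each fitted
mean equals the observation ($\hat{\mu}_i = Y_i$), is
\[ \ell(\mu_i) = \sum_{i=1}^{n} \{ Y_i \log(Y_i) - Y_i \} + \text{Const.} \]
The log-likelihood for $\mu_i$ under the regression model with
$\mu_i = \exp(X_i^T \beta)$ is
\[ \ell(\beta) = \sum_{i=1}^{n} \{ Y_i \log(\hat{\mu}_i) - \hat{\mu}_i \} + \text{Const.} \]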
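The deviance is conventionally defined as twice the gap between these two log-likelihoods, D = 2 * sum{ Y_i log(Y_i / mu_hat_i) - (Y_i - mu_hat_i) }, with the convention 0 * log(0) = 0. The slide stating the definition is not in this text, so treat that formula as an assumption. A minimal Python sketch on hypothetical counts:

```python
import math

def poisson_deviance(y, mu_hat):
    """Twice the saturated log-likelihood minus the fitted log-likelihood."""
    total = 0.0
    for yi, mi in zip(y, mu_hat):
        # Convention: y * log(y / mu) is taken to be 0 when y == 0.
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total

# Hypothetical counts, for illustration only.
y = [1, 3, 4, 9, 14]

# Intercept-only fit: the Poisson MLE of a common mean is the sample mean.
mu_null = [sum(y) / len(y)] * len(y)
d_null = poisson_deviance(y, mu_null)

# Saturated fit: mu_hat_i = Y_i, so the deviance is zero by construction.
d_sat = poisson_deviance(y, [float(v) for v in y])
```

A smaller deviance means the fitted means sit closer to the observations. The constants in the two log-likelihoods cancel in the difference, which is why they never need to be computed.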
Over or under dispersion
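\[ E(Y_i) = \mathrm{Var}(Y_i) = \mu_i. \]
Note that the Poisson model forces the mean and variance to be
equal. This restriction is often too rigid in practice: the variance
of observed counts can be larger than the mean (overdispersion) or
smaller than it (underdispersion).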
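A quick informal check of the mean-variance relationship is to compare the sample mean and sample variance of a set of counts. The sketch below uses hypothetical counts and ignores covariates, so it is only a rough diagnostic for a sample with a common mean.

```python
import statistics

# Hypothetical counts, for illustration only.
counts = [0, 2, 1, 7, 0, 12, 3, 0, 9, 1]

mean = statistics.mean(counts)
var = statistics.variance(counts)  # sample variance, divisor n - 1

# Ratio near 1 is consistent with a Poisson model; well above 1 suggests
# overdispersion, well below 1 suggests underdispersion.
dispersion_index = var / mean
```

In a regression setting the means differ across observations, so the comparison should be made against the fitted means rather than a single sample mean.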
Quasi-likelihood
Estimation of dispersion parameter
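The estimator used in the lecture is not preserved in this text. One common choice is the Pearson statistic divided by the residual degrees of freedom, phi_hat = sum{ (Y_i - mu_hat_i)^2 / mu_hat_i } / (n - p). The sketch below assumes that estimator. The counts and the fitted means are hypothetical stand-ins for the output of a fitted model with p = 2 parameters.

```python
import math

def pearson_dispersion(y, mu_hat, n_params):
    """Pearson chi-square statistic divided by residual degrees of freedom."""
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu_hat))
    return chi2 / (len(y) - n_params)

# Hypothetical counts and stand-in fitted means, for illustration only.
y = [1, 3, 4, 9, 14]
mu_hat = [1.5, 2.6, 4.6, 8.1, 14.2]

phi_hat = pearson_dispersion(y, mu_hat, n_params=2)

# Under a quasi-Poisson model with Var(Y_i) = phi * mu_i, standard errors
# from the Poisson fit are multiplied by sqrt(phi_hat).
se_scale = math.sqrt(phi_hat)
```

A value of phi_hat close to 1 is consistent with the Poisson variance assumption, while values far from 1 indicate that the Poisson standard errors should be rescaled.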