Poisson Regression for Regression of Counts and Rates

Edps/Psych/Soc 589

Carolyn J. Anderson

Department of Educational Psychology

© Board of Trustees, University of Illinois


Outline Poisson regression for counts Crab data SAS/R Poisson regression for rates Lung cancer SAS/R

Outline
GLMs for count data.
Poisson regression for counts.
Poisson regression for rates.
Inference and model checking.
Wald, Likelihood ratio, & Score test.
Checking Poisson regression.
Residuals.
Confidence intervals for fitted values (means).
Overdispersion.
Fitting GLMs (a little technical).
Newton-Raphson algorithm/Fisher scoring.
Statistical inference & the likelihood function.
“Deviance”.
Summary
C.J. Anderson (Illinois) Poisson Regression 2.1/ 59

GLMs for count data


Situation: the response/outcome variable Y is a count.
Generalized linear models for counts have the Poisson distribution as
their random component.
Examples:
Number of cargo ships damaged by waves (classic example given by
McCullagh & Nelder, 1989).
Number of deaths due to AIDS in Australia per quarter (3-month
periods) from January 1983 – June 1986.
Number of violent incidents exhibited over a 6 month period by
patients who had been treated in the ER of a psychiatric hospital
(Gardner, Mulvey, & Shaw, 1995).
Daily homicide counts in California (Grogger, 1990).
Foundings of day care centers in Toronto (Baum & Oliver, 1992).
Political party switching among members of the US House of
Representatives (King, 1988).

More Examples. . .
Number of presidential appointments to the Supreme Court (King,
1987).
Number of children in a classroom that a child lists as being their
friend (unlimited nomination procedure, sociometric data).
Number of hard disk failures at UIUC during a year.
Number of deaths due to SARS (Yu, Chan & Fung, 2006).
Number of arrests resulting from 911 calls.
Number of orders of protection issued.
In some of these examples, we should consider “exposure” to the event,
denoted “t”.
For example, for hard disk failures, “exposure” could be the number of
hours of operation. Rather than model the number of failures (i.e., counts),
we would want to measure and model the failure “rate”:

Y/t = rate

Poisson regression for counts

Response Variable is a count


Explanatory Variable(s):
If they are categorical (i.e., you have a contingency table with counts
in the cells), convention is to call them “Log-linear models”.
If they are numerical/continuous, convention is to call them “Poisson
Regression”

First we consider Y = count data, and then Y/t = rate data.


Components of GLM for Counts


Random component: Y follows a Poisson distribution, and we model the
expected value of Y, denoted by E(Y) = µ.
Systematic component: For now, just 1 explanatory variable x (later,
we’ll go over an example with more than 1).
Link: We could use
Identity link, which gives us µ = α + βx
Problem: a linear model can yield µ < 0, while the mean of a count
cannot be negative (µ ≥ 0).
Log link (much more common): log(µ) is the “natural parameter” of
the Poisson distribution, and the log link is the “canonical
link” for GLMs with a Poisson distribution.
The Poisson regression model for counts (with a log link) is
log(µ) = α + βx
This is often referred to as the “Poisson loglinear model”.

The Poisson log-linear model

log(µ) = α + βx
Since the log of the expected value of Y is a linear function of the explanatory
variable(s), the expected value of Y is a multiplicative function of x:

µ = exp(α + βx)
  = e^α e^{βx}

What does this mean for µ?


How do we interpret β?


Interpretation of β

log(µ) = α + βx
Consider 2 values of x (x1 & x2 ) such that the difference between them
equals 1. For example, x1 = 10 and x2 = 11:

x2 = x1 + 1

The expected value of Y when x = x1 = 10 is

µ1 = e^α e^{βx1} = e^α e^{β(10)}

The expected value of Y when x = x2 = 11 is

µ2 = e^α e^{βx2}
   = e^α e^{β(x1 + 1)}
   = e^α e^{βx1} e^β
   = e^α e^{β(10)} e^β

A change in x has a multiplicative effect on the mean of Y .



Interpretation of β (continued)
When we look at a 1 unit increase in the explanatory variable (i.e.,
x2 − x1 = 1), we have

µ1 = e^α e^{βx1} and µ2 = e^α e^{βx1} e^β

If β = 0, then e^0 = 1 and
µ1 = e^α.
µ2 = e^α.
µ = E(Y) is not related to x.
If β > 0, then e^β > 1 and
µ1 = e^α e^{βx1}.
µ2 = e^α e^{βx2} = e^α e^{βx1} e^β = µ1 e^β.
µ2 is e^β times as large as µ1 (an increase).
If β < 0, then 0 < e^β < 1 and
µ1 = e^α e^{βx1}.
µ2 = e^α e^{βx2} = e^α e^{βx1} e^β = µ1 e^β.
µ2 is e^β times as large as µ1 (a decrease).
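The multiplicative interpretation can be checked numerically. The following is a minimal Python sketch (Python is used here only for illustration; the slides themselves use SAS and R). The values of alpha and beta are hypothetical, chosen only to show that the ratio µ2/µ1 equals e^β whenever x2 = x1 + 1.

```python
import math

def poisson_mean(alpha, beta, x):
    """Mean of Y under the log-link model log(mu) = alpha + beta * x."""
    return math.exp(alpha + beta * x)

# Hypothetical parameter values, chosen only for illustration.
alpha, beta = 0.5, 0.2

mu1 = poisson_mean(alpha, beta, 10)   # mean at x1 = 10
mu2 = poisson_mean(alpha, beta, 11)   # mean at x2 = x1 + 1
ratio = mu2 / mu1                     # multiplicative effect of a 1 unit increase

print(f"mu1 = {mu1:.4f}, mu2 = {mu2:.4f}")
print(f"mu2 / mu1 = {ratio:.4f}, exp(beta) = {math.exp(beta):.4f}")
```

The ratio does not depend on alpha or on the starting value x1, which is why a single number, e^β, summarizes the effect of x.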

Example: Number of Deaths Due to AIDS


Whyte et al., 1987 (Dobson, 1990) reported the number of deaths due to
AIDS in Australia per 3-month period from January 1983 – June 1986.

yi = number of deaths
xi = time point (quarter)

 xi   yi      xi   yi
  1    0       8   18
  2    1       9   23
  3    2      10   31
  4    3      11   20
  5    1      12   25
  6    4      13   37
  7    9      14   45


Data: Number of Deaths Due to AIDS × Month


A Linear Model for AIDS Data


Let’s try a linear model:
µi = α + βxi
The estimated parameters from a GLM with a Poisson distribution and the
identity link:
µ̂i = −6.7355 + 2.4287xi
In the SAS output, there are strange things such as
Standard errors for estimated parameters equal to 0.
Some 0’s in the OBSTATS.
From the SAS log file. . .
WARNING: The specified model did not converge.
ERROR: The mean parameter is either invalid or at a limit of its range for
some observations.
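One way to see why the software complains: plugging the early quarters into the estimated identity-link equation gives negative means, which a Poisson distribution cannot have. A minimal Python sketch (Python is used only for illustration; the fit itself was done in SAS), using the estimates reported above:

```python
# Identity-link estimates reported above (from the SAS fit that did not converge).
alpha_hat, beta_hat = -6.7355, 2.4287

def identity_link_mean(x):
    """Fitted mean under the identity link: mu = alpha + beta * x."""
    return alpha_hat + beta_hat * x

# Fitted means for quarters 1 through 14.
fitted = {x: identity_link_mean(x) for x in range(1, 15)}

# A Poisson mean must be positive, so flag the quarters where the fit is not.
invalid_quarters = [x for x, mu in fitted.items() if mu <= 0]

for x in invalid_quarters:
    print(f"quarter {x}: fitted mean {fitted[x]:.2f} is not a valid Poisson mean")
```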

A Linear Model for AIDS Data

R is even worse; the fit fails outright:
poi0 <- glm(count ~ month, data = aids, family = poisson(link = "identity"))

Error: no valid set of coefficients has been found: please supply starting
values


A Look at the Bad Model (identity link)


Back to Data but Plot log(yi) by Month

(line is linear regression line)


Poisson Log-Linear Model for Deaths

The figure suggests a log link might work better:

log(µ̂i) = .3396 + .2565xi

Fitted values µ̂i when the link is log and when it is identity:

 xi   yi    Log    Identity      xi   yi    Log    Identity
  1    0    1.82    −4.31         8   18   10.93    12.69
  2    1    2.35    −1.88         9   23   14.13    15.12
  3    2    3.03     0.55        10   31   18.26    17.55
  4    3    3.92     2.98        11   20   23.60    19.98
  5    1    5.06     5.41        12   25   30.51    22.41
  6    4    6.56     7.84        13   37   39.43    24.84
  7    9    8.46    10.27        14   45   50.96    27.27

. . . and it looks like it fits much better.
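The log-link estimates can be reproduced without any statistical package. The sketch below is a hand-rolled Newton-Raphson iteration for the two-parameter model log(µ) = α + βx, written in Python purely for illustration; it is not a description of what SAS or R do internally, and the function name, starting values, and iteration count are assumptions of this sketch.

```python
import math

# AIDS deaths per quarter, as tabulated in the slides.
x = list(range(1, 15))
y = [0, 1, 2, 3, 1, 4, 9, 18, 23, 31, 20, 25, 37, 45]

def fit_poisson_loglink(x, y, iterations=25):
    """Newton-Raphson for log(mu) = a + b*x with a Poisson likelihood."""
    a, b = math.log(sum(y) / len(y)), 0.0   # start from the intercept-only fit
    for _ in range(iterations):
        mu = [math.exp(a + b * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian of the log-likelihood).
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information) * step = score.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

a_hat, b_hat = fit_poisson_loglink(x, y)
print(f"log(mu_hat) = {a_hat:.4f} + {b_hat:.4f} x")
```

With the log link the information matrix equals the negative Hessian, so Newton-Raphson and Fisher scoring coincide for this model.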



Figure of Fitted log(count) from Log-linear


Figure of Fitted log(count) from Log-linear

Pattern in residuals.


Transform explanatory variable


The numbers of deaths at low & high values of xi are over-predicted
(“over-fit”) and the numbers at middle xi’s are under-predicted (“under-fit”).
Transform xi −→ x∗i = log(xi)


Poisson Regression with Transformed x


The estimated GLM has
Random: Y follows a Poisson distribution.
Systematic: α + β log(xi) = α + βx∗i
Link: Log −→ log(µ).
As a log-linear model
log(µ̂i) = −1.9442 + 2.1748x∗i
or equivalently, as a multiplicative model
µ̂i = e^{−1.9442} e^{2.1748x∗i} = e^{−1.9442} xi^{2.1748}

Interpretation: For a 1 unit increase in log(month), the estimated count
increases by a factor of e^{2.1748} = 8.80.
Is this “large”?

How Large is Large in a Statistical Sense?

SAS/GENMOD provides asymptotic (i.e., large sample) standard errors (ASE)
for the parameter estimates.
The ASE for β̂ equals .2151, and an approximate 95% confidence interval is

β̂ ± 2(.2151) −→ (1.745, 2.605)

which excludes 0 and so suggests that this is large in a statistical sense.
Or in terms of the scale of the data,

(exp(1.745), exp(2.605)) −→ (5.726, 13.531)
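The interval arithmetic above can be sketched in a few lines of Python (used only for illustration). The multiplier 2 is the slide's approximation to the usual 1.96, and the variable names are this sketch's own.

```python
import math

beta_hat, ase = 2.1748, 0.2151   # estimate and asymptotic standard error from above

# Approximate 95% interval on the log scale, using the multiplier 2.
lower = beta_hat - 2 * ase
upper = beta_hat + 2 * ase

# Exponentiate the endpoints to get an interval for the multiplicative factor.
factor_lower = math.exp(lower)
factor_upper = math.exp(upper)

print(f"beta:      ({lower:.3f}, {upper:.3f})")
print(f"exp(beta): ({factor_lower:.3f}, {factor_upper:.3f})")
```

The slide rounds the endpoints to three decimals before exponentiating, so the last digit of the exponentiated lower endpoint can differ slightly from the value computed here.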


Observed and Fitted Log(Counts)


Observed and Fitted Counts


Comparison of Fitted Counts


Fitted counts µ̂i from the three models (predictor and link):

            Predictor log(xi)   Predictor xi   Predictor xi
 xi   yi    Log link            Log link       Identity link
  1    0       .14                1.82           −4.31
  2    1       .65                2.35           −1.88
  3    2      1.56                3.03            0.55
  4    3      2.92                3.92            2.98
  5    1      4.74                5.06            5.41
  6    4      7.05                6.56            7.84
  7    9      9.86                8.46           10.27
  8   18     13.17               10.93           12.69
  9   23     17.02               14.13           15.12
 10   31     21.40               18.26           17.55
 11   20     26.33               23.60           19.98
 12   25     31.82               30.51           22.41
 13   37     37.87               39.43           24.84
 14   45     44.49               50.96           27.27

Comparison in Log-Scale


Observed and Fitted Counts


More Interpretation of Poisson Regression


The marginal effect of xi (month period) on µi (expected number of
deaths due to AIDS):
For a 1 unit increase in log(month), the estimated count increases
by a factor of e^{2.1748} = 8.80.
We computed fitted values and compared them to the observed values (see the
table and plots).
An additional one: We can look at the predicted probability of the number of
deaths given a value of xi. (This is not too useful here, but would be of
use in a predictive setting.)
Counts follow a Poisson distribution, so

P(Yi = y) = e^{−µi} µi^y / y!

According to our estimated model, the probability that the number of deaths
equals y for a particular value of xi is

P(Yi = y) = exp(−e^{−1.9442 + 2.1748x∗i}) (e^{−1.9442 + 2.1748x∗i})^y / y!

Probabilities of Number of Deaths


P(Yi = y) = exp(−e^{−1.9442 + 2.1748x∗i}) (e^{−1.9442 + 2.1748x∗i})^y / y!

or, since we already have µ̂i computed, we can use

P(Yi = y) = e^{−µ̂i} µ̂i^y / y!

For example, consider quarter = 3 (and log(3) = 1.09861); we have

µ̂(quarter = 3) = 1.5606

P(Y3 = 0) = e^{−1.5606} (1.5606)^0 / 0! = .210
P(Y3 = 1) = e^{−1.5606} (1.5606)^1 / 1! = .328
P(Y3 = 2) = e^{−1.5606} (1.5606)^2 / 2! = .256
. . .
P(Y3 = 10) = e^{−1.5606} (1.5606)^10 / 10! = .00000496
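These probabilities follow directly from the Poisson formula and the fitted mean. A minimal Python sketch (for illustration only; the helper name poisson_pmf is this sketch's own) that recomputes them from the estimated coefficients:

```python
import math

def poisson_pmf(y, mu):
    """P(Y = y) for a Poisson variable with mean mu."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

# Fitted mean for quarter 3 from log(mu) = -1.9442 + 2.1748 * log(x).
mu_hat = math.exp(-1.9442 + 2.1748 * math.log(3))

# Probabilities of observing 0, 1, 2 and 10 deaths in quarter 3.
probs = {y: poisson_pmf(y, mu_hat) for y in (0, 1, 2, 10)}

print(f"mu_hat = {mu_hat:.4f}")
for y, p in probs.items():
    print(f"P(Y = {y:2d}) = {p:.3g}")
```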


Example 2: Crab Data


Horseshoe crab data from Agresti (1996).
Response variable is the number of satellites a female horseshoe crab
has (i.e., how many males are attached to her).
Explanatory variable is the width of the female’s back.


A Smoother Look
The data were collapsed into 8 groups by their width (i.e., ≤ 23.25,
23.25–24.25, 24.25–25.25. . . , > 29.25).


Estimated Poisson Regression for Crabs


log(µ̂i) = −3.3048 + .1640xi

The estimated ASE of β̂ = .164 equals .020 (small relative to β̂).

Since β̂ > 0, the wider the female crab, the greater the expected number of
satellites. Note: exp(.1640) = 1.18, so each additional cm of width
multiplies the expected number of satellites by about 1.18.
There is an outlier (with respect to the explanatory variable).
Question: how much does this outlier affect the fit of the model?
Answer: Remove it and re-estimate the model:
log(µ̂i) = −3.4610 + .1700xi
and the ASE of β̂ = .1700 equals .0216.
In this case, it doesn’t have much effect. . . The same basic result holds
(i.e., positive effect of width on number of satellites, β̂ is “significant”
and similar in value).

Poisson Regression with Identity Link

From the figure of collapsed data, it looks like either an identity or a log
link might work.
The estimated model with the identity link:

µ̂i = −11.53 + .55xi

Since the effect of female width on the expected number of satellites
(µi) is linear and β̂ = .55 > 0, as width increases by 1 cm, the
expected count increases by .55.
Question: Is the Poisson regression model with the identity or the log
link better for these data?
Answer: Quick look now, but more formal later when we discuss model
assessment (or read further in the text).
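The difference between the two links shows up in how a 1 cm change in width acts on the fitted mean: multiplicatively under the log link, additively under the identity link. A minimal Python sketch (for illustration only) using the two estimated equations for the crab data; the widths 22, 26 and 30 cm are arbitrary values picked for this sketch.

```python
import math

def mean_log_link(width):
    """Fitted mean from log(mu) = -3.3048 + 0.1640 * width."""
    return math.exp(-3.3048 + 0.1640 * width)

def mean_identity_link(width):
    """Fitted mean from mu = -11.53 + 0.55 * width."""
    return -11.53 + 0.55 * width

# Arbitrary illustrative widths (cm).
for width in (22.0, 26.0, 30.0):
    print(f"width {width:4.1f}: log link {mean_log_link(width):5.2f}, "
          f"identity link {mean_identity_link(width):5.2f}")

# Effect of one extra cm: a ratio under the log link, a difference under identity.
ratio = mean_log_link(27.0) / mean_log_link(26.0)
difference = mean_identity_link(27.0) - mean_identity_link(26.0)
print(f"log link ratio = {ratio:.3f}, identity link difference = {difference:.2f}")
```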

Log versus Identity Link for Crabs


SAS
* Columns: color spine width satell weight;
data crab;
  input color spine width satell weight;
  datalines;
3 3 28.3 8 3050
4 3 22.5 0 1550
2 1 26.0 9 2300
...
;
run;

title 'Poisson regression model fit to individual level data';

* Poisson regression of satellite count on width, using the data set created above;
proc genmod data=crab;
  model satell = width / link=log dist=poisson obstats;
  output out=preds pred=phat lower=lci upper=uci;
run;

R: Poisson regression

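crab_data.txt
color spine width satell weight
3 3 28.3 8 3050
4 3 22.5 0 1550
2 1 26.0 9 2300
4 3 24.8 0 2100
...

# Read the data; the first line of the file holds the column names
crabs <- read.table("crab_data.txt", header = TRUE)

# Poisson regression of satellite count on width (log link)
mod.poi1 <- glm(satell ~ width, data = crabs,
                family = poisson(link = "log"))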


Poisson regression for rates


Events occur over time (or space), and the length of time (or amount of
space) can vary from observation to observation. Our model should take
this into account.
Example: Gardner, Mulvey, & Shaw (1995), Psychological Bulletin, 118,
392–404.
Y = Number of violent incidents exhibited over a 6 month
period by patients who had been treated in the ER of a
psychiatric hospital.
During the 6-month period of the study, the individuals were primarily
residing in the community. The number of violent acts depends on the
opportunity to commit them; that is, the number of days out of the 6
month period in which a patient is in the community (as opposed to being
locked up in a jail or hospital).

Distribution of Violent Incident Data


Poisson Regression for Rates of Events


Y = count (e.g., number of violent acts).
t = index of the time or space (e.g., days in the community).
The sample rate of occurrence is Y/t.
The expected value of the rate is

E(Y/t) = (1/t) E(Y) = µ/t

The Poisson log-linear regression model for the expected rate of the
occurrence of events is
log(µ/t) = α + βx
log(µ) − log(t) = α + βx
log(µ) = α + βx + log(t)

The term “− log(t)” is an adjustment to log(µ), and each individual may have a
different value of t.
This adjustment is referred to as an “offset”; moved to the right-hand side
of the model, it enters as log(t) with its coefficient fixed at 1.

As a Multiplicative Model

The Poisson log-linear regression model with a log link for rate data is

log(µ/t) = α + βx
µ/t = e^α e^{βx}
µ = t e^α e^{βx}

The expected value of counts depends on both t and x, both of which are
observations (i.e., neither is a parameter of the model).
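The role of the exposure t can be illustrated with a small Python sketch (for illustration only). The parameter values and exposures below are hypothetical; the point is that t scales the expected count directly, which is the same as adding log(t) to the linear predictor with a coefficient of exactly 1.

```python
import math

def expected_count(t, alpha, beta, x):
    """mu = t * exp(alpha + beta * x), i.e. log(mu) = alpha + beta * x + log(t)."""
    return t * math.exp(alpha + beta * x)

# Hypothetical values, chosen only for illustration.
alpha, beta, x = -4.0, 0.3, 2.0

mu_90 = expected_count(90, alpha, beta, x)     # 90 units of exposure
mu_180 = expected_count(180, alpha, beta, x)   # twice the exposure

# The same quantity computed on the log scale, with log(t) as the offset.
log_mu_180 = alpha + beta * x + math.log(180)

print(f"mu(t = 90)  = {mu_90:.4f}")
print(f"mu(t = 180) = {mu_180:.4f}  (ratio {mu_180 / mu_90:.1f})")
print(f"exp(linear predictor + offset) = {math.exp(log_mu_180):.4f}")
```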


Gardner, Mulvey, & Shaw (1995)


Response variable is the rate of violent incidents, which equals the number
of violent incidents divided by the number of days an individual resided
in the community (ȳ = 3.0 with s = 7.3, and t̄ = 154 with s = 42
days).
Explanatory variables:
Age (x̄1 = 28.6 years and s1 = 11.1).
Sum of 2 ER clinicians’ ratings of concern on a 0 – 5 scale, so x2 ranges
from 0 to 10 (x̄2 = 2.9 with s2 = 3.1).
History of previous violent acts, where
x3 = 0 means no previous acts
   = 1 means previous acts either 3 days before or
       more than 3 days before
   = 2 means previous acts both 3 days before and
       more than 3 days before

Correlations:
r(concern, history) = .55,
r(age, history) = −.11,
r(age, concern) = −.07

Estimated Parameters

Coefficient   Value     ASE      Value/ASE
Intercept     -3.410    .0690    -49.29
Age            -.045    .0023    -19.69
Concern         .083    .0075     11.20
History         .420    .0380     11.26

Note: Poisson regression models for rate data are related to models for
“survival times”.
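To see what these estimates imply, the sketch below turns them into an expected rate and an expected count for one patient. It is written in Python for illustration only, and it rests on assumptions not stated in the slides: that the predictors entered the model uncentered, that age is in years, and that exposure t is measured in days. The patient profile is hypothetical (sample-mean age and concern, mean exposure).

```python
import math

# Estimates from the table above.
coef = {"intercept": -3.410, "age": -0.045, "concern": 0.083, "history": 0.420}

def expected_rate(age, concern, history):
    """Expected incidents per day in the community (assumes uncentered predictors)."""
    eta = (coef["intercept"] + coef["age"] * age
           + coef["concern"] * concern + coef["history"] * history)
    return math.exp(eta)

def expected_count(age, concern, history, days):
    """Expected count = exposure (days) times the expected daily rate."""
    return days * expected_rate(age, concern, history)

# Hypothetical patient at the sample means, with history = 0 versus history = 2.
count_no_history = expected_count(28.6, 2.9, 0, 154)
count_history_2 = expected_count(28.6, 2.9, 2, 154)

print(f"history = 0: {count_no_history:.2f} expected incidents")
print(f"history = 2: {count_history_2:.2f} expected incidents")
print(f"ratio = {count_history_2 / count_no_history:.2f} = exp(2 * 0.420)")
```

Whatever the assumptions about scaling, the ratio between the two profiles depends only on the History coefficient.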


Model fit to Violent Incident Data


Example 2 for Rates: Lung Cancer

Data are from Lindsey (1995), taken from Andersen (1977).


Response Variable: Y = number of cases of lung cancer, which
follows a Poisson distribution.
Explanatory Variables:
City in Denmark (Fredericia, Horsens, Kolding, Vejle).
Age (40–54, 55–59, 60–64, 65–69, 70–74, >75).

Offset = log of the population size of each age group in each city
(t = population size).

We will model the rate of cases of lung cancer = Y/t.


Plot of the Rate by Age


Plot of the log(Rate) by Age


Model 1: Age and City both Nominal


Define
Fredericia = 1 if city is Fredericia, 0 if other city
Horsens    = 1 if city is Horsens, 0 if other city
Kolding    = 1 if city is Kolding, 0 if other city
Define dummy variables for the 6 age classes (groups): Age1, . . . , Age5.
Model 1:
log(Y/pop) = α + β1(Fredericia) + β2(Horsens) + β3(Kolding)
             + β4(Age1) + β5(Age2) + β6(Age3) + β7(Age4)
             + β8(Age5)

Parameter Estimates from Model 1

Parameter             df   Estimate   s.e.    X²       p
Intercept      α       1    −4.48     0.21   423.33   < .01
city Frederic  β1      1     0.27     0.18     2.10   .15
city Horsens   β2      1    −0.05     0.19     0.09   .76
city Kolding   β3      1    −0.09     0.19     0.25   .62
city Vejle             0     0.00     0.00      .      .
age 40-54      β4      1    −1.41     0.25    32.18   < .01
age 55-59      β5      1    −0.31     0.25     1.60   .21
age 60-64      β6      1     0.09     0.23     0.18   .67
age 65-69      β7      1     0.34     0.23     2.22   .14
age 70-74      β8      1     0.43     0.23     3.34   .07
age >75                0     0.00     0.00      .      .

Note: G² = 23.45, df = 15, p = .08
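The Model 1 estimates can be read as rate ratios relative to the reference levels (Vejle and age >75, the rows with estimate 0.00), and combined with the offset they give fitted counts. A minimal Python sketch, for illustration only; the population of 3059 for Fredericia, age 40–54, is the value in the data listing, where 11 cases were observed.

```python
import math

# Model 1 estimates from the table above (reference levels have effect 0).
intercept = -4.48
city_effect = {"Fredericia": 0.27, "Horsens": -0.05, "Kolding": -0.09, "Vejle": 0.0}
age_effect = {"40-54": -1.41, "55-59": -0.31, "60-64": 0.09,
              "65-69": 0.34, "70-74": 0.43, ">75": 0.0}

def fitted_rate(city, age):
    """Fitted lung cancer rate per person for a city and age group."""
    return math.exp(intercept + city_effect[city] + age_effect[age])

def fitted_cases(city, age, population):
    """Fitted count = population (exposure) times the fitted rate."""
    return population * fitted_rate(city, age)

# Rate ratio of the youngest age group relative to the >75 reference group.
rate_ratio_youngest = math.exp(age_effect["40-54"])

# Fitted number of cases for Fredericia, age 40-54 (population 3059).
cases_fred_40_54 = fitted_cases("Fredericia", "40-54", 3059)

print(f"rate ratio, age 40-54 vs >75: {rate_ratio_youngest:.3f}")
print(f"fitted cases, Fredericia 40-54: {cases_fred_40_54:.1f}")
```

Because the estimates are rounded to two decimals, the fitted count here is only approximate.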


Model 2: City Nominal & Age Numerical


The mid-points of the age ranges were used (except for the last one, where I
used 75).

log(Y/pop) = α + β1(Fredericia) + β2(Horsens) + β3(Kolding)
             + β4(Age Mid-point)

Parameter             df   Estimate   s.e.    X²       p
Intercept      α       1    −8.22     0.44   349.18   < .01
city Frederic  β1      1     0.24     0.18     1.72   0.19
city Horsens   β2      1    −0.05     0.19     0.10   0.76
city Kolding   β3      1    −0.10     0.19     0.28   0.60
city Vejle             0     0.00     0.00      .      .
age-midpoint   β4      1     0.05     0.00    75.62   < .01

Note: G² = 46.45, df = 19, p < .01

Model 2: Observed and Fitted Values


Model 3: City Nominal & Age Quadratic

log(Y/pop) = α + β1(Fredericia) + β2(Horsens) + β3(Kolding)
             + β4(Age Mid-point) + β5(Age Mid-point)²

Parameter             df   Estimate   s.e.    X²       p
Intercept      α       1   −21.72     3.09    49.24   < .01
city Frederic  β1      1     0.27     0.18     2.13   0.14
city Horsens   β2      1    −0.05     0.19     0.09   0.76
city Kolding   β3      1    −0.10     0.19     0.26   0.61
city Vejle             0     0.00     0.00      .      .
age-midpoint   β4      1     0.50     0.10    24.91   < .01
age²           β5      1    −0.00     0.00    19.90   < .01

Note: G² = 26.02, df = 18, p = .10.


Model 3: Observed and Fitted Values


Model 4: Simpler city & Age Quadratic

Define
Fredericia = 1 if city is Fredericia, 0 if other city

log(Y/pop) = α + β1(Fredericia) + β2(Age Mid-point) + β3(Age Mid-point)²

That is,

log(Y/pop) = α + β1 + β2(Age) + β3(Age)²   if Fredericia
log(Y/pop) = α + β2(Age) + β3(Age)²        if other city


Model 4: Simpler city & Age Quadratic

Parameter             df   Estimate   s.e.    X²       p
Intercept      α       1   −21.78     3.09    49.61   < .01
frederic 1     β1      1     0.32     0.14     4.92   .03
frederic 0             0     0.00     0.00      .      .
age-midpoint   β2      1     0.50     0.10    24.93   < .01
age²           β3      1    −0.00     0.00    19.91   < .01

Note: G² = 26.2815, df = 20, p = .16.
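The p-values reported with G² are upper-tail chi-square probabilities. For an even number of degrees of freedom the tail has a closed form (a finite Poisson-type sum), so the values for Models 3 and 4 can be checked with the Python standard library alone. This is a sketch for illustration; the function name is this sketch's own, and odd degrees of freedom (as in Model 1) would need a general chi-square routine from a statistics library.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(chi-square with df degrees of freedom > x), for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 0.0
    # Sum (x/2)^j / j! for j = 0, ..., df/2 - 1.
    for j in range(df // 2):
        total += term
        term *= half / (j + 1)
    return math.exp(-half) * total

p_model3 = chi2_upper_tail_even_df(26.02, 18)     # Model 3: G² = 26.02, df = 18
p_model4 = chi2_upper_tail_even_df(26.2815, 20)   # Model 4: G² = 26.2815, df = 20

print(f"Model 3: p = {p_model3:.2f}")
print(f"Model 4: p = {p_model4:.2f}")
```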


Model 4: Fitted and Observed


SAS: Input data


data lcancer;
  input age $ 1-5 age_midpt city $ cases population;
  lpop = log(population);          /* offset: log of population size */
  rate = cases/population;
  lograte = log(rate);
  age_sq = age_midpt*age_midpt;
  /* city is read with the default length of 8, so Fredericia is stored as Frederic */
  frederic=0;
  if city='Frederic' then frederic=1;
  datalines;
40-54 47 Fredericia 11 3059
55-59 57 Fredericia 11 800
60-64 62 Fredericia 11 710
...
.

SAS: Fit models


Nominal Predictors:
title1 'Poisson loglinear Model for Rates';
title2 'cases = city age';
proc genmod data=lcancer order=data;
  class city age;
  model cases = city age / link=log dist=poisson offset=lpop type3;
run;
Numerical and Nominal:
title1 'Poisson loglinear Model for Rates';
proc genmod data=lcancer order=data;
  class city;
  model cases = city age_midpt / link=log dist=poisson offset=lpop type3;
run;

R: Data

Data Set: lung_cancer_data.txt


age age_midpt city cases population
40-54 47 Fredericia 11 3059
55-59 57 Fredericia 11 800
60-64 62 Fredericia 11 710
...

lc <- read.table("lung_cancer_data.txt", header = TRUE)


R: Fit models

All nominal predictors:

# log(population) enters as an offset (coefficient fixed at 1)
model1 <- glm(cases ~ offset(log(population)) + city + age,
              data = lc, family = poisson)
summary(model1)

Nominal and numerical:

model2 <- glm(cases ~ offset(log(population)) + city + age_midpt,
              data = lc, family = poisson)
summary(model2)


Next Steps

Statistical Inference for Poisson Regression. . .
