Poisson Data Models

The document discusses various models for analyzing Poisson data, focusing on examples such as chronic medical problems in urban and rural areas, leukemia cases across seasons, and the impact of exposure variables on the expected number of events. It emphasizes the importance of understanding the relationship between observed counts and underlying risk factors, as well as the implications of proportionality in modeling. The document also outlines how to construct models using log-link functions and the interpretation of parameters in these contexts.

Models for Poisson data

V. Vasdekis

Athens University of Economics and Business

April 8, 2021

V. Vasdekis (Athens University of Economics and Business)
Models for Poisson data April 8, 2021 1 / 10
Example 1
Let us assume that we possess the number of chronic medical
problems in a sample of areas which are approximately of the same
size. Areas are of urban or rural character. The total number of
observations is n = 49.
Scientific question: Do we expect that urban and rural areas present
the same mean number of chronic medical problems?
These are count data. Numbers from urban areas 0, 1, 1, 0, 2, 3....
Numbers from rural areas 2, 0, 3, 0, 0....
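We assume that $y_i \sim P(\lambda_i)$, $i = 1, \dots, 49$, are independent observations.
The model consists of the linear predictor, which represents the scientific question, and the link function:

$$\log(\lambda_i) = \beta_0 + \beta_1\,\mathrm{region}_i, \qquad \mathrm{region}_i = \begin{cases} 1 & \text{if observation } i \text{ comes from a rural area} \\ 0 & \text{otherwise} \end{cases}$$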

Parameter interpretation: $\exp(\beta_0) = E(y \mid \text{urban})$ and $\exp(\beta_1) = E(y \mid \text{rural}) / E(y \mid \text{urban})$.
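The interpretation of the two parameters can be checked numerically. The following is a minimal sketch, assuming the few counts quoted in Example 1 are the whole sample (they are not; the full 49 observations are not given), and using the fact that with a single indicator variable the maximum likelihood estimates reduce to group means. The function name `fit_two_group_poisson` is hypothetical.

```python
import math

# Only the first few counts are quoted in the text; they are treated here
# as if they were the complete sample, purely for illustration.
urban = [0, 1, 1, 0, 2, 3]
rural = [2, 0, 3, 0, 0]


def fit_two_group_poisson(reference, comparison):
    """Closed-form MLE for log(lambda_i) = b0 + b1 * region_i.

    With one indicator variable, the fitted mean of each group equals
    its sample mean, so no iterative fitting is needed.
    """
    mean_reference = sum(reference) / len(reference)
    mean_comparison = sum(comparison) / len(comparison)
    b0 = math.log(mean_reference)                    # exp(b0) = E(y | urban)
    b1 = math.log(mean_comparison / mean_reference)  # exp(b1) = ratio of means
    return b0, b1


b0, b1 = fit_two_group_poisson(urban, rural)
print("estimated E(y | urban):", math.exp(b0))
print("estimated E(y | rural):", math.exp(b0 + b1))
print("estimated ratio rural / urban:", math.exp(b1))
```

A value of `exp(b1)` close to 1 would suggest similar mean counts in the two kinds of area; a formal answer to the scientific question would need a test or a confidence interval for `b1`, which this sketch does not compute.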
Example 2

Number of new leukemia cases in an area for 12 consecutive months.


Working assumption (wrong in practice): the data are independent.
Question: Does the expected number of new cases differ between the 4 seasons?
Random component: $y_i \sim P(\lambda_i)$, $i = 1, \dots, 12$, independent observations (this assumption is wrong in practice).
The model is:

$$\log(\lambda_i) = \beta_0 + \beta_1\,\mathrm{season}_1 + \beta_2\,\mathrm{season}_2 + \beta_3\,\mathrm{season}_3$$

where $\mathrm{season}_1$, $\mathrm{season}_2$, $\mathrm{season}_3$ are pseudovariables (indicator variables) for spring, summer and autumn respectively.

Parameters interpretation

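Therefore,

$$\exp(\beta_0) = E(y \mid \text{winter}).$$

All other parameters express ratios of expected values, comparing each of the other seasons with winter:

$$\exp(\beta_1) = \frac{E(y \mid \text{spring})}{E(y \mid \text{winter})}, \qquad \exp(\beta_2) = \frac{E(y \mid \text{summer})}{E(y \mid \text{winter})}, \qquad \exp(\beta_3) = \frac{E(y \mid \text{autumn})}{E(y \mid \text{winter})}.$$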
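The coding of the seasons and the resulting ratios can be sketched as follows. The monthly counts are invented for illustration, and the assignment of months to seasons is an assumption, since the text gives neither. As in any model that contains only indicator variables for one factor, the fitted mean of each season equals its sample mean, so the estimates are available in closed form. The function names are hypothetical.

```python
import math

# Hypothetical counts of new cases for January..December (invented data).
monthly_cases = [5, 4, 6, 3, 2, 4, 1, 2, 3, 4, 5, 3]

# Assumed month-to-season mapping; the text does not specify one.
season_of_month = ["winter", "winter", "spring", "spring", "spring", "summer",
                   "summer", "summer", "autumn", "autumn", "autumn", "winter"]


def pseudovariables(season):
    """Return (season1, season2, season3); winter is the reference level."""
    return (int(season == "spring"), int(season == "summer"), int(season == "autumn"))


def fit_season_model(counts, seasons):
    """Closed-form MLE: each season's fitted mean is its sample mean."""
    totals, sizes = {}, {}
    for count, season in zip(counts, seasons):
        totals[season] = totals.get(season, 0) + count
        sizes[season] = sizes.get(season, 0) + 1
    mean = {season: totals[season] / sizes[season] for season in totals}
    b0 = math.log(mean["winter"])
    b1 = math.log(mean["spring"] / mean["winter"])
    b2 = math.log(mean["summer"] / mean["winter"])
    b3 = math.log(mean["autumn"] / mean["winter"])
    return b0, b1, b2, b3


def expected_count(betas, season):
    """Evaluate exp(b0 + b1*season1 + b2*season2 + b3*season3)."""
    b0, b1, b2, b3 = betas
    d1, d2, d3 = pseudovariables(season)
    return math.exp(b0 + b1 * d1 + b2 * d2 + b3 * d3)


betas = fit_season_model(monthly_cases, season_of_month)
for name in ("winter", "spring", "summer", "autumn"):
    print(name, expected_count(betas, name))
```

Choosing a different reference season changes the individual coefficients but not the fitted means.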

Example 3
The expected number of cases frequently depends on an exposure at risk
variable, whose effect must be taken into account if we wish to make
different population groups comparable.
As an example, consider the number of epileptic seizures being
measured on a number of patients. One patient is measured for 2
weeks, another one is measured for 1.5 weeks. Can we compare the
expected number of epileptic seizures between the two patients?
Let us denote by λ the expected number of cases when the exposure
at risk variable is not taken into account and λ′ when this variable is
taken into account. Let us also denote by s the exposure at risk
variable.
Parameter λ expresses what we actually measure. Parameter λ′
expresses what we would have measured had all units been observed
under the same exposure conditions.
Assumption: λ′ = λ/s, so that λ = s × λ′. This assumption says that
if we double the exposure at risk variable, we expect the expected
number of cases, which is what we observe, to double as well.
This assumption is called the proportionality property of the exposure at
risk variable affecting the dependent variable.
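For the two epilepsy patients, the proportionality assumption amounts to comparing counts per week rather than raw counts. A minimal sketch follows; the seizure counts are invented for illustration (only the observation lengths of 2 and 1.5 weeks come from the text), and the function names are hypothetical.

```python
def rate_per_unit_exposure(count, exposure):
    """Return lambda' = lambda / s, with the observed count standing in for lambda."""
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    return count / exposure


def expected_count(rate, exposure):
    """Return lambda = s * lambda' under the proportionality assumption."""
    return exposure * rate


# Hypothetical counts: both patients had 6 seizures during their observation period.
patient_a = rate_per_unit_exposure(6, 2.0)   # observed for 2 weeks
patient_b = rate_per_unit_exposure(6, 1.5)   # observed for 1.5 weeks

print("seizures per week, patient A:", patient_a)
print("seizures per week, patient B:", patient_b)

# Doubling the exposure doubles the expected count.
print(expected_count(patient_a, 4.0) / expected_count(patient_a, 2.0))
```

Equal raw counts therefore correspond to different rates once the unequal observation periods are taken into account.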

An example

Suppose that, in the same problem, we also measure an explanatory variable x. Then we must model λ′, since this is the parameter that is comparable between subpopulations defined by explanatory variables.
Remember, however, that the data are generated according to λ, and that λ′ = λ/s.
The random component is defined as

$$y_i \sim P(\lambda_i), \quad i = 1, \dots, n, \qquad \text{or equivalently} \qquad y_i \sim P(s_i \times \lambda'_i).$$

The linear predictor and log-link give

$$\log(\lambda_i) = \log(s_i) + \log(\lambda'_i) = \log(s_i) + \beta_0 + \beta_1 x_i.$$

Note that $\log(s_i)$ enters as an explanatory variable whose coefficient is fixed at 1. Such a variable is called an offset.
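One way to fit a model with an offset is Newton-Raphson on the Poisson log-likelihood, with the coefficient of the log-exposure held at 1. The sketch below is an illustration under stated assumptions, not the lecture's own method: the data are invented, the function name `fit_poisson_offset` is hypothetical, and only one explanatory variable is handled so that the 2 × 2 information matrix can be inverted by hand.

```python
import math


def fit_poisson_offset(y, x, s, tol=1e-10, max_iter=50):
    """Fit log(lambda_i) = log(s_i) + b0 + b1 * x_i by Newton-Raphson.

    log(s_i) is an offset: it enters the linear predictor with a
    coefficient fixed at 1 and is not estimated.
    """
    b0 = math.log(sum(y) / sum(s))  # start from the overall rate
    b1 = 0.0
    for _ in range(max_iter):
        mu = [si * math.exp(b0 + b1 * xi) for si, xi in zip(s, x)]
        # Score vector (gradient of the log-likelihood).
        u0 = sum(yi - mi for yi, mi in zip(y, mu))
        u1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix [[i00, i01], [i01, i11]].
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: solve the 2x2 system information * step = score.
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (i00 * u1 - i01 * u0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Hypothetical data: counts y, explanatory variable x, exposure s (e.g. weeks).
y = [2, 3, 6, 7, 8, 9]
x = [0, 1, 2, 3, 4, 5]
s = [2.0, 1.5, 2.0, 1.0, 1.5, 1.0]

b0, b1 = fit_poisson_offset(y, x, s)
print("b0 =", b0, "b1 =", b1)
print("rate ratio per unit increase in x:", math.exp(b1))
```

Here `exp(b1)` is interpreted on the scale of λ′, that is, as a ratio of expected counts per unit of exposure.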

Example 4

Number of accidents in a specific time period on two Cambridge roads, in three different periods of the day. An estimated traffic volume is also recorded as an exposure at risk variable.
Road                Time of day    Accidents    Estimated traffic volume
Trumpington Road    07.00-09.30    11           2206
Trumpington Road    09.30-15.00    9            3276
Trumpington Road    15.00-18.30    4            1999
Mill Road           07.00-09.30    4            1399
Mill Road           09.30-15.00    20           2276
Mill Road           15.00-18.30    4            1417

Modeling
Let us denote by $y_i$, $i = 1, \dots, 6$, the number of accidents. We assume these are independent observations. If $s_i$ is the estimated traffic volume of observation $i$, then a possible assumption about the effect of $s_i$ on the expected number of accidents is

$$\lambda'_i = \frac{\lambda_i}{s_i}.$$

We can also write $y_i \sim P(s_i \times \lambda'_i)$.
Therefore, we have assumed proportionality between the estimated traffic volume and the expected number of accidents.
We now model $\lambda'$ using, say, the road effect, and the final model emerges:

$$\log(\lambda_i) = \log(s_i) + \log(\lambda'_i) = \log(s_i) + \beta_0 + \beta_1\,\mathrm{road}_i,$$

where $\mathrm{road}_i$ is an indicator variable (pseudovariable) for Mill Road.
We can use other models and check their applicability.
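For this particular model, the estimates can be obtained without iteration. With a single road indicator and an offset, the score equations give a fitted rate for each road equal to its total accidents divided by its total traffic volume. The following sketch applies this to the six observations of Example 4; the variable and function names are hypothetical.

```python
import math

# (road, time of day, accidents, estimated traffic volume) from Example 4.
observations = [
    ("Trumpington Road", "07.00-09.30", 11, 2206),
    ("Trumpington Road", "09.30-15.00", 9, 3276),
    ("Trumpington Road", "15.00-18.30", 4, 1999),
    ("Mill Road", "07.00-09.30", 4, 1399),
    ("Mill Road", "09.30-15.00", 20, 2276),
    ("Mill Road", "15.00-18.30", 4, 1417),
]


def road_rates(rows):
    """Return total accidents / total traffic volume for each road."""
    totals = {}
    for road, _time_of_day, accidents, volume in rows:
        acc, vol = totals.get(road, (0, 0))
        totals[road] = (acc + accidents, vol + volume)
    return {road: acc / vol for road, (acc, vol) in totals.items()}


rates = road_rates(observations)

# Trumpington Road is the reference level (road_i = 0); Mill Road has road_i = 1.
b0 = math.log(rates["Trumpington Road"])
b1 = math.log(rates["Mill Road"] / rates["Trumpington Road"])

print("accidents per unit of traffic, Trumpington Road:", math.exp(b0))
print("accidents per unit of traffic, Mill Road:", math.exp(b0 + b1))
print("rate ratio Mill Road / Trumpington Road:", math.exp(b1))
```

The rate ratio compares the roads after adjusting for traffic volume. Whether the difference is more than chance variation, and whether the time of day should also enter the model, are separate questions that this sketch does not address.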
What if no proportionality?

Remember that $\lambda' = \lambda/s$ implies $\lambda = s \times \lambda'$ and $\log(\lambda_i) = \log(s_i) + \log(\lambda'_i)$. If proportionality does not hold, a possible model is $\lambda = s^{\gamma} \times \lambda'$, and therefore $\log(\lambda_i) = \gamma \times \log(s_i) + \log(\lambda'_i)$.
What are the consequences of such a model? If we double the value of the exposure at risk variable, then the new $\lambda$ is

$$\lambda_{\text{new}} = (2s)^{\gamma} \times \lambda' = 2^{\gamma} s^{\gamma} \lambda' = 2^{\gamma} \times \lambda,$$

so the expected number of cases is multiplied not by 2 but by $2^{\gamma}$.
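The effect of the exponent can be checked numerically. The sketch below evaluates the expected count for a few values of γ; the exposure, the rate and the γ values are arbitrary illustrative numbers, and the function name is hypothetical. In practice γ would be estimated as the coefficient of log(s), which is then an ordinary explanatory variable rather than an offset.

```python
def expected_count(s, lam_prime, gamma=1.0):
    """Return lambda = s**gamma * lambda'; gamma = 1 is the proportional case."""
    return (s ** gamma) * lam_prime


s = 1.5          # arbitrary exposure
lam_prime = 3.0  # arbitrary expected count per unit of exposure

for gamma in (0.5, 1.0, 1.5):
    ratio = expected_count(2 * s, lam_prime, gamma) / expected_count(s, lam_prime, gamma)
    # The ratio should agree with 2**gamma up to floating-point rounding.
    print("gamma =", gamma, "doubling factor =", ratio, "2**gamma =", 2 ** gamma)
```

With γ below 1 the expected count grows more slowly than the exposure, and with γ above 1 it grows faster.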

