Eped L Note, 1997
FACULTY OF MEDICINE
ADDIS ABABA UNIVERSITY
COMH 603
Epidemiology I
Principles of Epidemiology
(3 credits)
LECTURE NOTE
By
YEMANE BERHANE
1997
TABLE OF CONTENTS
I. INTRODUCTION TO EPIDEMIOLOGY
II. COMMUNICABLE DISEASE EPIDEMIOLOGY
III. TYPES OF EPIDEMIOLOGIC STUDIES
IV. MEASUREMENT IN EPIDEMIOLOGY
V. EPIDEMIOLOGICAL DESIGN STRATEGIES
VI. EVALUATION OF EVIDENCE
VII. PRESENTATION OF EPIDEMIOLOGIC INFORMATION
VIII. OUTBREAK INVESTIGATION AND MANAGEMENT
IX. EPIDEMIOLOGIC SURVEILLANCE
X. SCREENING
XI. ETHICS OF EPIDEMIOLOGIC RESEARCH
DCH/AAU: Epidemiology Note 1
I. INTRODUCTION TO EPIDEMIOLOGY
A. Definition
The definition emphasizes that epidemiologists are concerned with the collective health
of the people in communities, unlike clinicians, who are concerned with the health of
the individual.
C. Scope/use of epidemiology
Epidemiology has been used in several ways in the planning and evaluation of health
interventions in an effort to improve the health status of the population. Four uses are
mentioned here:
E. History of Epidemiology
Although epidemiologic thinking has been traced to the time of Hippocrates, the
discipline did not flourish until the 1940s. Some key dates and contributions to the
development of epidemiologic thinking and methods include:
1662 - John Graunt published Natural and Political Observations on the Bills of
Mortality. He was the first to quantify patterns of birth, death and disease
occurrence, noting male-female disparities, high infant mortality, urban-rural
differences, and seasonal variations.
1839 - William Farr took responsibility for medical statistics in the Office of the
Registrar General for England and Wales. He extended the epidemiologic analysis
of morbidity and mortality data, looking at the effects of marital status, occupation,
and altitude.
1854 - John Snow demonstrated that the risk of mortality due to cholera was
related to the drinking water provided by a particular supplier in London. He used
a “natural experiment” to test his hypothesis. In another study conducted in 1854,
Snow linked an epidemic of cholera to a specific pump, the “Broad Street
Pump”. According to the literature, Snow removed the handle of that pump and
aborted the cholera epidemic.
Despite the great scientific advances that have reduced morbidity and mortality from
communicable diseases over the past decades, communicable diseases continue to
account for a major proportion of acute illness, even in technologically advanced
countries, though the types of diseases vary from place to place. Some important aspects
of infectious diseases are discussed below.
The process begins with exposure to an agent capable of causing disease.
Without medical intervention, the process ends with recovery, disability, or death. Most
diseases have a characteristic natural history, although the time frame and specific
manifestations of disease may vary from individual to individual. The usual course of a
disease may be halted at any point in the progression by preventive and therapeutic
measures, host factors, and other influences. The stages in the natural history of disease
are shown in Figure 1.
Figure 1
NATURAL HISTORY OF DISEASE
[Figure: timeline marking exposure, pathologic changes, onset of symptoms, and usual time of diagnosis.]
Environment: environmental factors are extrinsic factors which affect the agent and the
opportunity for exposure. They include physical factors such as geology, climate, and
physical surroundings (e.g., maternal waiting home, hospital); biologic
factors such as insects that transmit the agent; and socioeconomic factors
such as crowding, sanitation, and the availability of health services.
Not all associations between exposure and disease are causal. A cause of a disease can be
defined as a factor (characteristic, behaviour, event, etc.) that influences the occurrence of
disease. If disease does not develop without the factor being present, then we term the
causative factor “necessary”. If the disease always results from the factor, then we term
the causative factor “sufficient”.
Example: The tubercle bacillus is a necessary factor for tuberculosis.
Rabies virus is sufficient for developing clinical rabies.
Figure 2
EPIDEMIOLOGIC TRIANGLE AND TRIAD (BALANCE BEAM)
[Figure: agent, host, and environment shown as a triangle and as a balance beam.]
In recognition of the multifactorial nature of most diseases, several other models have
been proposed. The causal pie model is one that takes into account the multiple
factors that are important in the causation of disease. In the causal pie model, the factors
are represented by pieces of the pie called component causes, as shown in Figure 3:
Figure 3
ROTHMAN’S CAUSAL PIES: CONCEPTUAL SCHEME FOR DISEASE CAUSATION
E. Chain of Infection
1. Causative agent
2. Reservoir host
3. Portal of exit
4. Mode of transmission
5. Portal of entry
6. Susceptible host.
Figure 4
CHAIN OF INFECTION
Figure 5
TIME COURSE OF A DISEASE IN RELATION TO ITS CLINICAL EXPRESSION AND COMMUNICABILITY
[Figure: clinical expression plotted against time, with a clinical threshold separating the symptomatic clinical case from asymptomatic infection; labelled carrier states are incubatory carrier, convalescent carrier, asymptomatic carrier, and chronic carrier.]
The mechanism by which the agent escapes from a reservoir host and enters
a susceptible host is referred to as the mode of transmission. There are two
major modes:
1. Expected levels
I. Disease Classification
Infectiousness (infection rate)
Pathogenicity (clinical-to-subclinical ratio)
Virulence (case-fatality rate, hospitalization rate)
J. Variation in Severity of Illness
Figure 7
The Spectrum of Illness from Communicable Disease
A. Descriptive Epidemiology
Descriptive: Characterize disease occurrence by time, place and person.
Analytic: Concerned with the search for causes and effects.
B. Analytic Epidemiology
without the characteristic to develop a certain health problem, we say that the
characteristic is associated with that health problem. Thus analytic epidemiology is
concerned with the search for causes and effects, or the why and how. We use analytic
epidemiology to quantify the association between exposures and outcomes and to test
hypotheses about causal relationships.
The principles of analytic epidemiology are applicable to all types of disease, whether
acute or chronic, infectious or non-infectious (even to states of health, behaviours,
phenomena). They are most often associated, however, with studies of chronic disease.
This reflects the difference between chronic and acute disease in terms of: 1) duration of
latency (short for acute, long, variable or obscure for chronic disease), 2) magnitude of
incidence (usually high for acute disease, low for chronic), and 3) complexity of causation
(single causes common for acute disease, multifactorial etiologies for chronic). Because
of these characteristics, chronic diseases more often require relatively elaborate studies to
establish causes, while acute disease can often be linked to its cause through simple
descriptive or short-term cross-sectional studies.
Analytic epidemiology uses two categories of studies to understand causes and effects: 1)
experimental studies and 2) observational studies. In an experimental study, we
determine the exposure status for each individual (clinical trial) or community
(community trial); we then follow the individuals or communities to detect the effects of
the exposure. In an observational study, which is more common, we simply observe the
exposure and outcome status of each study participant.
Two types of observational studies are the cohort study and the case-control study. A
cohort study is similar in concept to the experimental study, except that we observe the
exposure status rather than determining it. Cohort studies categorize subjects on the basis
of their exposure and observe the frequency of disease occurrence. Case-control studies
enroll a group of people with disease (“cases”) and a group without disease (“controls”)
and compare their patterns of previous exposures to risk factors.
2. Measures of association
Rate ratio
Etiologic fraction
The number of cases in a given community makes more epidemiologic sense when it is
related to the size of the population. This relationship can be expressed by calculating
ratios, proportions, and rates. These measures provide useful information about the
probability of occurrence of health events and about the populations at higher risk of
acquiring the disease. They are also important in designing appropriate public health
interventions.
Rate: measures the occurrence of an event in a population over time. The time
component is important in the definition. Rates are often proportions. Rates must: 1)
include persons in the denominator who reflect the population from which the cases in
the numerator arose; 2) include counts in the numerator which are for the same time
period as those from the denominator; and 3) include only persons in the denominator
who are “at risk” for the event.
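As a rough numerical illustration of how a ratio, a proportion and a rate differ, the sketch below uses invented counts (120 male and 80 female cases arising in one year among 10,000 persons at risk). The function names and all figures are illustrative assumptions, not data from this note.

```python
def ratio(x, y):
    """Nonproportional ratio: the numerator is not part of the denominator."""
    return x / y


def proportion(part, whole):
    """Proportional ratio: the numerator is included in the denominator."""
    return part / whole


def rate(events, population_at_risk, per=1000):
    """Events in a stated period per `per` persons at risk in that period."""
    return events * per / population_at_risk


# Hypothetical counts for one calendar year (assumed for illustration only).
male_cases, female_cases = 120, 80
total_cases = male_cases + female_cases
at_risk = 10_000

male_to_female = ratio(male_cases, female_cases)   # males per female case
share_male = proportion(male_cases, total_cases)   # fraction of cases that are male
annual_rate = rate(total_cases, at_risk)           # cases per 1000 at risk per year

print(male_to_female, share_male, annual_rate)
```

Only the last quantity carries a time period and an at-risk denominator, which is what makes it a rate in the sense used above.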
When we call a measure a ratio, we usually mean a nonproportional ratio. When we call a
measure a proportion, we usually mean a proportional ratio that does not measure an
event over time. When we use the term rate, we frequently refer to a proportional ratio
that does measure an event in a population over time. The following table depicts the
common uses of the three measures.
The frequency of health-related events is measured by risk, prevalence and incidence
rate.
Incidence: measures the rapidity with which new cases of a disease develop over
time.
Crude rates apply to the total population of a given area. Specific rates apply to specific
subgroups in the population (such as by age, sex, or occupation) or specific diseases.
Adjusted rates and age-specific rates are often used to permit comparison of mortality
rates in populations which differ in age structure. Mortality rates computed with
adjustment techniques are called age-adjusted or age-standardized mortality rates.
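The note does not say which adjustment technique is meant. The sketch below assumes direct standardization, one common approach, in which the age-specific rates of each population are applied to a shared standard population. The two areas, their counts and the standard population are all invented for illustration.

```python
def crude_rate(deaths, populations, per=1000):
    """Total deaths over total population, per `per` persons."""
    return sum(deaths) * per / sum(populations)


def directly_standardized_rate(deaths, populations, standard, per=1000):
    """Apply each age-specific rate to the standard population's age groups."""
    expected = sum(d / p * s for d, p, s in zip(deaths, populations, standard))
    return expected * per / sum(standard)


# Hypothetical data, age groups ordered [young, old].
standard = [5000, 5000]
area_a = {"deaths": [8, 40], "population": [8000, 2000]}    # mostly young
area_b = {"deaths": [2, 160], "population": [2000, 8000]}   # mostly old

crude_a = crude_rate(area_a["deaths"], area_a["population"])
crude_b = crude_rate(area_b["deaths"], area_b["population"])
adj_a = directly_standardized_rate(area_a["deaths"], area_a["population"], standard)
adj_b = directly_standardized_rate(area_b["deaths"], area_b["population"], standard)

print("crude:", crude_a, crude_b)
print("age-adjusted:", adj_a, adj_b)
```

In this made-up example both areas have identical age-specific rates, so the crude rates differ only because of age structure and the adjusted rates coincide.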
C. Measures of Association
A. Rate Ratio
Measures of association between risk factors and disease are often calculated from data
presented in a two-by-two table:
                 DISEASE
EXPOSURE     YES (+)   NO (-)
YES (+)         A         B
NO (-)          C         D
By convention, the capital letters (A,B,C,D) designate study populations (e.g., in cohort
studies) defined by risk exposure and disease occurrence. The small letters (a,b,c,d)
represent samples of populations (e.g., in case-control studies), usually of unknown and
different sampling frequencies.
The relative risk or risk ratio compares the risk of some health-related event (often
disease or death) in two groups, typically persons exposed to a suspected risk factor and
those not exposed:
$$\text{Relative risk} = \frac{A/(A+B)}{C/(C+D)}$$

$$\text{Odds Ratio} = \frac{a/c}{b/d} = \frac{ad}{bc}$$
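The two formulas can be sketched in a few lines of code. The function names and the example counts are assumptions made for illustration; the cell labels follow the two-by-two table above (a, b = exposed with and without disease; c, d = unexposed with and without disease).

```python
def risk_ratio(a, b, c, d):
    """Risk in the exposed, a/(a+b), divided by risk in the unexposed, c/(c+d)."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Cross-product ratio ad/bc."""
    return (a * d) / (b * c)


# Hypothetical cohort: 100 exposed (30 diseased), 100 unexposed (10 diseased).
a, b, c, d = 30, 70, 10, 90
rr = risk_ratio(a, b, c, d)
odds = odds_ratio(a, b, c, d)
print(round(rr, 2), round(odds, 2))
```

Note that the risk ratio is only meaningful when the row totals reflect the populations at risk, as in a cohort study; the odds ratio needs only the four cells.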
When the health outcome is uncommon, the odds ratio provides a good approximation of
the relative risk or risk ratio. The odds ratio is also useful in analysis of data from case-
control studies, since the size of the control group is arbitrary and the true size of the
population from which the cases come is usually not known. Under these circumstances,
we cannot calculate rates or the relative risk. The relative risk can, nonetheless, be
approximated by calculating the odds ratio, using only the data in the cells of a two-by-
two table.
Since cases of disease in most chronic disease studies represent only a small fraction of
exposed and unexposed populations, B is about equal to A+B and D to C+D. The formula
can, under these circumstances, be simplified as follows.
$$\frac{A/(A+B)}{C/(C+D)} \approx \frac{A/B}{C/D} = \frac{AD}{BC}$$
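A quick numerical check of this approximation, using two invented cohorts with the same relative risk but different disease frequencies, may make the point concrete. All counts are hypothetical.

```python
def rr_and_or(a, b, c, d):
    """Return (relative risk, odds ratio) for a two-by-two table."""
    rr = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)
    return rr, odds_ratio


# Same relative risk in both tables; only the frequency of disease differs.
common_disease = (300, 700, 100, 900)    # 30% vs 10% risk
rare_disease = (30, 9970, 10, 9990)      # 0.3% vs 0.1% risk

rr_common, or_common = rr_and_or(*common_disease)
rr_rare, or_rare = rr_and_or(*rare_disease)

print("common:", round(rr_common, 3), round(or_common, 3))
print("rare:  ", round(rr_rare, 3), round(or_rare, 3))
```

With the rare outcome the odds ratio sits very close to the relative risk, while with the common outcome it overstates it.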
In calculating relative risk from matched case-control pairs, one need only consider the
discordant pairs, represented by g (case exposed, control not exposed) and by f (control
exposed, case not exposed):

$$RR \approx \frac{g}{f} = \frac{\text{pairs with case exposed, control not exposed}}{\text{pairs with control exposed, case not exposed}}$$
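The sketch below counts discordant pairs and takes the ratio g/f, which is the usual matched-pair estimate. The printed formula in this note did not survive intact, so treating g/f as the intended estimate is an assumption, and the pair data are invented.

```python
def matched_pair_estimate(pairs):
    """pairs: iterable of (case_exposed, control_exposed) booleans."""
    g = sum(1 for case_exp, ctrl_exp in pairs if case_exp and not ctrl_exp)
    f = sum(1 for case_exp, ctrl_exp in pairs if ctrl_exp and not case_exp)
    if f == 0:
        raise ValueError("no pairs with control exposed and case unexposed")
    return g, f, g / f


# Hypothetical matched pairs: concordant pairs are listed but do not contribute.
pairs = (
    [(True, False)] * 30     # case exposed, control not exposed (g)
    + [(False, True)] * 10   # control exposed, case not exposed (f)
    + [(True, True)] * 15    # both exposed
    + [(False, False)] * 45  # neither exposed
)

g, f, estimate = matched_pair_estimate(pairs)
print(g, f, estimate)
```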
B. Etiologic Fraction
The attributable risk is the difference between the disease rate in exposed persons (or in
the total population) and the rate in non-exposed:
$$\text{Attributable risk} = \frac{A}{A+B} - \frac{C}{C+D}$$
The attributable proportion, also known as the attributable risk percent, is a measure of
the public health impact of a causative factor. In calculating this measure, we assume that
the occurrence of disease in a group not exposed to the factor under study represents the
baseline or expected risk for that disease. Thus, we attribute any risk above that level in
the exposed group to their exposure. It represents the expected reduction in disease if the
exposure could be eliminated. The calculation is shown in the summary table.
Relative risk (RR): estimates the magnitude of the association between exposure and disease and indicates
the likelihood of developing the disease in the exposed group relative to those who are not exposed.
$$RR = \frac{\text{Risk in exposed}}{\text{Risk in unexposed}}$$
Odds of disease: a simple ratio, not a proportion. It indicates the odds of being diseased at a given exposure
status.
Odds Ratio (OR): the odds of disease in the exposed divided by the odds in the unexposed. It is also called the
cross-product ratio.
Attributable Risk Percent (AR%) among exposed: estimates the proportion of disease among the exposed
that is attributable to the exposure, or the proportion of the disease that could be prevented by eliminating
the exposure.
$$AR\% = \frac{OR - 1}{OR} \times 100$$
Population Attributable Risk (PAR): the risk in the total population minus the risk in the non-exposed. It estimates
the excess rate of disease in the total study population that is attributable to the exposure.
Population Attributable Risk Percent (PAR%): estimates the proportion of disease in the study population
that is attributable to the exposure and thus could be eliminated if the exposure were eliminated.
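The measures of impact defined above can be computed together from a cohort two-by-two table. The sketch assumes cohort data, so risks are calculated directly rather than through the odds ratio, and it assumes that the cohort's mix of exposed and unexposed persons mirrors the population. The function name and the counts are invented for illustration.

```python
def impact_measures(a, b, c, d):
    """a, b: exposed with/without disease; c, d: unexposed with/without disease."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    risk_total = (a + c) / (a + b + c + d)

    ar = risk_exposed - risk_unexposed        # attributable risk
    ar_percent = ar / risk_exposed * 100      # share of exposed risk due to exposure
    par = risk_total - risk_unexposed         # population attributable risk
    par_percent = par / risk_total * 100      # share of population risk due to exposure
    return {"AR": ar, "AR%": ar_percent, "PAR": par, "PAR%": par_percent}


# Hypothetical cohort: 100 exposed (30 diseased), 100 unexposed (10 diseased).
measures = impact_measures(30, 70, 10, 90)
for name, value in measures.items():
    print(name, round(value, 3))
```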
2. Positive association between the exposure and the disease (i.e., more exposure, more
disease)
Attributable risk > 0
Relative Risk >1
3. Negative association between the exposure and the disease (i.e., more exposure, less
disease)
Attributable risk < 0 (negative)
Relative risk < 1 (a fraction)
Positive association with female CHA and negative association with male CHA.
DESCRIPTIVE
  Dealing with populations: correlational or ecological studies
  Dealing with individuals: case report or series; cross-sectional survey
ANALYTIC
  Observational studies: case-control; cohort
  Intervention studies
A. Descriptive studies
Mainly concerned with the distribution of diseases with respect to time, place
and person.
Useful for health managers to allocate resources and to plan effective
prevention programmes.
Useful to generate epidemiological hypotheses, an important first step in the
search for disease determinants or risk factors.
Can use routinely collected information, which is readily available in many
places. Descriptive studies are therefore generally less expensive and less
time-consuming than analytic studies.
It is the most common type of epidemiological design strategy in medical
literature.
There are three main types:
Correlational
Case report or case series
Cross-sectional
e.g. Hypertension rates and average per capita salt consumption compared between
two communities.
Average per capita fat consumption and breast cancer rates compared between
two communities.
Mortality from CHD in relation to per capita cigarette sales among the regions of
Ethiopia.
Strength: Can be done quickly and inexpensively, often using available data.
Limitation:
i. Inability to link exposure with disease. Data on exposure and outcome are
not linked at the individual level. For example, in a society with high fat
intake, perhaps it is the individual women with low intake that get breast
cancer. Similarly, when reduced mortality from cervical cancer is associated with
Pap smear screening, it is difficult to know whether the reduction really
occurred among the women who were screened.
iii. It may mask a non-linear relationship between exposure and disease. For
example alcohol consumption and mortality from CHD have a non-linear
relationship (the curve is “J” shaped), but this type of relationship is
impossible to demonstrate in correlational studies.
Describes the experience of a single patient or a group of patients with a similar diagnosis. Has
limited value, but is occasionally revolutionary.
E.g. Five young homosexual men with PCP seen between Oct. 1980 and May
1981 in Los Angeles aroused concern among physicians. Later, with further
follow-up and thorough investigation of the strange occurrence of the
disease, the diagnosis of AIDS was established for the first time.
Limitations: The report is based on a single patient or a few patients, whose condition could
have arisen by coincidence. There is no appropriate comparison group.
Information about the status of an individual with respect to the presence or absence of
exposure and disease is assessed at the same point in time. Easy to do; many surveys are
of this type.
For factors that remain unaltered over time, such as sex, race or blood group, the
cross-sectional survey can provide evidence of a valid statistical association.
Useful for raising the question of the presence of an association rather than for testing a
hypothesis.
Limitation: “chicken or egg” dilemma – difficult to know which occurred first, the
determinant/ exposure or the outcome. Therefore, difficult to distinguish whether the
exposure preceded the development of the disease or whether presence of the disease
affected the individual’s level of exposure.
contraception are more likely to use it. So you may want to educate
women about it, believing that this will lead to higher rate of use. The
problem is, did the women know about it and then start to use it, or did
they learn about it because they were using it?
B. ANALYTIC STUDIES
i. Cohort
Subjects are selected by exposure, or determinants of interest, and followed to see
if they develop the disease or outcome of interest.
E.g. Take Awrajas with trained managers and Awrajas with untrained
managers and follow them to see which group does better at increasing
coverage.
Follow 100 children who received BCG vaccination and another 100 who
didn’t get BCG vaccination and see how many of them get tuberculosis.
E.g. Take people with and without TB, ask them if they ever had BCG vaccination.
Take Awrajas with high and low EPI rates, ask them if their Awraja health
The researcher does something about the disease or exposure and observes the
changes.
The investigator has control over who gets the exposure and who does not. The key is
that the investigator assigns subjects to either group, whether this is done randomly or
not.
Always prospective.
E.g. Assign children randomly to get chloroquine or not, and see how many develop
symptomatic malaria.
CASE-CONTROL STUDIES
An epidemiologic research method in which the two study groups are selected on their
disease status. This is a design strategy developed in response to the difficulty of studying
diseases with very long latency period. The design is capable of evaluating the
association of a disease to exposure many years after the actual exposure. Because of this
and its efficiency in time and cost, case-control studies have become the most common
analytic design encountered in the medical literature. The prototype study on lung cancer and
smoking was done in the 1950s.
In the design of the study, always seek comparability between cases and controls;
this is the basis for a valid conclusion.
Defining Cases:
E.g. “Uterine Ca” before 1940 included both Ca of the body of the uterus and Ca of the cervix.
In studies of “congenital malformation” and drug use, specify the malformation.
The Jones criteria for the diagnosis of rheumatic fever are a good example.
If you are not certain about the diagnosis, and if the information collected is
adequate, perform the analysis separately for cases classified as definite, probable or
possible.
Selection of Cases:
Do not always aim for a random, representative sample of cases; it is better to restrict
yourself to cases on which you can get complete and reliable information. Select controls
that are comparable to the cases entered into the study; do not try to represent the
population of all non-diseased persons.
Hospital-based: easy and inexpensive to conduct, but prone to selection bias.
Prevalent cases:
- Increase sample size available for rare disease.
- Difficult to establish temporal sequence between exposure and outcome. E.g.
Coffee consumption and peptic ulcer disease.
Incident cases:
Selection of controls:
There is no control group that is optimal for all situations. Controls are chosen for a
particular group of cases; do not try to represent the entire non-diseased population, but
try to achieve comparability between the cases and controls. Besides comparability,
the selection of controls should take practicability and cost into account.
The control series is intended to provide an estimate of the exposure rate that would be
expected to occur in the cases if there were no association between the study disease and
exposure.
Sources of controls.
1. Hospital Controls
Advantages:
- Easily identified and readily available in sufficient number, at lower cost
than population controls.
- More likely than healthy individuals to be aware of antecedent exposures or
events, which minimizes recall bias.
- Likely to have been subject to the same intangible selection
factors that influence cases to come to this particular physician or hospital,
which minimizes selection bias.
- More likely to be cooperative because they anticipate benefit from their
involvement or think that it is related to their illness, which reduces bias
due to non-response.
Disadvantages:
- Because they are ill, they differ from healthy individuals in many ways.
Several studies in the West have demonstrated that hospitalized patients are
more likely to smoke cigarettes, use oral contraceptives, and be heavy drinkers
of alcohol than non-hospitalized individuals.
- There is a danger of altering the direction of association or masking a true
association between the exposure and outcome of interest. Patients with diseases
known to be associated, either positively or negatively, with the exposure of
interest should be excluded from the control series. For example, in studying
the association of cigarette smoking and lung cancer, individuals with other
respiratory illnesses should not be taken as controls, since smoking is also
known to have some association with other respiratory illnesses.
2. Population controls
Advantages:
- Generalizability is possible
- Good when cases are selected to represent affected individuals in a defined
population. For example, if the cases at a particular hospital come from
a geographically defined area, controls can be selected from the entire population
of that area.
Disadvantages:
3. Special controls
Special controls are individuals who are related to the cases in some way, such as
friends, household members (siblings,…), neighbours,…
Advantages:
- They are healthy
- More likely to be cooperative than members of the general population,
because of their interest in the cases.
- Offer a degree of control over some confounding factors, such as ethnicity,
socioeconomic status, or environment.
Disadvantage:
- If the study factor is likely to be similar in cases and controls, an underestimate of the
true effect of the exposure of interest may result. E.g. if the study factor is
diet and the controls are siblings, it will be similar for both cases and controls.
A single control group is optimal in most situations. Add more control groups only
when you are not confident in the control group, when you see a clear deficiency in
it, or when another control group offers a clear advantage.
As the number of controls per case increases, the power of the study also increases.
Beyond a control-to-case ratio of 4:1, however, the gain in statistical power is too
small to justify the expenditure of additional resources, so 4:1 is generally taken as
the practical upper limit.
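The diminishing return can be illustrated with a commonly quoted rule of thumb that does not appear in this note and should be treated as an assumption: with r controls per case, a study achieves roughly r/(r+1) of the statistical efficiency it would have with unlimited controls, when the true effect is small and cases and controls cost the same.

```python
def relative_efficiency(controls_per_case):
    """Approximate efficiency relative to unlimited controls: r / (r + 1)."""
    return controls_per_case / (controls_per_case + 1)


ratios = [1, 2, 3, 4, 5, 10]
efficiency = {r: relative_efficiency(r) for r in ratios}

previous = None
for r in ratios:
    gain = "" if previous is None else f" (gain {efficiency[r] - previous:+.3f})"
    print(f"{r}:1 -> {efficiency[r]:.3f}{gain}")
    previous = efficiency[r]
```

Under this approximation the step from one to two controls per case adds far more than the step from four to five, which is consistent with the advice above.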
- The ability to obtain exposure information from records completed before the
occurrence of outcome events is especially valuable. E.g. record of X-ray
during pregnancy in studying its effect on the child (congenital
malformation).
- Ascertainment of exposure should involve defining the part of a person’s
exposure history that could be relevant to the etiology of the disease under
study.
E.g. Smoking & lung Ca – duration of smoking is more important than the amount
currently smoked.
Smoking & Myocardial infarction – current smoking is most important.
Collect information in such a way that it allows you to identify the most
appropriate time window for the evaluation of the possible harmful effects
of an exposure – try to avoid collecting information over too wide a
period, such as “ever use” in order to avoid the inclusion of some period in
time that cannot be causally related to the disease.
Issues in analysis
COHORT STUDIES
A design in which the two groups are defined according to their exposure status to a
suspected risk factor for a disease. The two groups should be free of the study outcome.
Types: There are two types of cohort studies, prospective and retrospective, depending on
the temporal relationship between the initiation of the study and the occurrence of
the disease.
1. Prospective- At the beginning of the study the outcome has not yet occurred.
Regarded as more reliable than the retrospective, if the sample size is
large and follow-up complete.
2. Retrospective- Both exposure and outcome have already occurred at the beginning of
the study. Efficient in cost and time. Often uses data collected for
other purposes, so the information obtained might be incomplete and not
comparable across subjects.
[Figure: exposed and unexposed groups and their outcomes on a time axis. In the retrospective design, both exposure and outcome have occurred before the beginning of the study; in the prospective design, the outcome occurs after it.]
Selection of exposed group should consider scientific and feasibility issues which
include:
Selecting a high-risk group also allows the evaluation of a rare disease.
Although cohort studies are in general not optimal for the evaluation of rare
diseases, the design can be used efficiently if the outcome of interest is
relatively common among those exposed, i.e., if the attributable risk percent
is high.
Always attempt to select a control group whose characteristics are comparable to those of the
exposed population. There is no single optimal control group for every
circumstance.
Source of data
The major consideration should be the availability of accurate and complete information
on the exposure and outcome of interest, obtained in a way that is comparable for
both study groups.
Exposure ascertainment:
Advantages:
- Can provide information for a high proportion of the cohort.
- Relatively inexpensive to obtain.
- Allows objective and unbiased classification of exposure status.
Disadvantages:
- Information on exposure level may be insufficient.
- May not contain adequate information on potential confounders.
Advantages:
- Makes it possible to record exposure information that is not routinely recorded,
particularly lifestyle factors.
Disadvantages:
- Potential for information bias, particularly recall bias. In such situations, where
objective sources cannot be used, it is important that information is obtained
in a comparable manner for all participants.
Outcome ascertainment:
With adequate consideration to the resources available for the study, the aim is to obtain
complete, comparable and unbiased information on the subsequent health experience of
every study subject. One or a combination of the following sources could be used: routine
surveillance, death certificates, periodic health examinations, autopsy records,
hospital records, etc.
Always try to have firm outcome criteria and a standard diagnostic procedure that are
applied equally to exposed and non-exposed individuals. Do not perform a diagnostic
examination in only one group, because any difference observed could
be due simply to the greater opportunity to be diagnosed.
Follow-up
This is the major challenge in cohort studies, as well as the major cost in terms of time.
Unless complete or nearly complete information can be obtained, the results might be
uninterpretable. If the loss to follow-up is not comparable between the two exposure
groups, this will also be a source of bias. Therefore, if a long follow-up
period is needed, the mechanism for achieving complete follow-up should be considered
carefully in the planning of the study.
Analysis
Role of bias:
Random misclassification, or error unrelated to the outcomes of interest, may not affect
comparability; rather, it dilutes any true association that may exist
between the exposure and outcome. As a result, the observed RR estimate will tend to be
biased towards the null value of 1. On the other hand, differential misclassification can
result in a biased risk estimate that is either an underestimate, an overestimate, or, by
chance, the same as the true measure of association.
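The dilution caused by random (non-differential) misclassification can be seen in a small expected-value calculation. The sketch assumes a binary exposure that is misclassified with the same sensitivity and specificity regardless of disease status; the cohort counts and the error rates are invented for illustration.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


def misclassify_exposure(cases_exp, n_exp, cases_unexp, n_unexp,
                         sensitivity, specificity):
    """Expected counts after exposure is misclassified independently of disease."""
    obs_cases_exp = sensitivity * cases_exp + (1 - specificity) * cases_unexp
    obs_n_exp = sensitivity * n_exp + (1 - specificity) * n_unexp
    obs_cases_unexp = (1 - sensitivity) * cases_exp + specificity * cases_unexp
    obs_n_unexp = (1 - sensitivity) * n_exp + specificity * n_unexp
    return obs_cases_exp, obs_n_exp, obs_cases_unexp, obs_n_unexp


# Hypothetical true cohort: 200/1000 diseased if exposed, 100/1000 if unexposed.
true_counts = (200, 1000, 100, 1000)
rr_true = risk_ratio(*true_counts)

observed_counts = misclassify_exposure(*true_counts, sensitivity=0.8, specificity=0.9)
rr_observed = risk_ratio(*observed_counts)

print(round(rr_true, 3), round(rr_observed, 3))
```

The observed relative risk lies between the true value and 1, which is the bias towards the null described above.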
Because of the difficulty to know which factors are related to loss, the best way to
eliminate bias is by reducing loss to follow-up to an absolute minimum.
For losses:
- Try to get at least mortality status from other sources.
- Examine previously collected data to determine whether there are systematic
differences between those lost and those followed up.
- Indirectly calculate exposure-disease association, assuming the two extreme
outcomes. One assuming all those who were lost to follow-up developed the
outcome of interest and the other assuming that none developed the outcome
–this provides a range within which the true association will lie. If losses to
follow-up are large, the observed range will be so wide as to provide little
useful information.
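The two extreme assumptions described above can be sketched as follows. The cohort sizes, losses and case counts are invented, and the function names are illustrative only.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


def extreme_case_range(exposed, unexposed):
    """Each group is a dict with 'enrolled', 'lost' and observed 'cases'."""
    # Extreme 1: nobody lost to follow-up developed the outcome.
    rr_none = risk_ratio(exposed["cases"], exposed["enrolled"],
                         unexposed["cases"], unexposed["enrolled"])
    # Extreme 2: everybody lost to follow-up developed the outcome.
    rr_all = risk_ratio(exposed["cases"] + exposed["lost"], exposed["enrolled"],
                        unexposed["cases"] + unexposed["lost"], unexposed["enrolled"])
    return min(rr_none, rr_all), max(rr_none, rr_all)


# Hypothetical cohort with unequal losses in the two groups.
exposed = {"enrolled": 1000, "lost": 200, "cases": 80}
unexposed = {"enrolled": 1000, "lost": 50, "cases": 40}

low, high = extreme_case_range(exposed, unexposed)
print(round(low, 3), round(high, 3))
```

With a fifth of the exposed group lost, the range is already wide; larger losses would widen it further.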
Effect of non-participation
This does not affect validity unless non-response is related to both the exposure and other
risk factors for the outcome under study. The effect of the difference is mainly on
generalizability of the study results.
Table 4.2 Advantages and Limitations of Cohort and Case-Control Study Designs.

Case-Control                                  Cohort
Advantages
Optimal for the evaluation of rare            Valuable when the exposure is rare
diseases
Can examine multiple etiologic factors        Can examine multiple effects of a
for a single disease                          single exposure
Quick and inexpensive                         Can elucidate temporal relationship
Relatively simple to carry out                Allows direct measurement of risk
Guarantees the number of persons with         Minimizes bias in ascertainment of
the disease                                   exposure
Limitations
Inefficient for the evaluation of rare        Inefficient for the evaluation of rare
exposures                                     diseases
Cannot directly compute risk                  Expensive
                                              Time consuming
OR = ad / bc
AR% = (OR - 1) / OR X 100
RR = Risk in exposed / Risk in unexposed
AR% = (AR / Ie) X 100
PAR% = (PAR / Incidence rate of disease in population) X 100
Rate Ratio: measures the strength of association between an exposure and disease and
provides information that can be used to judge whether a valid observed
association is likely to be causal.
Attributable Risk: measures the public health impact of an exposure, assuming that the
association is one of cause and effect.
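As a worked illustration of these measures, the sketch below computes them from a 2 x 2 table of a cohort study. The cell layout (a = exposed with disease, b = exposed without disease, c = unexposed with disease, d = unexposed without disease), the function name, and the counts are assumptions made for this example. Ie, I0 and It denote the incidence in the exposed, the unexposed, and the total population.

```python
def association_measures(a, b, c, d):
    """Measures of association and impact from a cohort 2 x 2 table.

    a: exposed, diseased      b: exposed, not diseased
    c: unexposed, diseased    d: unexposed, not diseased
    """
    ie = a / (a + b)                   # incidence in the exposed
    i0 = c / (c + d)                   # incidence in the unexposed
    it = (a + c) / (a + b + c + d)     # incidence in the total population
    ar = ie - i0                       # attributable risk
    par = it - i0                      # population attributable risk
    return {
        "RR": ie / i0,
        "OR": (a * d) / (b * c),
        "AR": ar,
        "AR%": ar / ie * 100,
        "PAR": par,
        "PAR%": par / it * 100,
    }


# Hypothetical cohort: 100 exposed and 100 unexposed subjects.
measures = association_measures(a=30, b=70, c=10, d=90)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

In this hypothetical table the odds ratio (about 3.9) is larger than the relative risk (3.0) because the disease is not rare.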
Relative and attributable risks of mortality from lung cancer and coronary heart disease
among cigarette smokers in a cohort of British male physicians
The above study demonstrated a 14-fold increased death rate from lung cancer among
smokers compared with nonsmokers. The relative risk of CHD mortality among current
smokers compared with nonsmokers was 1.6. Thus, cigarette smoking is a much stronger
risk factor for mortality from lung cancer than coronary heart disease. However, if
smoking is causally related to both diseases, the elimination of cigarettes would prevent
far more deaths among smokers from coronary heart disease than from lung cancer, as
shown by the attributable risks of 256/100,000 and 130/100,000, respectively. The
explanation for this is that while death from lung cancer is a relatively rare occurrence,
accounting for only 10 deaths/100,000 population each year among nonsmokers, the
annual death rate of coronary heart disease in that same group is 413/100,000.
Consequently, even a 60% increased risk of CHD mortality associated with cigarette
smoking will affect a much larger number of people than a 14-fold increased risk of death
from lung cancer. Thus, the potential public health impact of smoking cessation on
mortality will be far greater for coronary heart disease than for lung cancer.
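The arithmetic behind this comparison can be checked with the figures quoted in the paragraph above. The death rates among smokers are not quoted in the text; in this sketch they are derived as the nonsmoker rate plus the attributable risk, and the variable names are illustrative only.

```python
# Annual death rates per 100,000, as quoted in the text.
rate_nonsmokers = {"lung cancer": 10, "coronary heart disease": 413}
attributable_risk = {"lung cancer": 130, "coronary heart disease": 256}

relative_risk = {}
for disease, i0 in rate_nonsmokers.items():
    ar = attributable_risk[disease]
    ie = i0 + ar                      # derived rate among smokers (AR = Ie - I0)
    relative_risk[disease] = ie / i0
    print(f"{disease}: smokers {ie}/100,000, "
          f"RR = {relative_risk[disease]:.1f}, AR = {ar}/100,000")
```

The disease with the smaller relative risk has the larger attributable risk because its baseline rate among nonsmokers is about 40 times higher.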
INTERVENTION STUDIES
This is an epidemiologic design that closely resembles the controlled experiment used in
basic science research, and it can produce high quality data if done properly. The main
distinction from other types of analytic studies is that individuals are allocated to the
experimental or control group by the investigators.
Classification
1. Based on population
A. Clinical trial - usually performed in a clinical setting and the subjects are patients.
B. Field trial - used in testing medicines for preventive purposes and the subjects
are healthy people. E.g. vaccine trial
C. Community trial - the unit of study is a group of people/community. E.g. fluoridation
of water to prevent dental caries.
2. Based on design
C. Randomized controlled - there is control group and allocation into either group is
randomized.
3. Based on objective
A. Phase I - trial on a small number of subjects to test a new drug at a small dosage
to determine the toxic effect.
B. Phase II - trial on a small group to determine the therapeutic effect.
C. Phase III - study on a large population – usually a randomized controlled trial.
For intervention studies to represent the “gold standard” for epidemiologic research,
the following should be considered:
Reference population: The general group to whom investigators expect the results of the
Experimental population: The actual group in which the trial is conducted. It is preferable
that this group not differ from the reference population, for the sake of
generalizability, but this should not be the main concern.
Allocation into either group must be done after determining eligibility and getting
consent. It is always advantageous to do the allocation at random.
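The order of these steps (eligibility, then consent, then random allocation) can be sketched as follows. This is a minimal shuffle-and-split scheme with equal group sizes, chosen for illustration and not a method prescribed in these notes. The record keys (id, eligible, consented), the screening pattern, and the seed are all hypothetical.

```python
import random


def allocate(screened, seed=None):
    """Randomly split eligible, consenting subjects into two equal-sized groups.

    `screened` is a list of dicts with hypothetical keys: id, eligible, consented.
    """
    # Allocation happens only after eligibility and consent are established.
    enrolled = [s["id"] for s in screened if s["eligible"] and s["consented"]]
    rng = random.Random(seed)  # seeded only so that the example is reproducible
    rng.shuffle(enrolled)
    half = len(enrolled) // 2
    return {"intervention": sorted(enrolled[:half]),
            "control": sorted(enrolled[half:])}


# Hypothetical screening log of 40 people.
screened = [{"id": i, "eligible": i % 5 != 0, "consented": i % 7 != 0}
            for i in range(1, 41)]
groups = allocate(screened, seed=1997)
print(len(groups["intervention"]), "allocated to intervention")
print(len(groups["control"]), "allocated to control")
```

Because the split is made by the random number generator, neither the investigator nor the subject can influence who receives the intervention.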
Subjects may deviate from the treatment protocol for various reasons after randomization,
and this is related to the length of time that subjects are expected to adhere to the
intervention, as well as to the complexity of the study protocol. It is always important to
obtain as complete follow-up information as possible on these subjects, since they will be
included in the primary analysis.
Assessment of compliance:
Noncompliance will decrease the statistical power of a trial to detect any true effect of the
study intervention. Therefore, compliance levels must be measured in any study in order to
assess its effect. Measuring compliance is not easy, and all the measures available have
inherent limitations. Some of the measures are:
. Self-report, the simplest measure and the only way to assess behavioural modification
and exercise programs.
. Pill count – ask participants to bring unused pills to each clinic visit. This may
eliminate inaccuracies due to poor memory, but it assumes that all the unreturned
pills have been ingested.
. Biochemical tests
. used to validate self-report
. objective but expensive and logistically difficult – Riboflavin is a safe
biochemical marker that has been added to the treatment. However, it can
only reflect the ingestion of the pills in the preceding day or two and thus
cannot be used as a reliable measure of long-term compliance.
It is inevitable that some portion of participants in a trial will become noncompliant
despite all reasonable efforts. In any case, it is important to obtain as complete follow-up
information as possible since they will be included in the primary analysis.
Ascertainment of outcome
Use uniform ascertainment of outcome over the complete follow-up period for all study
subjects. To eliminate possible bias, maintain a high level of follow-up, and keep the
proportion of outcomes that are not ascertained to a minimum and comparable between
the two groups. Follow-up is short in assessing the effect of acute disease and long in
assessment of chronic disease outcomes. The difficulty in maintaining complete
ascertainment of outcome increases with increasing length of follow-up.
Placebo - an inert agent indistinguishable from the active treatment. Use of placebo
minimizes bias in the ascertainment of both subjective disease outcomes and
side effects.
Placebo effect: tendency for individuals to report favourable response to any therapy
regardless of the physiologic efficacy of what they received.
The use of placebo ensures that all aspects of the intervention offered to participants are
identical except for the actual experimental treatment. With no placebo, it is impossible
to tell whether subjective outcomes are due to the actual trial treatments, to the extra
attention participants receive, or merely to their belief that the treatment will help.
The primary strength of a double-blind design is to eliminate the potential for observation
bias. Of course, a concomitant limitation is that such trials are usually more complex and
difficult to conduct. Circumstances in which double-blinding is not possible are
evaluation of programs involving substantial changes in life-style, such as exercise,
cigarette smoking or diet, surgical procedures, or drugs with characteristic side effects.
- subjects who are not on the new or experimental program may become
dissatisfied and dropout of the trial, thus resulting in differential compliance
or loss to follow-up.
- Knowledge of the group to which the participant has been assigned might
raise the potential for observation bias in the reporting of side
effects or assessment of outcome.
Randomization
Use of placebo
Double Blinding
To assure the welfare of the participants is protected, interim results should be monitored
by a group that is independent of the investigators conducting the trial. Consider
termination if the interim results indicate a clear and extreme benefit on the primary end
point due to intervention, or if one treatment is clearly harmful. It would also be unethical
to stop a trial prematurely based solely on emerging trends from a small number of
patients – the aim must be to achieve an equitable balance between, on the one hand,
protection of randomized participants against real harm and, on the other, minimizing the
risk of mistakenly modifying or stopping the trial prematurely.
The statistical power of a trial to detect a postulated difference between treatment groups,
if one truly exists, is dependent on:
1. Sample Size
Trials with inadequate sample size have great potential for scientific harm, which
may result from misinterpretation of their findings. It is always advisable to take a
sample large enough to detect a small to moderate (10-20%) benefit or difference resulting
from the intervention.
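To give a sense of the numbers involved, the sketch below uses a standard normal-approximation formula for comparing two proportions, n per group = (z_alpha/2 + z_beta)^2 [p1(1 - p1) + p2(1 - p2)] / (p1 - p2)^2. The formula, the 5% two-sided significance level, the 80% power, and the event rates are assumptions chosen for illustration and are not taken from these notes.

```python
import math
from statistics import NormalDist


def n_per_group(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to compare two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return math.ceil((z_alpha + z_beta) ** 2 * variance
                     / (p_control - p_treated) ** 2)


# Hypothetical trial: 10% of controls are expected to reach the end point.
n_20_percent = n_per_group(0.10, 0.08)   # intervention reduces risk by 20%
n_10_percent = n_per_group(0.10, 0.09)   # intervention reduces risk by 10%
print("Per group, 20% reduction:", n_20_percent)
print("Per group, 10% reduction:", n_10_percent)
```

Halving the size of the benefit to be detected roughly quadruples the required sample, which is why trials aimed at moderate benefits need to be large.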
There are at least two major strategies to obtain adequate numbers of end points:
a. Selection of a high-risk population
It is always better to assume that the actual rate of occurrence of end points
will be less than the projected level, which could be due to the low incidence
of the outcome of interest in the volunteer study population. This is referred to
as the “healthy volunteer effect” – the only way to compensate for this deficit is to
extend the length of follow-up to get more events.
Secular changes in disease rates during the course of the trial might
sometimes be as great as that due to the intervention. E.g. during the decade in
which the MRFIT trial was conducted, the entire U.S. population, including all
MRFIT participants, experienced a marked 25 to 30% decline in Coronary
Heart Disease (CHD) mortality. As a result, the expected number of deaths in
the trial was lower by two-thirds, so the follow-up was extended to increase the
number of end-points (the outcome).
Consider the postulated mechanism by which the study agent (the exposure)
exerts its effect in deciding the length of follow-up period. That is, how long
will it take for the study agent to exert its effect on the end result.
3. Effect of Compliance
One strategy to increase compliance is to use a “run-in” or “wash-out” period prior to the
actual randomization: all participants receive either the active treatment or the placebo
for a number of weeks or months before formal randomization to a treatment group. The
only limitation of this strategy is that it limits generalization of the study results to
the reference or general population, but the primary goal of a trial must be to attain a
valid result.
The analysis is basically the same as that of cohort studies. The fundamental comparison to
estimate the true benefit of the intervention program should be obtained by
analysing the data by intention to treat (“once randomized, always analyzed”), so always
maintain a high level of compliance, keep losses to follow-up to a minimum, and collect
information on all randomized subjects.
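The difference between an intention-to-treat analysis and an analysis restricted to compliers can be illustrated with a small hypothetical data set. In the sketch below each record is (assigned group, complied, developed outcome). The counts are invented so that noncompliers in the treatment group are at higher risk of the outcome, and the treatment itself is assumed to have no effect.

```python
# Hypothetical trial records: (assigned group, complied, developed outcome).
subjects = (
    [("treatment", True, False)] * 70 + [("treatment", True, True)] * 10
    + [("treatment", False, True)] * 10 + [("treatment", False, False)] * 10
    + [("control", True, True)] * 20 + [("control", True, False)] * 80
)


def risk(records):
    """Proportion of records in which the outcome occurred."""
    return sum(outcome for _, _, outcome in records) / len(records)


treated = [r for r in subjects if r[0] == "treatment"]
controls = [r for r in subjects if r[0] == "control"]

# Intention to treat: everyone is analysed in the group they were randomized to.
itt_rr = risk(treated) / risk(controls)

# Compliers only: noncompliant subjects are dropped from the treatment group.
compliers_rr = risk([r for r in treated if r[1]]) / risk(controls)

print(f"Intention-to-treat RR: {itt_rr:.3f}")
print(f"Compliers-only RR:     {compliers_rr:.3f}")
```

Here the compliers-only analysis suggests a benefit (RR of about 0.6) that disappears when all randomized subjects are analysed, because compliance was related to the risk of the outcome.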
Reasons:
1. Noncompliance may be related to factors that also affect the risk of the outcome
under the study, and failure to analyze data on all randomized participants could
introduce bias. In most studies, perfect compliers represent only a fraction of the total
study population.
2. Analysis of compliers’ data alone does not address the actual research question posed in an
intervention study. First, it is only the entire groups allocated by randomization that
are truly comparable – so preserve the power of randomization by analysing the entire
3. Rule out other possible alternative explanations for the observed findings. Alternative
explanations for the observed result in any analytic epidemiological study include:
3.1. Chance
. Obtaining an adequate sample size for the study reduces the likelihood of
chance as a possible explanation.
. A statistically significant finding leaves little room for chance.
3.2. Bias
. Selection bias is best eliminated by randomization
. Information bias can be eliminated by:
. using blinding procedures
. using standard and comparable exposure and outcome ascertainment in both
groups.
3.3. Confounding
. Ways to control for confounding include:
. use appropriate analytic tools to control known confounding factors –
multivariate analysis
. control for known and unknown confounders can be best achieved by
randomization
. matching, if properly applied, is another method used for control of known
confounders
. compare basic socio-demographic characteristics to assure that balance was
achieved.
OBSERVED ASSOCIATION
        |
Could it be due to selection or measurement bias?
        | NO
Could it be due to confounding?
        | NO
Could it be a result of chance?
        | PROBABLY NOT
Could it be causal?
Apply guidelines and make judgement
Accuracy of Measurement
Validity is the extent to which data collected actually reflect the truth. The concepts of
sensitivity (ability to detect true positives) and specificity (ability to detect true negatives)
can be used to characterize the validity of a measure (“measurement validity”). Study
results are also described as “valid” when there is no systematic misrepresentation of
effect or “bias” (“validity in the estimation of effect”). Validity is often described as
internal or external.
Internal validity concerns the validity of inferences that do not proceed beyond
the target population for the study. Internal validity is threatened when the
investigator does not have sufficient data to control or rule out competing
explanations for the results.
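Sensitivity and specificity, mentioned above as measures of measurement validity, are computed by comparing a measure against the truth. The sketch below uses hypothetical counts. The function and argument names are illustrative.

```python
def sensitivity_specificity(true_pos, false_neg, false_pos, true_neg):
    """Sensitivity and specificity of a measure compared with the truth."""
    sensitivity = true_pos / (true_pos + false_neg)   # true positives detected
    specificity = true_neg / (true_neg + false_pos)   # true negatives detected
    return sensitivity, specificity


# Hypothetical validation: 100 people truly diseased, 900 truly not diseased.
sens, spec = sensitivity_specificity(true_pos=80, false_neg=20,
                                     false_pos=30, true_neg=870)
print(f"Sensitivity: {sens:.1%}")
print(f"Specificity: {spec:.1%}")
```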
Precision, on the other hand, describes the extent to which random error (i.e., sampling
variation and the statistical characteristics of the estimator) alters the measurement of
effects. Misclassification may result in problems with either validity (due to systematic
misclassification bias attributable to methodological aspects of study design or analysis) or
precision (due to random misclassification error attributable to sampling variation).
Random misclassification errors always bias measures of relative risk toward one.
Systematic misclassification bias can either increase or decrease the strength of the
measured association.
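The pull of random (nondifferential) misclassification toward one can be shown with expected values. In this sketch the outcome is measured with the same imperfect sensitivity and specificity in the exposed and unexposed groups. The true risks and the accuracy figures are hypothetical.

```python
def observed_risk(true_risk, sensitivity, specificity):
    """Expected proportion classified as diseased under imperfect measurement."""
    true_positives = sensitivity * true_risk
    false_positives = (1 - specificity) * (1 - true_risk)
    return true_positives + false_positives


# Hypothetical true risks: 30% in the exposed, 10% in the unexposed (RR = 3).
true_rr = 0.30 / 0.10

# The same error rates apply to both groups, i.e. misclassification is random.
risk_exposed = observed_risk(0.30, sensitivity=0.8, specificity=0.9)
risk_unexposed = observed_risk(0.10, sensitivity=0.8, specificity=0.9)
observed_rr = risk_exposed / risk_unexposed

print(f"True RR: {true_rr:.2f}")
print(f"Observed RR: {observed_rr:.2f}")
```

The observed relative risk (about 1.8) lies between the null value of 1 and the true value of 3.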
A. Bias
Bias may result from systematic error (or difference between exposed and
unexposed populations or between cases and controls) in the collection, recording,
analysis, or interpretation of data. Evaluating the role of bias as an alternative
explanation for an observed association is a necessary step in interpreting any
study result. Unlike chance (including lack of precision) and confounding, which
can be evaluated quantitatively, the effects of bias are far more difficult to
evaluate and may even be impossible to take into account in the analysis. For this
reason, it is important to design and conduct studies in such a way that every
possibility for introducing bias has been taken into account and to take steps to
1) Selection bias refers to any error that arises in the process of identifying the study
populations. Selection bias can occur whenever the identification of individual
subjects for inclusion in the study on the basis of either exposure (cohort) or disease
(case-control) status depends in some way on the other axis of interest.
1) Berkson’s bias- Case-control studies carried out exclusively in hospital settings are
subject to selection bias attributable to the fact that risks of
hospitalization can combine in patients who have more than one
condition.
4) Loss to follow-up- This is a major source of bias in cohort studies. Persons lost to
follow-up may differ from those who remain with respect to both
exposure and outcome, biasing any observed association.
6) Cohort bias – Refers to the biased view of the natural history of disease presented in
survival cohorts, since only the prevalent cases (those with less lethal
disease) are available for study in the latter part of the period of
observation.
1) Interviewer bias This can occur if the interviewer or examiner is aware of the
disease status (in a case-control study) or the exposure status (in
cohort and experimental studies). This kind of bias may affect
every kind of epidemiologic study.
2) Recall bias May result because affected persons may be more (or less) likely
to recall an exposure than healthy subjects, or exposed persons
more (or less) likely to report disease. This source of bias is more
problematic in retrospective cohort or case-control studies.
3) Social desirability bias- Occurs because subjects are systematically more likely to
provide a socially acceptable response.
4) Hawthorne effect Refers to the changes in the dependent variable which may be due
to the process of measurement or observation itself.
6) Regression to the mean- Refers to the statistical phenomenon that extreme values will
tend to “regress” to more average values. Thus a change
from a very high or very low value in the dependent
variable may be attributable to simple random variation,
rather than to changes in the independent variable.
7) Healthy worker bias - Refers to the bias in occupational health studies which
tends to underestimate the risk associated with an
occupation due to the fact that employed people tend to be
healthier than the general population.
9) Length/time bias- Occurs in studies of screening tests for cancer. This occurs due to
the fact that screening tests for cancer tend to detect more slow-
growing tumors with a better prognosis (since faster growing
tumors are more often detected because they cause symptoms). As
a result, the mortality rate of cancers found on screening will
1) Choose study design carefully. If ethical and feasible, a randomized double blind trial
has the least potential for bias. If loss to follow-up will not be substantial, a
prospective cohort study may have less bias than a case-control study. Controls for
case-control studies should be maximally comparable to cases except for the variable
under study.
2) Choose “hard” (i.e., objective) rather than subjective outcomes.
3) “Blind” interviewers or examiners wherever possible.
4) Use well-defined criteria for identifying a “case” and use closed-ended questions
whenever possible.
5) Collect data on variables you do not expect to differ between the two groups. If such
a “dummy” variable regarding exposure, for example, in a case-control study shows
an unexpected difference, it may alert you to recall bias.
B. Confounding
2. Associated with the study exposure but not as a consequence of the exposure.
Effect of Confounding
Without prior knowledge of the effect of the variable on the outcome and exposure it is
very difficult to predict the direction of effect of a suspected confounding variable.
However, the effect could be categorized into three:
The list of potential confounders in a study is limited to established risk factors for the
disease of interest. Other variables may still play a confounding role in the
association, but it might be difficult to identify them and explain their effects.
Standardization
Stratification/pooling
Multivariate analysis
C. Chance
One of the alternative explanations for the observed association between an exposure and a
disease is chance. Since the general aim of epidemiological studies is to make
generalization about a larger group of individuals on the basis of a sample population it is
always important to evaluate the role of chance or sampling variability in any study
which tries to elucidate association. Evaluation of the role of chance is mainly the domain
of statistics and it involves:
Test of statistical significance quantifies the degree to which sampling variability may
account for the observed results. The “P value” is used to indicate the probability or
likelihood of obtaining a result at least as extreme as that observed in a study by chance
alone, assuming that there is truly no association between exposure and outcome under
consideration (i.e., H0 is true). For medical research, a P value < 0.05 is set
conventionally to indicate statistical significance.
Sample size
This implies that even a very small difference may be statistically significant if the
sample size is sufficiently large, and a large difference may not achieve statistical
significance if variability is substantial due to a small sample size. Hence, one cannot
make a definite decision about the role of a factor based only on the P value.
1. Assume that the exposure is not related to disease – state the null hypothesis.
2. Compute a measure of association – relative risk or odds ratio.
3. Calculate the chi-square statistical test of significance.
4. For the value of chi-square calculated, look up its corresponding P-value in the
table of chi-squares.
A very small P-value means that you are very unlikely to observe such an
association if the null hypothesis is true.
The confidence interval represents the range within which the true magnitude of effect
lies with a certain degree of assurance. It is more informative than the P value alone
because it reflects both the size of the sample and the magnitude of the effect.
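The steps above and the confidence interval can be sketched for a hypothetical 2 x 2 cohort table (a = exposed with disease, b = exposed without disease, c = unexposed with disease, d = unexposed without disease). Several choices here are assumptions of this example and not taken from these notes: the uncorrected Pearson chi-square, a computed P value in place of a table look-up, and the log-based 95% confidence interval for the relative risk.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square statistic for a 2 x 2 table."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


def rr_with_confidence_interval(a, b, c, d, z=1.96):
    """Relative risk with an approximate 95% confidence interval (log method)."""
    rr = (a / (a + b)) / (c / (c + d))
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return rr, rr * math.exp(-z * se_log_rr), rr * math.exp(z * se_log_rr)


# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop disease.
chi2 = chi_square_2x2(30, 70, 10, 90)
p = p_value_1df(chi2)
rr, lower, upper = rr_with_confidence_interval(30, 70, 10, 90)
print(f"Chi-square = {chi2:.2f}, P = {p:.4f}")
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The interval excludes 1, in agreement with the small P value, and its width shows how precisely the relative risk has been estimated.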
D. Establishing a Causal Association
examine the totality of evidence from all available studies and make a judgement about
the likelihood of a cause-effect relationship.
Judgements of causality must first consider whether, for any individual study, the
observed association is valid (i.e., whether the findings reflect the true relationship
between exposure and disease or may be explained by chance, bias, or confounding) and,
second, whether the accumulated evidence supports a cause-effect relationship. The
validity of an observed association is established by eliminating alternative explanations
of that association. Associations can be:
a) the result of chance variation (i.e., type 1 error). Statistical tests and confidence
intervals can help to evaluate the likelihood of this as an explanation for an
association.
b) The result of bias, or systematic error in the design or conduct of the study.
Examples of how bias can lead to artifactual (i.e., not real) associations are
presented below.
3. Causal associations, which can be established only when other potential explanations
of the association can be ruled out.
In observational studies, there are many potential confounders and sources of bias, some
of which may remain undetected. The results of one observational study rarely provide
adequate support for concluding that there is a cause-and-effect relationship between an
exposure and a disease. Properly conducted experimental trials do provide direct proof of
causality, yet are often impossible because of ethical considerations.
In the absence of experimental evidence, the following criteria (called the Bradford-Hill
criteria) are used to assess the strength of evidence for a cause-and-effect relationship.
The criteria are listed in descending order of importance:
1. Strength of the Association – The stronger the association, the more likely
that it is causal.
2. Consistency of the Relationship – The same association should be
demonstrable in studies with different methods, conducted by different
investigators, and in different populations.
3. Specificity of the Association – The association is more likely causal if a
single exposure is linked to a single disease.
4. Temporal Relationship – The exposure to the factor must precede the onset
of the disease.
5. Dose-response relationship – The risk of disease often increases with
increasing exposure to a causal agent.
6. Biological plausibility – The hypothesis for causation should be coherent
with what is known about the biology and the descriptive epidemiology of the
disease.
1. Table
A table summarizes a set of data arranged in rows and columns. Tables are useful for
demonstrating patterns, exceptions, differences or other relationships. Tables may also
serve as the basis for preparing more visual displays of data, such as graphs and charts,
where some of the detail may be lost. Tables designed to present data should be as simple
as possible. Two or three small tables, each focusing on a different aspect of the data, are
easier to understand than a single large table that contains many details or variables. To
create a table that is self-explanatory, use the following guidelines:
Use a clear and concise title that describes the what, where, and when of the
data in the table. Precede the title with a table number.
Label each row and each column clearly and concisely and include the units
of measurement for the data.
Show totals for rows and columns. If you show percents, also give their total
(always 100).
Explain any codes, abbreviations, or symbols in a footnote.
Note any exclusions in a footnote.
Note the source of the data in a footnote if the data are not original.
Types of tables
This displays the values or categories of one variable and the number and percentage of
people falling into each category.
                        Measles
                     Yes      No
Vaccination   Yes     10      90
              No      70      30
C. Three-variable table
Although this is not much recommended, three variables can sometimes be displayed in a
table. At this point it is important to remember that elegant tables are simple and easy to
understand.
Dummy tables are prepared as part of the analysis plan to show how the data will be
organised and displayed once the data is collected. Table shells are complete except for
the data, showing titles, headings and categories. In developing table shells which include
continuous variables such as age, we create more categories than we may later use, in
order to disclose any interesting patterns.
Ordinal variables are presented according to their intrinsic natural categories. For
continuous variables, artificial categories must be created based on the purpose of the
study; however, it is advisable to create more categories (narrow intervals) in order not to
miss any interesting patterns.
Table 4 below lists some standard class intervals for age (age-groupings) used for
data presentation and analysis.
If no natural or standard class intervals are available, several strategies can be used for
creating intervals. These include:
Divide the data into groups of similar size. To apply this strategy, divide the
total number of observations by the number of intervals you wish to create
(usually 4, but you might start with 8). Next, develop a cumulative frequency
column of a rank-ordered distribution of your data to find where each interval
break would fall.
Base intervals on mean and standard deviation. With this strategy you can
create 3,4 or 6 class intervals.
Divide the range into equal class intervals. This method is most common and
simplest, and is most readily adapted to graphs.
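Two of these strategies, groups of similar size and equal class intervals, can be sketched as follows. The function names and the ages are hypothetical, and the rounding rule used to place the breaks in the rank-ordered data is one simple choice among several.

```python
def equal_size_groups(values, k):
    """Split rank-ordered values into k groups of (nearly) equal size."""
    ordered = sorted(values)
    n = len(ordered)
    breaks = [round(i * n / k) for i in range(k + 1)]   # positions in the ranking
    return [ordered[breaks[i]:breaks[i + 1]] for i in range(k)]


def equal_width_limits(values, k):
    """Limits of k class intervals of equal width covering the range."""
    low, high = min(values), max(values)
    width = (high - low) / k
    return [low + i * width for i in range(k + 1)]


# Hypothetical ages (years) of 12 cases.
ages = [1, 2, 3, 4, 5, 7, 9, 12, 15, 20, 30, 60]

groups = equal_size_groups(ages, 4)
limits = equal_width_limits(ages, 4)
for g in groups:
    print(f"{g[0]}-{g[-1]} years: {len(g)} cases")
print("Equal-width limits:", limits)
```

With skewed data such as these, equal-size groups give narrow intervals where observations are dense, while equal-width intervals leave most of the observations in the first class.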
2. Graph
2.1 Histogram
therefore, difficult to construct and are not recommended. A second variable may
be displayed using a histogram by shading each column into the component
categories of the second variable. Epidemic curves (which are not really “curves”
at all) are frequently displayed as histograms.
3. Chart
Charts are methods of illustrating statistical information using only one coordinate. They
are most appropriate for comparing data with discrete categories other than “place”.
Variables shown in bar charts are either discrete and non-continuous (e.g., race or sex) or
are treated as though they were discrete and non-continuous (e.g., age groups rather than
age intervals). The length or height (bar charts can be presented either horizontally or
vertically) of each bar is proportional to the frequency of the event in that category (and,
therefore, scale breaks should not be used). The simplest bar chart is that used to display
data from a one-variable table. This presentation makes it very easy to see the relative
importance of different variables.
A grouped bar chart can be used to illustrate data from two-variable or three-
variable tables, when an outcome has only two separate categories. Bars within a
group are usually adjoining, should be no more than three, and must be illustrated
distinctively and described in a legend.
Deviation bar charts can be used to show deviations in a variable, both positive
and negative, from a baseline.
100% component bar charts are useful for comparing the contribution of different
components to each of the categories of the main variable. This is a variation of
the stacked bar chart in which we make all the bars the same height (or length)
and show the components as percents of the total rather than as actual values.
Pie charts are simple, easily understood charts in which the size of the “slices”
shows the proportional contribution of each component part. Pie charts are useful
for showing the component parts of a single group or variable. Conventionally,
we begin at 12 o’clock and arrange the component slices from largest to smallest.
Geographic coordinate charts (maps) are used to show the location of events or
attributes. Spot maps use dots or other symbols to show where an event occurred
or a condition exists. Although it can show the geographic distribution of an
event, a spot map does not show risk because it does not take the size of the
population into account. Area maps can overcome this problem by using shaded
or coded areas to show either the incidence of an event in sub areas, or the
distribution of some condition over a geographic area. Area maps can show either
numbers or rates.
Arrange the categories that define the bars, or groups of bars, in a natural
order, such as alphabetical or by increasing age, or in an order that will
produce increasing or decreasing bar lengths.
Position the bars either vertically or horizontally, except for deviation bar
charts, in which the bars are usually positioned horizontally.
Make all of the bars the same width.
Make the length of bars in proportion to the frequency of the event. Do not
use scale breaks.
Show no more than three bars within a group of bars.
Leave a space between adjacent groups of bars, but not between bars within a
group.
Code different variables by differences in bar color, shading, cross-hatching,
etc. and include a legend that interprets your code.
Two principal types are well recognized. These are the common source and
propagated/progressive. The two types can be distinguished by plotting an epidemic
curve. An epidemic which shows the features of both types is referred to as mixed.
common source – makes a wide peak in the epidemic curve, because of the range of
exposures and range of incubation periods. Intermittent common source – results in an
irregular pattern of the epidemic curve that reflects the intermittent nature of the
exposure.
Outbreak of this type can occur through direct person-to-person transmission or the
transmission could pass through a vector from infected to healthy person.
The epidemic curve would have a successive series of peaks reflecting increasing
numbers of cases in each generation. The epidemic usually wanes after a few generations,
either because the number of susceptible falls below some critical level, or because
intervention measures become effective. In reality, few propagated outbreaks provide a
classic pattern. Diseases that have a short incubation period and are highly infectious can create
a rapidly rising and falling epidemic curve similar to that of a point source epidemic.
* When one can not distinguish the two by the epidemic curve, studying the geographic
distribution will help to differentiate them. The propagated epidemics tend to show
geographic spread with successive generations of cases.
3. Mixed Epidemics
Epidemics having the features of both common source and propagated epidemics are
referred to as mixed epidemics. For example, a common source outbreak may be followed
by secondary person-to-person spread.
Steps of an Epidemic Investigation
Before leaving for the field, an investigator must be well prepared to undertake the
investigation. Preparations can be categorized into three:
C) Consultation: Clarify your role and your team's role in the field. Identify local
contacts at the site where the outbreak is reported and arrange where and
when to meet them.
Compare the current number of cases (or incidence) with the past levels of disease in that
community, considering the seasonal variation in the occurrence of the disease, to
determine whether an excessive number of cases have occurred, i.e., compare the
observed number of cases (reported as outbreak) with the expected number of cases in the
area.
Be careful: an excess may not always indicate an outbreak. The excess may be due to
changes in local reporting procedures, a change in case definition, increased interest
because of local or national awareness, or improvements in diagnostic procedures. In areas
with sudden changes in population size, such as resort areas, college towns, and migrant
farming areas, changes in the numerator (number of reported cases) may simply reflect
changes in the denominator (size of the population). Absolute numbers (without
proportions or rates) should be carefully analyzed.
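The numerator and denominator point above can be illustrated with a minimal sketch. The function name and all figures are hypothetical, chosen only to show how a tripled case count can correspond to an unchanged rate when the population has also tripled.

```python
def incidence_rate(cases, population, per=1000):
    """Cases per `per` persons at risk (hypothetical helper for illustration)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * per / population

# Hypothetical resort town: the population triples in the peak season.
baseline = incidence_rate(cases=20, population=10_000)  # off-season
current = incidence_rate(cases=60, population=30_000)   # peak season

print("baseline rate per 1,000:", baseline)
print("current rate per 1,000: ", current)
print("apparent excess in counts:", 60 - 20)
print("excess in rate:", current - baseline)
```

In this made-up example the count rises from 20 to 60 while the rate per 1,000 stays the same, which is why counts alone can mislead in areas with changing population size.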
Review clinical and laboratory findings to establish the diagnosis. This is to ensure that the
problem has been properly diagnosed and to rule out laboratory error as the basis for the
increase in diagnoses. If you have any doubt about the laboratory findings, review the
laboratory techniques being used with a qualified laboratorian or send specimens for
confirmation to a reference laboratory.
Summarize the clinical findings with frequency distributions. They are useful in
characterizing the spectrum of the illness, verifying the diagnosis, and developing case
definitions. Visit as many patients as you can. Conversations with patients are very helpful
in generating hypotheses about disease etiology and spread. Depending on the type of the
problem under investigation, establish criteria for labeling persons as “cases”.
Often the cases that create the concern are a small and non-representative fraction
of the total number of cases. Therefore, epidemic investigators should “cast the
net wide” to determine the geographic extent of the problem and the population
affected by it. In order to do that, one must adopt methods of identifying cases that
are appropriate for the setting and the disease in question. The two types of
surveillance commonly utilized in an outbreak investigation are:
2. Active surveillance:
Regardless of the disease under investigation collect the following types of information
about every case:
By using well-established descriptive epidemiological tools, such as the epidemic curve and
the spot map, an outbreak can be characterized by time, place and person.
Epidemic curve – plots the cases by the time of onset and provides a time frame for
the outbreak investigation.
Spot map – plots the cases by location and shows the geographic spread of
cases.
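As a rough illustration of how an epidemic curve is tabulated, the sketch below counts cases by date of onset and prints a simple text histogram. The function name and the dates are hypothetical and stand in for a real line listing; one-day intervals are an assumption, and a real investigation would choose the interval to suit the incubation period of the disease.

```python
from collections import Counter
from datetime import date, timedelta

def epidemic_curve(onset_dates):
    """Return (day, case_count) pairs for every day from first to last onset."""
    if not onset_dates:
        return []
    counts = Counter(onset_dates)
    day, last = min(onset_dates), max(onset_dates)
    curve = []
    while day <= last:
        curve.append((day, counts[day]))  # Counter returns 0 for days without cases
        day += timedelta(days=1)
    return curve

# Hypothetical dates of onset taken from a line listing.
onsets = [date(1997, 3, 1), date(1997, 3, 2), date(1997, 3, 2),
          date(1997, 3, 2), date(1997, 3, 3), date(1997, 3, 5)]

for day, n in epidemic_curve(onsets):
    print(day.isoformat(), "#" * n)
```

Days with no cases are kept in the output so that gaps between generations of cases remain visible.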
Formulate the hypotheses based on your characterization of the epidemic by time, place,
and person. The hypotheses should address the source of the agent, the mode of
transmission, and the exposures that caused the disease. Determine the type of epidemic:
common source vs. propagated. Based on the characteristics of the epidemic, define the
population at the highest risk and consider the possible source(s) of the disease
(infection). The hypotheses should be testable.
Analytic approach:
The analytic technique utilizes the cohort and the case-control approaches to identify
possible sources of an outbreak. The cohort approach identifies the comparison groups based
on exposure status. The case-control method identifies the comparison groups on the
basis of their disease status.
Compute the odds ratio to find the association between cases and controls (non-ill)
with regard to exposure to the suspected cause – case-control.
Calculate the relative risk (ratio of attack rates) to determine whether there is an
association between exposure and illness among the exposed and non-exposed – cohort.
Compute statistical tests to determine how likely it is that the investigation
results could have occurred by chance alone, if exposure was not actually
related to disease.
Relative risk = [a/(a + b)] / [c/(c + d)]
Odds ratio = [a/(a + c)] / [b/(b + d)]; if a & b are small relative to c & d,
this is approximately (a/c) / (b/d) = ad/bc
Passive: ask physicians or hospitals, or both, whether they have seen similar
cases.
Although it is discussed late, intervention must start as soon as possible depending on the
specific circumstances. Aim control measures at the weak link or links in the chain of
infection. One might aim control measures at the specific agent, source, or reservoir. For
example, an outbreak might be controlled by destroying contaminated foods, sterilizing
contaminated water, or destroying mosquito breeding sites, or an infectious food handler
could be removed from the job and treated (see discussion on epidemic management).
Managing Outbreak/epidemics
disease as well as the source of the outbreak. However, the actions can generally be
categorized as presented below to make the strategies easier to understand.
Humans as reservoir
Removal of the focus of infection – e.g., cholecystectomy in a chronic typhoid
carrier.
Isolation of infected persons. This is the separation of infected persons from non-
infected persons for the period of communicability. It is not suitable for the control of
diseases in which a large proportion of infections are inapparent or in which
maximal infectivity precedes overt illness.
Treatment to make them noninfectious – e.g., tuberculosis.
Disinfection of contaminated objects.
* Cholera, plague, and yellow fever are the three internationally quarantinable
diseases by international agreement.
* Now quarantine is replaced in some countries by active surveillance of the
individuals – maintaining close supervision over possible contacts of ill persons
to detect infection or illness promptly; their freedom of movement is not
restricted.
* Active immunization, when either the altered organism or its product is given to a
person to induce production of antibodies – EPI.
* Passive immunization has a lesser role in the control of communicable diseases than
active immunization:
Chemoprophylaxis:
Uncovering outbreaks
. Through timely analysis of routine surveillance data, which may reveal an increase in
reported cases or unusual clustering of cases.
. Reports from the community, either from the affected group or from concerned citizens.
In order to design and implement appropriate control measures, the extent of the
outbreak and the size and characteristics of the population at risk need to be
assessed.
Outbreaks are natural experiments waiting to be analyzed and exploited. They give a unique
opportunity to study the natural history of diseases. They may also help to assess the impact
of control measures and the usefulness of new epidemiologic and laboratory techniques.
3. Training opportunity
Public, political, or legal concerns sometimes override scientific concerns in the decision to
conduct an investigation. The calls from these parties usually have no scientific basis and
such investigations mostly do not identify a causal link between exposure and disease.
Nevertheless, health departments have to be responsive to public concerns, because doing so
at least provides an opportunity to educate the public.
We do not limit surveillance to diseases for which we have effective control measures.
Surveillance can be justified for two additional purposes: 1) to learn more about the
natural history, clinical spectrum, and epidemiology of a disease, and 2) to obtain
baseline data which we can use to assess the effectiveness of prevention and control
measures when they are developed and implemented.
Interpretation of surveillance data may also provide the basis for generating hypotheses,
stimulating community health research, and testing hypotheses regarding the impact of
exposures on disease occurrence. Archival surveillance data have also been used to
develop statistical models of diseases, such as to predict the feasibility of proposed
programs to eradicate measles and polio.
The following are some key sources of surveillance data, not all of which are available in
every country:
Census data
Mortality reports (birth and death certificates, autopsy reports)
Morbidity reports (notifiable disease reports)
Hospital data (discharge diagnoses, surgical logs, hospital infection reports)
Absenteeism records (school, workplace, compensation claims)
Epidemic reports
Laboratory test utilization and result reports
Drug utilization records
Adverse drug reaction reports
Special surveys (e.g., research data, serologic surveys)
Police records (especially for injury, alcohol-related crime)
Information on animal reservoirs and vectors (e.g., for rabies, plague, Lyme
disease)
Types of Surveillance
Passive surveillance is that in which health care providers send reports based on a known
set of rules and regulations.
Active surveillance is that in which public health officials contact providers to solicit
reports of events or diseases. Such active surveillance is usually limited to specific
diseases over a limited period of time, such as after a community exposure or during an
epidemic. Incomplete reporting, especially in passive surveillance systems, is very
common.
Sentinel surveillance uses a pre-arranged sample of reporting sources to report all cases
of one or more conditions. Usually the sample sources are selected to be those most likely
to see cases. Particularly in developing countries, sentinel surveillance provides a
practical alternative to population-based surveillance. Under this strategy, health officials
define homogeneous population subgroups and the regions to be sampled. They then
identify institutions that serve the population subgroups of interest, and that can and will
obtain data regarding the condition of interest.
Surveillance systems based on secondary data analysis can make productive use of data
sets collected for other purposes. Data collected for marketing surveys, patient
management records, police records, and other information sources can be exploited as
sources of surveillance data. Such data may be of lesser quality and timeliness than data
collected through systems designed specifically for surveillance.
As with all descriptive epidemiologic data, surveillance data are first analyzed in terms of
time, place, and person. Data are analyzed as rates rather than simply the numbers of
cases reported. When delays occur between diagnosis and reporting, we analyze data by
the date of onset, rather than the date of the report. A critical step before calculating rates
is the identification of the appropriate denominator. Simple tabular and graphic
techniques are used initially to display the data, although sophisticated techniques such as
cluster and time series analysis and computer mapping may also be used.
Surveillance data may be assessed for changes over time by comparing the number of
cases for the current period with the number reported for the same period in each of the
last three years. Secular trends, or long-term trends, are usually analyzed by graphing the
occurrence of disease by year. Any key events, such as initiation or cessation of a control
program, should be noted on the graph. Changes in the surveillance system (such as
changes in diagnostic criteria, reporting requirements, screening programs, or publicity
about the condition) which may influence the appearance of long-term trends should also
be indicated on the graph.
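The comparison of the current period with the same period in each of the last three years can be sketched as follows. Using the mean of the three earlier counts as the expected value is an assumption made for illustration, not a rule stated in these notes, and the function name and counts are hypothetical. Raw counts are used for brevity, which presumes a stable population; otherwise rates should be compared.

```python
from statistics import mean

def compare_with_previous_years(current_cases, previous_cases):
    """Return (expected, excess, ratio) using the mean of earlier years as expected."""
    expected = mean(previous_cases)
    excess = current_cases - expected
    # The ratio is undefined when no cases were seen in the earlier years.
    ratio = current_cases / expected if expected > 0 else None
    return expected, excess, ratio

# Hypothetical counts for the same four-week period in the last three years.
previous = [12, 15, 9]
expected, excess, ratio = compare_with_previous_years(30, previous)

print("expected:", expected)
print("excess:  ", excess)
print("ratio:   ", ratio)
```

Whether a given excess warrants investigation remains a judgment that depends on the disease and on local priorities and resources.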
The surveillance data should also be analyzed by place. Even when the secular trend
reveals no increases in overall incidence, analysis by place may reveal a geographic
cluster of cases, which deserves investigation. Analysing surveillance data by
person variables (age, sex, and behavioral risk factors) may also reveal
patterns or clues.
There is no single “threshold” above which disease patterns are different enough from the
expected to warrant further investigation. The excess necessary to trigger action may
depend on the priority assigned to the disease and the interests, capabilities and resources
of the ministry or agency. Public, political, or media attention and pressure, however, can
sometimes make it necessary to investigate minor variations in disease occurrence, which
might not otherwise be pursued.
Apparent increases should be treated as real until proven otherwise. However, other
causes of apparent increases should also be considered, including an increase in the
denominator population, improved detection, “batch” reporting, or other changes in the
system itself. Surveillance data should be disseminated to those who provide reports, and
those who need to know for administrative, program-planning, and decision-making
purposes. Newsletters and other reports of surveillance data can also help to maintain the
quality of a surveillance system by providing motivation for continued reporting by
health care providers. Like other epidemiologic data, surveillance data should be
“information for action”, collected only if it is functionally linked with community health
programs.
1) The importance to the public health of the health event under surveillance
a) incidence and prevalence
b) severity (case-fatality or death-to-case ratio)
c) mortality (overall and age-specific mortality rates, years of potential
life lost)
d) health care costs
e) potential for spread
f) preventability
Surveillance systems are never perfect. Understanding the limitations of surveillance data
is important to ensure correct interpretation. The most common limitations of
surveillance systems include:
1) Underreporting (for example, due to lack of knowledge of reporting requirements or
negative attitudes toward reporting)
2) Lack of representativeness of reported cases (such as due to a bias toward
reporting severe cases, or increased likelihood of reporting after publicity)
3) Lack of timeliness
4) Inconsistency of case-definitions
These limitations suggest specific steps, which may be taken to improve a surveillance
system. Most commonly, surveillance systems are strengthened by improving awareness
of practitioners, simplification of the process of reporting, frequent feedback to those
reporting, widening the “net” (for example, obtaining reports from laboratories or
schools, rather than relying on physicians), and using active (rather than passive)
surveillance. Remember to “share the data, share the responsibility, share the credit”.
Important Points
Factors related with the selection of disease for surveillance:
Magnitude of the disease
Feasibility of control measures
Need for monitoring and evaluating the performance of a control program
Resource availability
Activities in surveillance:
Data collection and recording
Reporting and notification
Compilation, data analysis, and interpretation
Dissemination of findings for action
Conditions in which active surveillance is appropriate:
For periodic evaluation of ongoing programs
E.g. HIV/AIDS, EPI…
For programs which have time limit of operation
E.g. Smallpox
With the occurrence of unusual situations:
When a new disease/event is discovered
When investigating a new mode of transmission
When a high-risk period is recognized
When a disease appears in a new geographic area or is found to affect a
new subgroup of the population
When a previously eradicated disease reappears or a low-incidence
disease occurs at a higher level of endemicity
Features of good surveillance system
Uses a combination of passive and active mechanisms to collect data.
. Emphasize the collection of minimum data in the simplest possible
way.
. To assure quality and enhance compliance make sure that the data
collected is useful for the workers who collect the data.
Timely reporting.
Timely and comprehensive action.
Action must be targeted towards both case detection and treatment, as well as towards the
control of the disease.
Strong laboratory services for accurate diagnosis.
X. SCREENING
Although the principles of interpretation of diagnostic and screening tests are classically
applied to tests done in a laboratory, these principles apply equally well to information
obtained from other clinical assessments (such as history or physical examination) as well
as any indicator or indirect measure used in science. Most ordinal scales for variables are
used to simplify the interpretation and use of data. Data are also often expressed as
simple dichotomies (e.g., exposed/ unexposed, ill/well), although realities are rarely so
simple.
Test results are likewise often reported as simply normal or abnormal, although some
may be more or less abnormal (or normal) than others. The following table summarizes
the four possible relationships between a diagnostic test and the actual presence of
disease:
                        DISEASE STATUS
TEST RESULT        PRESENT                ABSENT
Positive           a (true positive)      b (false positive)
Negative           c (false negative)     d (true negative)
Sensitivity is defined as the proportion of people with a disease who have a positive test
for the disease (a/(a + c)).
Specificity is defined as the proportion of people without the disease who have a negative
test (d/(b + d)).
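The two definitions can be checked with a short calculation. The sketch assumes the usual cell labels: a = test positive with disease, b = test positive without disease, c = test negative with disease, d = test negative without disease. The function names and the counts are hypothetical.

```python
def sensitivity(a, c):
    """Proportion of people with the disease who test positive: a / (a + c)."""
    return a / (a + c)

def specificity(b, d):
    """Proportion of people without the disease who test negative: d / (b + d)."""
    return d / (b + d)

# Hypothetical evaluation: 100 people with the disease, 900 without.
a, b, c, d = 90, 50, 10, 850

print("sensitivity:", round(sensitivity(a, c), 3))
print("specificity:", round(specificity(b, d), 3))
```

Sensitivity uses only the people with the disease and specificity only those without it, so neither calculation involves how common the disease is in the group tested.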
A sensitive test is preferable when there is an important penalty for failing to detect a
disease (e.g., when trying to detect a dangerous but treatable condition). Sensitive tests
are also used when the probability of disease is relatively low and the purpose of the
test is to discover possible cases. A sensitive test is, therefore, most helpful when the test
result is negative. Specific tests, on the other hand, are most useful when the test result is
positive, and are often used to confirm a diagnosis which has been suggested by other
data. A highly specific test is preferable when false positive results might have negative
(physical, emotional, or financial) consequences.
Case definitions used in epidemiology may also be characterized by their sensitivity and
specificity. For rare but potentially severe communicable diseases, where it is important
to identify every possible case, health officials use a sensitive or “loose” case definition.
On the other hand, investigators of the causes of a disease outbreak want to be certain
that any person included in the investigation really had the disease. In this case, the
investigator prefers a specific or “strict” case definition.
In theory, the sensitivity and specificity of a test are independent of the prevalence of the
condition being detected. In practice, however, several characteristics of cases (such as
the stage and severity of the disease) may be related to both the sensitivity and specificity
of the test and to the prevalence of the disease, since different kinds of cases are found in
high- and low-prevalence situations. Tests are often assessed to be more valuable than
they actually are, since a positive test result may prompt the health care provider to
continue pursuing a diagnosis, while a negative result may cause a clinician to abandon
further testing.
When assessing the implications of a positive or negative test, the sensitivity and
specificity (which are more useful in deciding whether to perform the test) are no longer
of primary importance.
Positive predictive value (or predictive value positive) (+PV = a/(a + b)) is the probability
of disease in a person with a positive (abnormal) test result.
Negative predictive value (or predictive value negative) (–PV = d/(c + d)) is the probability
of not having the disease when the test result is negative (normal). Predictive value is
sometimes called posterior or post-test probability.
The predictive value of a test is not a property of the test alone. It is determined by the
sensitivity and specificity of the test and the prevalence of the disease in the population
being tested. Positive results, even for a very specific test, when applied to a population
with a low likelihood of disease, will be largely false positives. Similarly, negative
results, even for a very sensitive test, will be largely false negatives when the test is
performed in a population with a high chance of having the disease.
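The dependence of predictive value on prevalence can be illustrated numerically. The sketch below derives both predictive values from sensitivity, specificity, and prevalence by computing the expected proportion of the tested population in each cell of the 2x2 table. The function name and the test characteristics (95% sensitivity and specificity) are hypothetical.

```python
def predictive_values(sens, spec, prevalence):
    """Return (positive predictive value, negative predictive value)."""
    true_pos = sens * prevalence
    false_neg = (1 - sens) * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    true_neg = spec * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Same hypothetical test applied to populations with different prevalence.
for prev in (0.001, 0.01, 0.1, 0.5, 0.9):
    ppv, npv = predictive_values(0.95, 0.95, prev)
    print(f"prevalence {prev:5.3f}   +PV {ppv:.3f}   -PV {npv:.3f}")
```

With these made-up figures, most positive results at the lowest prevalence are false positives, while the negative predictive value falls as prevalence rises.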
The criteria for a successful screening program were first summarized in a WHO
publication in 1968. They can be broadened to screening for problems other than human
disease:
1. The problem to be detected should be important enough to be worth detecting.
2. There should be an acceptable intervention, which is effective.
3. The intervention should be feasible and available.
4. There should be a recognizable latent or early “asymptomatic” stage.
5. There should be a suitable test.
6. The test should be acceptable to the population to be tested.
7. The natural history of the condition should be adequately understood.
8. There should be an agreed policy regarding when the intervention is
appropriate.
9. The cost of detecting the problem and its remedy should be reasonable.
10. The screening program should be ongoing, and not a “one-time” effort.
ethical guidelines for epidemiologic studies has also recently been accentuated by the
complex issues raised by research regarding HIV infections and AIDS.
Ethical issues often arise as a result of conflict among competing sets of values. Many
situations require careful discussion and informed judgements on the part of
investigators, ethical review committees, administrators, health care practitioners, policy-
makers, and community representatives. Externally sponsored epidemiological studies in
developing countries merit special attention in ethical review.
The purpose of ethical review is to consider the features of a proposed study in light of
ethical principles, so as to ensure that investigators have anticipated and satisfactorily
resolved possible ethical objections, and to assess their response to ethical issues raised
by the study. Not all ethical principles weigh equally. A study may be assessed as ethical
even if a usual ethical expectation, such as confidentiality of data, has not been
comprehensively met, provided the potential benefits clearly outweigh the risks and the
investigators give assurances of minimizing risks. It may even be unethical to reject such
a study, if its rejection would deny a community the benefits it offers. The challenge of
ethical review is to take into account potential risks and benefits, and to reach decisions
which best reflect the consensus of the review committee. Different conclusions may
result from different ethical reviews of the same issue or proposal, and each conclusion
may be ethically reached, given varying circumstances of place and time; a conclusion is
ethical not merely because of what has been decided, but also owing to the process of
conscientious reflection and assessment by which it has been reached.
General ethical principles may be applied at the individual and community levels. At the
level of the individual (micro ethics), ethics governs how one person should relate to
another and the moral claims of each member of a community. At the level of the
community, ethics applies to how one community relates to another, and to how a
community treats each of its members (including prospective members) and members of
other groups with different cultural values (macro ethics). Procedures that are unethical at
one level cannot be justified merely because they are considered ethically acceptable at
the other.
All research involving human subjects should be conducted in accordance with four basic
ethical principles: 1) respect for persons, 2) beneficence, 3) non-maleficence, and 4)
justice.
Respect for persons incorporates at least two other fundamental ethical principles,
namely:
a) autonomy, which requires that those who are capable of deliberation about
their personal goals should be treated with respect for their capacity for self-
determination; and
b) protection of persons with impaired or diminished autonomy, which requires
that those who are dependent or vulnerable be afforded security against harm
or abuse.
Justice requires that cases considered to be alike be treated alike, and that cases
considered to be different be treated in ways that acknowledge the difference. When the
principle of justice is applied to dependent or vulnerable subjects, its main concern is
with the rules of distributive justice. Studies should be designed to obtain knowledge that
benefits the class of persons of which the subjects are representative. The class of persons
bearing the burden should receive an appropriate benefit, and the class primarily intended
to benefit should bear a fair proportion of the risks and burdens of the study.
1. Informed Consent
When individuals are the subject of epidemiologic studies, their individual informed
consent will usually be sought. Consent is informed when it is given by a person who
understands the purpose and nature of the study, what participation in the study requires
the person to do and to risk, and what benefits are intended to result from the study. An
investigator who proposes not to seek informed consent has the obligation to explain how
the study would be ethical in its absence (such as because informed subjects might alter
the behaviour under study or feel needlessly anxious, or because it is public knowledge
that personal data is made available for epidemiologic studies). Consent is not required
for use of publicly available information, although countries and communities differ with
regard to the definition of what information about citizens is regarded as public.
Investigators must provide assurances that strict safeguards will be maintained to protect
confidentiality by minimizing disclosure of personally sensitive information.
When it is not possible to obtain informed consent from every individual to be studied,
community agreement through a representative of a community or group may be sought,
but the representative should be chosen according to the nature, traditions and political
philosophy of the community or group. Approval given by a community representative
should be consistent with general ethical principles. Even if a leader expresses agreement
on behalf of a community, the refusal of individuals to participate has to be respected.
Representatives of a community or group may sometimes be invited to participate in the
design of a study and in its ethical assessment.
Selective disclosure may be used in epidemiologic research, provided that it does not
induce subjects to do what they would not otherwise consent to do. For certain
epidemiologic studies, such non-disclosure is permissible, even essential, so as not to
influence the spontaneous conduct under investigation, and to avoid obtaining responses
that the respondent might give in order to please the questioner.
Prospective subjects may not feel free to refuse requests from those who have power or
undue influence over them. It is ethically questionable whether subjects should be
recruited from among groups that are unduly influenced by persons in authority if the
study can be conducted with subjects who are not in this category.
2. Maximizing Benefit
Part of the benefit that communities, groups, and individuals may reasonably expect from
participating in studies is that they will be told of findings that pertain to their health. A
strategy for communication of study results to policy-makers and to participating
individuals and communities (with due consideration of levels of literacy and
comprehension) should be included in the study protocol. When findings indicate a need
for health care, those concerned should be appropriately advised and arrangements should
be made for treatment or referral. Health professionals have an obligation to advocate
release of study results that is in the public interest. Training of local health personnel in
skills and techniques that can be used to improve health services or research may also be
an important way of ensuring that communities will benefit from the proposed research.
3. Minimizing Harm
health care priorities may constitute harm. Ethical review should also assess the risk of
subjects or groups suffering stigmatization, prejudice, loss of prestige or self-esteem (e.g.
due to being identified as HIV-positive), or economic loss as a result of taking part in a
study. Investigators must be able to demonstrate that benefits outweigh the risks for both
individuals and groups. When a healthy person is a member of a population or sub-group
at increased risk and engages in high-risk activities, it is unethical not to propose
measures for protecting the population or sub-group.
4. Confidentiality
Research may involve collecting and storing data relating to individuals and groups, and
such data, if disclosed to third parties, may cause harm or distress. Consequently,
investigators should make arrangements for protecting the confidentiality of such data by,
for example, omitting information that might lead to identification of individual subjects,
or limiting access to the data, or by other means. When personal identifiers remain on
records used for a study, investigators should explain why this is necessary and how
confidentiality will be protected.
Unlinked information is that which cannot be linked, associated or connected with the
person to whom it refers. As this person is not known to the investigator, confidentiality
is not at stake and the question of consent does not arise. Linked information may be 1)
anonymous (when the information cannot be linked to the person to whom it refers
except by a code or other means known only to that person, and the investigator cannot
know the identity of the person), 2) non-nominal (when the information can be linked to
the person by a code which is not a personal identifier and which is known to the person
and the investigator), and 3) nominal or nominative (when the information is linked to the
person by means of personal identification, usually the name).
5. Conflict of Interest
It is an ethical rule that investigators should have no undisclosed conflict of interest with
their study collaborators, sponsors, or subjects. Conflict can arise when a commercial or
other sponsor may wish to use study results to promote a product or service, or when it
may not be politically convenient to disclose findings. Honesty and impartiality are
essential in designing and conducting studies, and presenting and interpreting findings.
Data must not be withheld, misrepresented, or manipulated.