StatisticsLecture2
What percentage of voters approve of the way the U.S. President is handling his
job? This is difficult to determine exactly as there are more than 250 million people
of voting age in the U.S. But it’s not difficult to estimate this percentage quite well.
Sample 1,000 (say) voters at random. Then use the approval percentage among
those voters as an estimate for the approval percentage of all voters.
Even a relatively small random sample (100 or 1,000) will produce an estimate that is
close to the parameter, i.e. the true value in the very large population of 250 million
subjects. This is why statistics is so powerful.
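The polling idea above can be tried out in a small simulation. This is only a sketch: the true approval rate of 45%, the function name estimate_approval, and the seed are made-up assumptions for illustration, not figures from the lecture.

```python
import random

def estimate_approval(population_size, true_rate, sample_size, seed=0):
    """Estimate an approval rate from a simple random sample.

    The population is hypothetical: voters are numbered 0..population_size-1
    and the first `approvers` of them are taken to approve.
    """
    rng = random.Random(seed)
    approvers = round(population_size * true_rate)
    # Draw distinct voters at random (sampling without replacement).
    sample = rng.sample(range(population_size), sample_size)
    hits = sum(1 for voter in sample if voter < approvers)
    return hits / sample_size

# Assumed values: 250 million voters, true approval rate 45%.
estimate = estimate_approval(250_000_000, 0.45, 1_000)
print(f"estimate from 1,000 voters: {estimate:.3f} (assumed true value: 0.450)")
```

With 1,000 sampled voters the estimate typically lands within a few percentage points of the assumed true value, even though only a tiny fraction of the population was asked.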
Voluntary response bias: websites that post reviews of businesses are more likely to
get responses from customers who had very bad or very good experiences.
Sampling designs
2. A stratified random sample divides the population into groups of similar subjects
called strata (e.g. urban, suburban, and rural voters). Then one chooses a simple
random sample in each stratum and combines these samples.
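A stratified random sample could be drawn along the following lines. This is a minimal sketch: the stratum sizes, the voter labels, and the choice to sample the same fraction from every stratum (proportional allocation) are assumptions made for illustration; the lecture does not prescribe how many subjects to take per stratum.

```python
import random

def stratified_sample(strata, fraction, seed=0):
    """Draw a simple random sample from each stratum and combine them.

    `strata` maps a stratum name to a list of its members. Sampling the same
    fraction from each stratum is an assumption of this sketch.
    """
    rng = random.Random(seed)
    combined = []
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))
        combined.extend(rng.sample(members, k))  # simple random sample in this stratum
    return combined

# Hypothetical population of 1,000 voters in three strata.
strata = {
    "urban": [f"urban-{i}" for i in range(500)],
    "suburban": [f"suburban-{i}" for i in range(300)],
    "rural": [f"rural-{i}" for i in range(200)],
}
sample = stratified_sample(strata, fraction=0.1)
print(len(sample), "voters sampled")
```

Because each stratum is sampled separately, every stratum is guaranteed to be represented in the combined sample.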
Since the sample is drawn at random, the estimate will be different from the
parameter due to chance error. Drawing another sample will result in a different
chance error.
The chance error (sampling error) will get smaller as the sample size gets bigger.
Moreover, we can compute how large the chance error will be. This is not the case
for the bias (systematic error): increasing the sample size just repeats the error on a
larger scale, and typically we don’t know how large the bias is.
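The contrast between chance error and bias can be illustrated with a simulation. This is a sketch under stated assumptions: the true rate of 0.5, the biased sampling process that reaches approvers 60% of the time, the sample sizes, and the helper name mean_abs_error are all made up. For simplicity each sampled voter is drawn independently, which approximates sampling from a very large population.

```python
import random

def mean_abs_error(true_rate, sampled_rate, n, repeats=200, seed=0):
    """Average distance between the estimate and the true rate over many samples.

    `sampled_rate` is the chance that a sampled voter approves. It equals
    `true_rate` for an unbiased sample and differs from it for a biased one.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(repeats):
        hits = sum(rng.random() < sampled_rate for _ in range(n))
        total += abs(hits / n - true_rate)
    return total / repeats

TRUE_RATE = 0.5  # assumed true approval rate
for n in (100, 2500):
    unbiased = mean_abs_error(TRUE_RATE, 0.5, n)
    biased = mean_abs_error(TRUE_RATE, 0.6, n)  # assumed systematic error of 0.1
    print(f"n={n}: unbiased error {unbiased:.3f}, biased error {biased:.3f}")
```

In the unbiased case the average error shrinks as n grows. In the biased case it stays near the assumed systematic error of 0.1 no matter how large the sample is.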
Observational Studies
People who eat red meat have higher rates of certain cancers than people who
don’t eat red meat. This means that there is an association between red meat
consumption and cancer: there is a link between these two.
But this does not mean that eating red meat causes cancer: people who don’t eat
red meat are known to exercise more and drink less alcohol, and it could be these
two factors, rather than red meat itself, that cause the difference in cancer rates.
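A simulation can show how such an association arises without any causal effect. The sketch below is purely hypothetical: all probabilities are invented, "healthy" stands in for exercising more and drinking less, and by construction red meat has no effect on cancer in this model.

```python
import random

def simulate(n=20_000, seed=0):
    """Return the cancer rate among meat eaters (True) and non-eaters (False).

    In this made-up model, cancer risk depends only on lifestyle, never on meat.
    """
    rng = random.Random(seed)
    counts = {True: [0, 0], False: [0, 0]}  # eats_meat -> [people, cancer cases]
    for _ in range(n):
        healthy = rng.random() < 0.5
        # Healthy people are assumed to eat red meat less often.
        eats_meat = rng.random() < (0.3 if healthy else 0.8)
        # Cancer risk is driven by lifestyle alone.
        cancer = rng.random() < (0.05 if healthy else 0.15)
        counts[eats_meat][0] += 1
        counts[eats_meat][1] += cancer
    return {group: cases / people for group, (people, cases) in counts.items()}

rates = simulate()
print(f"cancer rate, meat eaters: {rates[True]:.3f}")
print(f"cancer rate, non-eaters:  {rates[False]:.3f}")
```

Meat eaters show a higher cancer rate in the output even though meat plays no causal role in the model: the difference comes entirely from lifestyle, which is linked to both meat consumption and cancer.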
A treatment (e.g. eating red meat) is assigned to people in the treatment group but
not to people in the control group. Then the outcomes in the two groups are
compared. To rule out confounders, both groups should be similar, apart from the
treatment. To this end:
• The subjects are assigned into treatment and control groups at random.
• When possible, subjects in the control group get a placebo: it resembles the
treatment but is neutral. Assigning a placebo makes sure that both groups
are equally affected by the placebo effect: the idea of being treated may have
an effect by itself.
• The experiment is double-blind: neither the subjects nor the evaluators know
the assignments to treatment and control.
The placebo effect is still not fully understood and is one of the most interesting
phenomena in science. See ‘The weird power of the placebo effect, explained’ by Brian
Resnick (7/7/2017).
Randomization makes the treatment group similar to the control group. Therefore,
influences other than the treatment operate equally on both groups, apart from
differences due to chance.
Randomization also allows us to assess how relevant the treatment effect is, by
calculating the size of chance effects when comparing the outcomes in the two groups
(see later).
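Random assignment to treatment and control groups can be sketched as follows. The subject labels, the equal split, and the function name assign_groups are assumptions for illustration only.

```python
import random

def assign_groups(subjects, seed=0):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy, so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical subjects.
subjects = [f"subject-{i}" for i in range(20)]
treatment, control = assign_groups(subjects)
print("treatment:", treatment)
print("control:  ", control)
```

Because chance alone decides who lands in which group, neither the subjects nor the experimenters can steer particular kinds of people into the treatment group.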