StatisticsLecture2

This document introduces statistical inference, explaining key concepts such as population, parameter, sample, and statistic, and emphasizes the importance of proper sampling methods to avoid biases. It discusses observational studies and the distinction between correlation and causation, highlighting the need for randomized controlled experiments to establish causation. The document also covers the placebo effect and the role of randomization in ensuring the validity of experimental results.

Uploaded by

thekonan726
Copyright
© All Rights Reserved

Introduction to Statistics | Lecture 2

What is statistical inference?

What percentage of voters approve of the way the U.S. President is handling his
job? This is difficult to determine exactly as there are more than 250 million people
of voting age in the U.S. But it’s not difficult to estimate this percentage quite well.
Sample 1,000 (say) voters at random. Then use the approval percentage among
those voters as an estimate for the approval percentage of all voters.

1. Population: the entire group of subjects about which we want information

all U.S. voters

2. Parameter: the quantity about the population we are interested in

approval percentage among all U.S. voters

3. Sample: the part of the population from which we collect information

the 1,000 voters selected at random

4. Statistic (estimate): the quantity we are interested in as measured in the
sample

approval percentage among the sampled voters

Even a relatively small random sample (100 or 1,000) will typically produce an
estimate that is close to the parameter of a very large population of 250 million
subjects. This is the reason why statistics is so powerful.
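
The claim above can be checked with a small simulation. This is only a sketch: the true approval rate of 45% is a made-up value, and the function name estimate_approval is hypothetical. Each sampled voter is modelled as an independent draw, which is a reasonable approximation when the population is far larger than the sample.

```python
import random

def estimate_approval(true_rate, sample_size, seed=None):
    """Poll `sample_size` randomly chosen voters and return the approval fraction."""
    rng = random.Random(seed)
    # Each sampled voter approves with probability `true_rate`.
    approvals = sum(rng.random() < true_rate for _ in range(sample_size))
    return approvals / sample_size

TRUE_RATE = 0.45  # hypothetical parameter; unknown in a real poll

for n in (100, 1_000, 10_000):
    estimate = estimate_approval(TRUE_RATE, n, seed=n)
    print(f"n = {n:>6}: estimate = {estimate:.3f}, error = {estimate - TRUE_RATE:+.3f}")
```

Note that the population size never enters the calculation: the quality of the estimate depends on the sample size, not on how many voters there are in total.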

Sampling correctly is very important

It’s tempting to sample 1,000 voters in your hometown. This is a sample of
convenience. This is not a good way to sample because the voters in your hometown
are likely to differ from the population of U.S. voters. This will introduce bias,
i.e. this sampling will systematically favour a certain outcome.

selection bias: a sample of convenience makes it more likely to sample certain
subjects than others

non-response bias: parents are less likely to answer a survey request at 6 pm
because they are busy with children and dinner

voluntary response bias: websites that post reviews of businesses are more likely to
get responses from customers who had very bad or very good experiences

Sampling designs

The best methods for sampling use chance in a planned way:

1. Simple random sample selects subjects at random without replacement

2. Stratified random sample divides the population into groups of similar subjects
called strata (e.g. urban, suburban, and rural voters). Then one chooses a simple
random sample in each stratum and combines these.
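
The two designs above can be sketched as follows. The function names and the toy population are hypothetical, and drawing the same fraction from every stratum (proportional allocation) is an assumption; the lecture does not say how many subjects to take from each stratum.

```python
import random

def simple_random_sample(population, n, rng):
    """Select n subjects at random without replacement."""
    return rng.sample(population, n)

def stratified_sample(strata, fraction, rng):
    """Draw a simple random sample of the same fraction from each stratum and combine them."""
    combined = []
    for members in strata.values():
        k = round(len(members) * fraction)  # assumed: proportional allocation
        combined.extend(rng.sample(members, k))
    return combined

# Toy population: each voter is a (stratum, id) pair.
strata = {
    "urban": [("urban", i) for i in range(500)],
    "suburban": [("suburban", i) for i in range(300)],
    "rural": [("rural", i) for i in range(200)],
}
population = [voter for members in strata.values() for voter in members]

rng = random.Random(0)
srs = simple_random_sample(population, 100, rng)
stratified = stratified_sample(strata, 0.10, rng)

for name in strata:
    in_srs = sum(1 for group, _ in srs if group == name)
    in_strat = sum(1 for group, _ in stratified if group == name)
    print(f"{name:>8}: simple random = {in_srs}, stratified = {in_strat}")
```

In the simple random sample the number of voters from each stratum varies by chance, whereas the stratified sample fixes it in advance.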

Bias and chance error

Since the sample is drawn at random, the estimate will be different from the
parameter due to chance error. Drawing another sample will result in a different
chance error.

Estimate = parameter + bias + chance error

The chance error (sampling error) will get smaller as the sample size gets bigger.
Moreover, we can compute how large the chance error will be. This is not the case
for the bias (systematic error): Increasing the sample size just repeats the error on a
larger scale, and typically we don’t know how large the bias is.
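
A simulation can illustrate the difference between the two kinds of error. This is a sketch under stated assumptions: the approval rates (50% overall, 60% in the hometown) are made up, the helper names are hypothetical, and the formula sqrt(p(1 - p)/n) for the typical chance error of a sample proportion is a standard result that is not stated in this lecture.

```python
import math
import random
import statistics

def standard_error(p, n):
    """Typical size of the chance error of a sample proportion."""
    return math.sqrt(p * (1 - p) / n)

def repeated_estimates(sampled_rate, n, trials, rng):
    """Draw `trials` samples of size n from a group with approval rate `sampled_rate`."""
    return [
        sum(rng.random() < sampled_rate for _ in range(n)) / n
        for _ in range(trials)
    ]

PARAMETER = 0.50      # hypothetical approval rate among all voters
HOMETOWN_RATE = 0.60  # hypothetical approval rate in the hometown (bias = +0.10)

rng = random.Random(0)
for n in (100, 1_000, 10_000):
    estimates = repeated_estimates(HOMETOWN_RATE, n, trials=100, rng=rng)
    average_error = statistics.mean(estimates) - PARAMETER  # dominated by the bias
    spread = statistics.stdev(estimates)                    # size of the chance error
    print(f"n = {n:>6}: average error = {average_error:+.3f}, "
          f"spread = {spread:.3f}, formula = {standard_error(HOMETOWN_RATE, n):.3f}")
```

As the sample size grows, the spread of the estimates shrinks, but their average stays about 0.10 away from the parameter: a larger sample of convenience does not remove the bias.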

Observational Studies

People who eat red meat have higher rates of certain cancers than people who
don’t eat red meat. This means that there is an association between red meat
consumption and cancer: there is a link between these two.

But this does not mean that eating red meat causes cancer: people who don’t eat
red meat are known to exercise more and drink less alcohol, and it could be the
latter two factors that cause the difference in cancer rates.

This is an observational study: It measures outcomes of interest and this can be
used to establish association. But association is not causation, because there may
be confounding factors such as exercise that are associated both with red meat
consumption and cancer.
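
The following sketch shows how a confounder alone can produce an association. All rates are invented for illustration and say nothing about real diets or diseases. By construction, eating meat has no effect on the disease in this simulation; only exercise does, and exercise is assumed to be less common among meat eaters.

```python
import random

def simulate_person(rng):
    """Return (eats_meat, exercises, sick) for one simulated person."""
    eats_meat = rng.random() < 0.5
    # Confounder: exercise is less common among meat eaters (hypothetical rates).
    exercises = rng.random() < (0.3 if eats_meat else 0.7)
    # The disease risk depends ONLY on exercise, not on meat.
    sick = rng.random() < (0.05 if exercises else 0.15)
    return eats_meat, exercises, sick

def disease_rate(people, eats_meat, exercises=None):
    """Disease rate among meat eaters or non-eaters, optionally within one exercise group."""
    group = [
        sick for meat, active, sick in people
        if meat == eats_meat and (exercises is None or active == exercises)
    ]
    return sum(group) / len(group)

rng = random.Random(0)
people = [simulate_person(rng) for _ in range(20_000)]

print(f"all subjects:    meat = {disease_rate(people, True):.3f}, "
      f"no meat = {disease_rate(people, False):.3f}")
print(f"exercisers only: meat = {disease_rate(people, True, exercises=True):.3f}, "
      f"no meat = {disease_rate(people, False, exercises=True):.3f}")
```

Over all subjects the meat eaters show a higher disease rate, yet among people with the same exercise habits the two rates are about equal. An observational study that does not record exercise could not tell these situations apart.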

Randomized controlled experiments

To establish causation, an experiment is required:

A treatment (e.g. eating red meat) is assigned to people in the treatment group but
not to people in the control group. Then the outcomes in the two groups are
compared. To rule out confounders, both groups should be similar, apart from the
treatment. To this end:

• The subjects are assigned into treatment and control groups at random.
• When possible, subjects in the control group get a placebo: it resembles the
treatment but is neutral. Assigning a placebo makes sure that both groups
are equally affected by the placebo effect: the idea of being treated may have
an effect by itself.
• The experiment is double-blind: neither the subjects nor the evaluators know
the assignments to treatment and control.
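
Random assignment itself is simple to carry out. A minimal sketch, assuming equal group sizes and hypothetical subject labels (the lecture does not prescribe either):

```python
import random

def assign_groups(subjects, rng):
    """Shuffle the subjects and split them into treatment and control groups."""
    shuffled = list(subjects)  # copy, so the original order is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

subjects = [f"subject_{i:02d}" for i in range(1, 21)]  # hypothetical labels
treatment, control = assign_groups(subjects, random.Random(42))

print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```

Because chance alone decides who lands in which group, neither the subjects nor the investigators can influence the assignment.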

The placebo effect

The placebo effect is still not fully understood and is one of the most interesting
phenomena in science. For an overview, see ‘The weird power of the placebo effect,
explained’ by Brian Resnick (7/7/2017).

The logic of randomized controlled experiments

Randomization serves two purposes:

It makes the treatment group similar to the control group. Therefore, influences
other than the treatment operate equally on both groups, apart from differences due
to chance.

It allows us to assess how relevant the treatment effect is, by calculating the size
of chance effects when comparing the outcomes in the two groups (see later).
