Statistics 17 18

Statistics is the study of collecting and analyzing numerical data to make inferences about populations. It involves analyzing data using probability and distributions. Key terms include population, sample, discrete and continuous variables, bias, mean, median, mode, standard deviation, normal distribution, standard error, and hypothesis testing. Hypothesis testing uses probability to determine if differences between sample means are statistically significant or likely due to chance.

From BSCS: Interaction of Experiments and Ideas, 2nd Edition. Prentice Hall, 1970 and Statistics for the Utterly Confused by Lloyd Jaisingh, McGraw-Hill, 2000
What is statistics?
• a branch of mathematics that provides techniques to
analyze whether or not your data is significant
(meaningful)
• Statistical applications are based on probability
statements
• Nothing is “proved” with statistics
• Statistics are reported
• Statistics report the probability that similar results
would occur if you repeated the experiment
Statistics deals with numbers
Need to know nature of numbers collected
Continuous variables: type of numbers associated with
measuring or weighing; any value in a continuous
interval of measurement.
• Examples: weight of students, height of plants, time to flowering
Discrete variables: type of numbers that are counted or
categorical
• Examples: numbers of boys, girls, insects, plants
Can you figure out…
Which type of numbers (discrete or continuous?)
Numbers of persons preferring Brand X in 5 different
towns
The weights of high school seniors
The lengths of oak leaves
The number of seeds germinating
35 tall and 12 dwarf pea plants
Answers: all are discrete except the 2nd and 3rd examples,
which are continuous.
Populations and Samples
Population includes all members of a group
• Example: all 9th grade students in America
• Number of 9th grade students at CVHS
• No absolute number
Sample
• Used to make inferences about large populations
• Samples are a selection of the population
• Example: 6th period Accelerated Biology
Why the need for statistics?
• Statistics describe samples, which serve as estimators
of the corresponding population
• Many times, finding complete information about a population
is costly and time consuming. We can use samples to represent
a population
Sample Populations: Avoiding Bias
Individuals in a sample population
Must be a fair representation of the entire population.
Therefore sample members must be randomly selected
(to avoid bias)
Example: if you were looking at strength in students:
picking students from the football team would NOT be
random
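Random selection can be sketched in a few lines of Python. This is only an illustration: the roster of 500 student ID numbers and the sample size of 20 are made-up values, and the fixed seed is there only so the example gives the same result each time it runs.

```python
import random

# Hypothetical roster: ID numbers 1..500 stand in for the whole population.
population = list(range(1, 501))

# Fixed seed only so the example is repeatable; a real study would not need it.
rng = random.Random(42)

# Draw 20 students without replacement; every student has the same chance.
sample = rng.sample(population, 20)

print(sorted(sample))
```

Because every ID has an equal chance of being drawn, the football team is no more likely to be picked than anyone else.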
Is there bias?
A cage has 1000 rats, you pick the first 20 you can catch for
your experiment
A public opinion poll is conducted using the telephone
directory
You are conducting a study of a new diabetes drug; you
advertise for participants in the newspaper and TV
All are biased. Rats: you grab the slower rats. Telephone:
you call only people with a phone (wealth?) and people
who are listed (responsible?). Newspaper/TV: you reach
only people with a newspaper (wealth/education?) and
a TV (wealth?).
Statistical Computations (the Math)
• If you are using a sample population
– Arithmetic Mean (average): the sum of all the scores divided by the total number of scores.
– The mean estimates the center of the population. When the data are distributed symmetrically, about ½ the members of the population fall on either side of the mean.
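As a small worked example of the arithmetic mean, the sketch below adds up five plant heights and divides by the number of plants. The heights are invented for illustration and do not come from the slides.

```python
# Made-up plant heights in cm, for illustration only.
heights_cm = [3.2, 4.1, 2.8, 3.9, 3.5]

# Arithmetic mean: sum of all the scores divided by the number of scores.
mean_height = sum(heights_cm) / len(heights_cm)

print("mean height (cm):", round(mean_height, 2))
```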

http://en.wikipedia.org/wiki/Table_of_mathematical_symbols
Looking at the profile of data: Distribution
What is the frequency distribution? Where are the
data points?
Distribution Chart of Heights of 100 Control Plants

Class (height of plants, cm)    Number of plants in each class
0.0-0.9                          3
1.0-1.9                         10
2.0-2.9                         21
3.0-3.9                         30
4.0-4.9                         20
5.0-5.9                         14
6.0-6.9                          2
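The chart can be turned into a rough text histogram and an approximate mean. This sketch uses the class limits and counts from the chart; treating every plant as if it sat at the midpoint of its class is an assumption made here, so the mean is only an estimate.

```python
# Class limits (cm) and counts copied from the distribution chart.
classes = [(0.0, 0.9), (1.0, 1.9), (2.0, 2.9), (3.0, 3.9),
           (4.0, 4.9), (5.0, 5.9), (6.0, 6.9)]
counts = [3, 10, 21, 30, 20, 14, 2]

total = sum(counts)

# Assumption: each plant is treated as if it were at its class midpoint.
midpoints = [(low + high) / 2 for low, high in classes]
grouped_mean = sum(m * c for m, c in zip(midpoints, counts)) / total

# One '#' per plant gives a sideways histogram of the frequency distribution.
for (low, high), count in zip(classes, counts):
    print(f"{low:.1f}-{high:.1f} cm | {'#' * count} {count}")

print("plants:", total)
print("approximate mean height (cm):", round(grouped_mean, 2))
```

The rows of `#` marks rise to a peak in the 3.0-3.9 class and fall away on both sides, which is the profile a histogram of these data would show.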
Histogram: Frequency Distribution Charts

This is called a “normal” curve or a bell curve

This “idealized” curve is theoretical: it represents an infinite number of measurements and is estimated from a sample
Mode and Median
Mode: most frequently seen value (if no numbers
repeat then there is no mode)
Median: the middle number
If you have an odd number of data then the median is
the value in the middle of the set
If you have an even number of data then the median is
the average between the two middle values in the set.
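The rules for the median and mode can be checked with Python's standard `statistics` module (`multimode` needs Python 3.8 or later). The data sets are invented for illustration. When no value repeats, `multimode` returns every value because they all tie, which is another way of saying the set has no single mode.

```python
import statistics

odd_set = [2, 3, 3, 5, 7]   # odd number of values
even_set = [2, 3, 5, 7]     # even number of values
no_repeats = [1, 2, 3]      # no value occurs more than once

# Odd count: the median is the middle value of the sorted set.
median_odd = statistics.median(odd_set)

# Even count: the median is the average of the two middle values.
median_even = statistics.median(even_set)

# multimode lists every value that ties for most frequent.
modes = statistics.multimode(odd_set)
modes_no_repeats = statistics.multimode(no_repeats)

print(median_odd, median_even, modes, modes_no_repeats)
```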
Standard Deviation
An important statistic that is used to measure
variation (spread) within a sample.
S is the symbol for standard deviation
What does “S” mean?
We can predict the probability of finding a pea plant
at a predicted height for example.
S is a valuable tool because it reveals predicted limits
of finding a particular value
Pea Plant Normal Distribution Curve with Std Dev
The Normal Curve and Standard Deviation
A normal curve:
• Each vertical line is a unit of standard deviation
• 68% of values fall within +1 or -1 standard deviation of the mean
• 95% of values fall within +2 & -2 units
• Nearly all members (>99%) fall within 3 std dev units

http://classes.kumc.edu/sah/resources/sensory_processing/images/bell_curve.gif
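The 68%, 95% and >99% figures can be reproduced from the normal curve itself. This sketch relies on a standard identity not stated in the slides: for a normal distribution, the fraction of values within k standard deviations of the mean equals erf(k / √2). The function name `fraction_within` is just a label chosen here.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} std dev: {fraction_within(k):.1%}")
```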
Standard Error of the Sample Means
AKA Standard Error
The mean, and the std dev help estimate
characteristics of the population from a single
sample
So if many samples were taken then the means of
the samples would also form a normal distribution
curve centered close to the mean of the whole
population.
The larger the samples the closer the means
would be to the actual value
But taking that many samples would most likely be
impossible, so use a simple method to estimate the
standard deviation of the sample means
A Simple Method for estimating standard error

Standard error is the calculated standard deviation divided by the square root
of the sample size (the number of individuals in the sample)
Standard error of the means is used to test the reliability of the data
Example… If there are 10 corn plants with a standard deviation of 0.2
SE = 0.2 / sq root of 10 = 0.2/3.16 = 0.063
0.063 represents one std dev of the sample means for samples of 10 plants
If there were 100 plants the standard error would drop to 0.02
Why?
Because when we take larger samples, our sample means get closer
to the true mean value of the population. Thus, the distribution of the
sample means would be less spread out and would have a lower
standard deviation.
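The corn plant example can be worked directly. The square root of 10 is about 3.16, so a standard deviation of 0.2 gives a standard error of about 0.063 for 10 plants and 0.02 for 100 plants. The helper name `standard_error` below is simply a label chosen for this sketch.

```python
import math

def standard_error(std_dev, n):
    """Standard error of the mean: standard deviation / square root of sample size."""
    return std_dev / math.sqrt(n)

se_10 = standard_error(0.2, 10)    # 10 corn plants
se_100 = standard_error(0.2, 100)  # 100 corn plants

print("n = 10: ", round(se_10, 3))
print("n = 100:", round(se_100, 3))
```

Making the sample 10 times larger shrinks the standard error by a factor of √10, not by a factor of 10.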
Probability Tests
What to do when you are comparing two samples
to each other and you want to know if there is a
significant difference between both sample
populations
(example the control and the experimental setup)
How do you know there is a difference
How large is a “difference”?
How do you know the “difference” was caused by
a treatment and not due to “normal” sampling
variation or sampling bias?
Laws of Probability
The results of one trial of a chance event do not affect the
results of later trials of the same event. p = 0.5 (a coin
always has a 50:50 chance of coming up heads)
The chance that two or more independent events will
occur together is the product of their chances of occurring
separately. (one outcome has nothing to do with the
other)
Example: What’s the likelihood of a 3 coming up on a die?
Six sides to a die: p = 1/6
Roll two dice and get 3’s on both: p = 1/6 * 1/6 = 1/36, which means
there’s a 35/36 chance of rolling something else…
Note: the probabilities of all possible outcomes must add up to 1.0
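The two-dice example can be checked with exact fractions. This is a minimal sketch using Python's `fractions` module. It assumes fair six-sided dice, as the example does.

```python
from fractions import Fraction

# Chance of a 3 on one fair six-sided die.
p_three = Fraction(1, 6)

# Independent events: multiply the separate chances.
p_double_three = p_three * p_three

# Everything else is whatever is left out of a total probability of 1.
p_anything_else = 1 - p_double_three

print(p_double_three, p_anything_else, p_double_three + p_anything_else)
```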
Laws of Probability (continued)
The probability that either of two or more
mutually exclusive events will occur is the sum of
their probabilities (only one can happen at a
time).
Example: What is the probability of rolling a total
of either 2 or 12?
Rolling a total of 2 means a 1 on each of the
dice; therefore p = 1/6 * 1/6 = 1/36
Rolling a total of 12 means a 6 on
each of the dice; therefore p = 1/36
So the likelihood of rolling either is 1/36 + 1/36 =
2/36 or 1/18
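The same answer can be reached by brute force: list all 36 equally likely outcomes for two fair dice and count the ones that give each total. The helper name `p_total` is a label chosen for this sketch.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
rolls = list(product(range(1, 7), repeat=2))

def p_total(target):
    """Probability that the two dice add up to target."""
    hits = sum(1 for a, b in rolls if a + b == target)
    return Fraction(hits, len(rolls))

# Totals of 2 and 12 cannot both happen on one roll, so their chances add.
p_either = p_total(2) + p_total(12)

print(p_total(2), p_total(12), p_either)
```

Adding `p_total(t)` over every possible total from 2 to 12 gives exactly 1, which is the rule that all the probabilities must sum to 1.0.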
The Use of the Null Hypothesis
Is the difference between two sample populations due to
chance, or is it a real (statistically significant) difference?
The null hypothesis assumes that there will be no
“difference” or no “change” or no “effect” of the
experimental treatment.
If treatment A is no better than treatment B then the
null hypothesis is supported (it is not rejected).
If there is a significant difference between A and B
then the null hypothesis is rejected...
