
Statistics-In-Psychology - Compress Notes

The document provides an overview of various statistical methods used in psychology, including parametric and non-parametric statistics, descriptive and inferential statistics, and specific tests like t-tests, ANOVAs, and regression analyses. It explains the differences between populations and samples, the importance of measures of central tendency (mean, median, mode), and the concept of simple random sampling. Additionally, it highlights when to use each measure of central tendency based on the type of variable.
Copyright
© All Rights Reserved

Statistics in Psychology

Parametric Statistics
Statistical methods that estimate population parameters, such as the standard deviation, on the basis of the
sample data are called “parametric statistics”. Parametric analyses should only be used if the DV is normally
distributed.

Non-Parametric Statistics
Non-parametric statistics do NOT make assumptions about the shape of the population distribution. They are often
used when data fail to meet the assumptions for parametric analyses, and in the study of proportions & ranks. For
example: (i) DV not normally distributed; (ii) small samples; (iii) unequal sample sizes. Sometimes referred to
as “distribution-free techniques”, they are very valuable in the analysis of ordinal and rank data. They generally
have less power than parametric tests to detect significant differences between groups.

Descriptive Statistics
Descriptive statistics is the term given to the analysis of data that helps describe, show or summarize data (i.e.
the sample) in a meaningful way such that, for example, patterns might emerge from the data. Descriptive
statistics do not, however, allow us to make conclusions beyond the data we have analyzed or reach conclusions
regarding any hypotheses we might have made. They are simply a way to describe our data.

Inferential Statistics
Inferential statistics allow the researcher to generalize their findings from the sample data to the larger
population. They help assess the strength of the relationship between the independent (causal) variables, and the
dependent (effect) variables. With inferential statistics, we are trying to reach conclusions that extend beyond
the immediate data alone (i.e. our sample). In short, inferential statistics are used to estimate the characteristics
of the larger group (i.e., population).
T-test
• A t-test is used when we have 1 IV with 2 levels. It estimates whether the population means under the 2
levels of the IV are different.
• The estimate is based on the difference between the measured sample means. There are two types of t-test.
• Independent t-test: between participants/ independent groups.
• Paired t-test: within participants/ repeated measures.

One-way ANOVA
• A one-way ANOVA is used when we have 1 IV with more than 2 levels. It estimates whether the population
means under the different levels of the IV are different.
• An independent one-way ANOVA is used when there are different participants for each level of the IV (i.e.
between participants).
• If the same participants are used for each level of the IV, a one-way repeated measures (i.e. within
subjects) ANOVA should be used.
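
The two kinds of t-test described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the scores are hypothetical, and the independent version assumes equal population variances (the classic pooled-variance Student's t), which the notes do not specify.

```python
from math import sqrt
from statistics import mean, stdev, variance


def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two independent groups (assumes equal variances)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error


def paired_t(first_scores, second_scores):
    """t statistic for repeated measures: a one-sample t on the per-participant differences."""
    differences = [second - first for first, second in zip(first_scores, second_scores)]
    return mean(differences) / (stdev(differences) / sqrt(len(differences)))


# Hypothetical scores, for illustration only.
control = [4, 5, 6, 7, 8]
treatment = [6, 7, 8, 9, 10]
t_independent = independent_t(treatment, control)

before = [10, 12, 9, 11]
after = [12, 15, 10, 13]
t_paired = paired_t(before, after)
```

Turning either statistic into a p-value requires the t distribution with the appropriate degrees of freedom (n_a + n_b - 2 for the independent test, n - 1 for the paired test), which is normally looked up in a table or computed with a statistics package.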

Factorial ANOVAs
• Factorial ANOVAs are used to test for differences when we have more than one independent variable (IV).
• By including more than one IV, we can explore the effects of interactions between IVs.
• The terms ‘IV’ and ‘factor’ are interchangeable. ANOVAs with more than one IV are called Factorial ANOVAs.
• There are three broad Factorial ANOVA designs:
1. all IVs are between-participants - Participants take part in only one condition (i.e. independent measures).
2. all IVs are within-participants - Participants take part in all conditions (repeated measures).
3. a mixture of between-participant and within-participant IVs - Participants take part in more than one, but
not all conditions.

Correlation
• Correlation means association - more precisely it is a measure of the extent to which two variables are related.
• When working with continuous variables, the correlation coefficient to use is Pearson’s r. This is a numerical
score showing the strength and direction of a correlation.
o r = -1 (perfect negative relationship)
o r = +1 (perfect positive relationship)
o r = 0 (no relationship)
• Once we’ve determined the relationship (Pearson's r) in our sample, inferential analyses allow us to determine
the probability of measuring a relationship of that magnitude if the null hypothesis is true.
• Bivariate linear correlation involves measuring the linear relationship between two sample variables.
• Partial correlation allows us to examine the relationship between two variables, while removing the influence
of a third variable.

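
As a rough illustration of how Pearson's r is obtained from raw scores, the sketch below uses the standard formula (the sum of cross-products of deviations divided by the square root of the product of the two sums of squares). The function name and the data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r for two equal-length lists of continuous scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_squares_x = sum((a - mean_x) ** 2 for a in x)
    sum_squares_y = sum((b - mean_y) ** 2 for b in y)
    return cross_products / sqrt(sum_squares_x * sum_squares_y)


# Hypothetical data covering the three reference values of r.
perfect_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
perfect_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
no_relationship = pearson_r([1, 2, 3], [1, 2, 1])
```

The three calls reproduce the reference points r = +1, r = -1 and r = 0 listed above.
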
Regression Analysis
• While correlation assesses the relationship between x and y, regression allows us to predict y from x. For
example, how much does y change as a result of a change in x?
• Linear regression allows us to assess whether the score on x predicts the score on y.
• Multiple regression allows us to assess the effect that several predictor variables (e.g. x1, x2, x3 etc.) have
on the outcome variable (y).

Spearman’s rho
• Spearman's rho is a non-parametric correlational analysis (an alternative to Pearson’s r).
• This test is used to determine if there is a correlation between sets of ranked data (ordinal data) or interval
and ratio data that have been changed to ranks (ordinal data).
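
One common way to compute Spearman's rho is to convert each variable to ranks and then apply the Pearson formula to the ranks. The sketch below follows that approach; the helper names and data are hypothetical, and tied scores are given the average of the ranks they span, which is an assumption the notes do not spell out.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their positions."""
    ordered = sorted(values)
    # A value occupying sorted positions first..last (1-based) gets rank (first + last) / 2.
    return [(2 * ordered.index(v) + 1 + ordered.count(v)) / 2 for v in values]


def pearson_r(x, y):
    """Pearson's r, used here on ranks rather than raw scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_squares_x = sum((a - mean_x) ** 2 for a in x)
    sum_squares_y = sum((b - mean_y) ** 2 for b in y)
    return cross_products / sqrt(sum_squares_x * sum_squares_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranked data."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical interval data with a curved but perfectly monotonic relationship.
hours = [1, 2, 3, 4, 5]
scores = [1, 4, 9, 16, 25]
rho = spearman_rho(hours, scores)
tied_ranks = average_ranks([10, 20, 20, 30])
```

Because only the ordering matters, rho reaches 1 here even though the relationship between the raw scores is not a straight line.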

Mann-Whitney U (284-290)
• Mann-Whitney U is a non-parametric alternative to an independent t-test.
• 1 IV, 2 levels: Between-participant design.
• The test evaluates whether there is a significant difference in the ranks assigned to the two IV levels.
• Counterpart to the t-test; it is the most popular of the non-parametric tests.
• Used when assumptions about population distributions might be violated or when sample sizes are very small.
• Is a test of the difference between two population distributions rather than a test of the difference between
two means or medians.
• Transformation of the U to Z is appropriate when each of the samples is of size 8 or larger.

Wilcoxon T
• Wilcoxon T is a non-parametric alternative to a paired t-test.
• Differences between scores in the two IV levels are calculated for each participant and then ranked.
• The test evaluates whether the ranked differences favour one IV level significantly more than the other.

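
The Mann-Whitney U described above can be sketched as follows: pool the two groups, rank every score, and derive U from the rank sum of one group. The function names and data are hypothetical, and the U-to-Z step uses the usual normal approximation without a correction for ties, which is a simplifying assumption.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their positions."""
    ordered = sorted(values)
    return [(2 * ordered.index(v) + 1 + ordered.count(v)) / 2 for v in values]


def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two U values for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])  # ranks belonging to group_a come first
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


def u_to_z(u, n_a, n_b):
    """Normal approximation for U (no tie correction)."""
    mean_u = n_a * n_b / 2
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    return (u - mean_u) / sd_u


# Hypothetical scores: the two groups do not overlap at all, so U is 0.
u_separated = mann_whitney_u([1, 2, 3], [4, 5, 6])
u_mixed = mann_whitney_u([1, 4, 5], [2, 3, 6])
z_at_expected_u = u_to_z(32, 8, 8)  # U equal to its expected value n_a * n_b / 2
```

A U of 0 means every score in one group ranks below every score in the other; a U close to n_a * n_b / 2 means the ranks are thoroughly intermixed.
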
Chi-square
• The one-variable form is also known as the Goodness-of-Fit test.
• Deals with a single categorical variable - i.e. nominal level data.
• One-Variable Chi-square calculates the difference between expected and obtained frequencies.
• 2x2 Chi-Square (Test for Independence) measures the association between two categorical variables.
• In its two-variable form, it is a non-parametric test of independence.

Friedman’s ANOVA
• Friedman’s ANOVA is a non-parametric alternative to a repeated measures one-way ANOVA.
• Tells you whether there is a significant difference, but not which IV levels are different.
• Conduct post-hoc tests (Wilcoxon T), corrected for multiple comparisons.
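
For the chi-square tests, the statistic in both forms is the sum of (obtained - expected)² / expected over all cells. The sketch below is illustrative only: the function names and frequencies are hypothetical, and the expected counts for the independence test are derived from the row and column totals in the usual way.

```python
def chi_square_goodness_of_fit(obtained, expected):
    """One-variable chi-square: compare obtained with expected frequencies."""
    return sum((o - e) ** 2 / e for o, e in zip(obtained, expected))


def chi_square_independence(table):
    """Chi-square for a contingency table given as a list of rows (e.g. 2x2)."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obtained in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (obtained - expected) ** 2 / expected
    return statistic


# Hypothetical frequencies, for illustration only.
goodness_of_fit = chi_square_goodness_of_fit([30, 20, 10], [20, 20, 20])
independence = chi_square_independence([[10, 20], [20, 10]])
```

The resulting value is compared against the chi-square distribution with (categories - 1) degrees of freedom for goodness of fit, or (rows - 1) x (columns - 1) for the test of independence.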


Median Test (281-284)
• Non-parametric test used to compare 2 groups.
• Primarily used to evaluate the hypothesis that the two groups are sampled from a population with the same median.
• Used to determine whether a common median splits each of the two groups into equal subgroups.

Kruskal-Wallis Test (290-294)
• An extension of the Mann-Whitney U-test for 3 or more groups.
• Might be considered the non-parametric counterpart of the ANOVA.
• Primary purpose of using this test is to evaluate 3 or more sampling distributions of ranked data.
• Null is that each sample distribution is represented equally by a single distribution of ranks.

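
A minimal sketch of the Kruskal-Wallis statistic, assuming the standard formula H = 12 / (N(N + 1)) × Σ(Rᵢ² / nᵢ) − 3(N + 1), where Rᵢ is the rank sum of group i and N is the total number of scores. The function names and data are hypothetical, and no correction for ties is applied.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their positions."""
    ordered = sorted(values)
    return [(2 * ordered.index(v) + 1 + ordered.count(v)) / 2 for v in values]


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H for three or more independent groups (no tie correction)."""
    pooled = [score for group in groups for score in group]
    ranks = average_ranks(pooled)
    total_n = len(pooled)
    weighted_rank_sums = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)
    return 12 / (total_n * (total_n + 1)) * weighted_rank_sums - 3 * (total_n + 1)


# Hypothetical scores for three groups, for illustration only.
h_statistic = kruskal_wallis_h([1, 2], [3, 4], [5, 6])
```

Larger values of H indicate that the groups' rank sums differ more than a single shared distribution of ranks would predict.
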
Measures of Central Tendency – Mean, Median, and Mode
View Video: http://stattrek.com/videos/ap/lessons/statistics/mean-median/video.html

The mean and the median are summary measures used to describe the most "typical" value in a set of
values.
Statisticians refer to the mean and median as measures of central tendency.
The Mean and the Median
The difference between the mean and median can be illustrated with an example. Suppose we draw a
sample of five women and measure their weights. They weigh 100 pounds, 100 pounds, 130 pounds,
140 pounds, and 150 pounds.
• To find the median, we arrange the observations in order from smallest to largest value. If there
is an odd number of observations, the median is the middle value. If there is an even number of
observations, the median is the average of the two middle values. Thus, in the sample of five
women, the median value would be 130 pounds, since 130 pounds is the middle weight.
• The mean of a sample or a population is computed by adding all of the observations and
dividing by the number of observations. Returning to the example of the five women, the mean
weight would equal (100 + 100 + 130 + 140 + 150)/5 = 620/5 = 124 pounds. In the general case,
the mean can be calculated using one of the following equations:
Population mean = μ = ΣX / N OR Sample mean = x̄ = Σx / n
where ΣX is the sum of all the population observations, N is the number of population observations, Σx
is the sum of all the sample observations, and n is the number of sample observations.
When statisticians talk about the mean of a population, they use the Greek letter μ to refer to the mean
score. When they talk about the mean of a sample, statisticians use the symbol x̄ to refer to the mean
score.
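
The worked example of the five women can be checked with Python's standard `statistics` module. The even-sized list is an added illustration of the "average of the two middle values" rule; the variable names are hypothetical.

```python
from statistics import mean, median

weights = [100, 100, 130, 140, 150]  # pounds, from the example above

sample_mean = mean(weights)      # (100 + 100 + 130 + 140 + 150) / 5
sample_median = median(weights)  # middle value of the five sorted weights

# With an even number of observations the median averages the two middle values.
even_median = median([100, 100, 130, 140])
```
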
The Mean vs. the Median
As measures of central tendency, the mean and the median each have advantages and disadvantages.
Some pros and cons of each measure are summarized below.
• The median may be a better indicator of the most typical value if a set of scores has an outlier.
An outlier is an extreme value that differs greatly from other values.
• However, when the sample size is large and does not include outliers, the mean score usually
provides a better measure of central tendency.
To illustrate these points, consider the following example. Suppose we examine a sample of 10
households to estimate the typical family income. Nine of the households have incomes between
$20,000 and $100,000; but the tenth household has an annual income of $1,000,000,000. That tenth
household is an outlier. If we choose a measure to estimate the income of a typical household, the mean
will greatly over-estimate the income of a typical family (because of the outlier); while the median will
not.
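
The household example can be made concrete with a short sketch. The nine ordinary incomes below are hypothetical values chosen to fall between $20,000 and $100,000; only the $1,000,000,000 outlier comes from the text.

```python
from statistics import mean, median

# Nine hypothetical incomes in the stated range, plus the outlier from the example.
typical_incomes = [20_000, 30_000, 40_000, 50_000, 60_000, 70_000, 80_000, 90_000, 100_000]
incomes = typical_incomes + [1_000_000_000]

mean_income = mean(incomes)      # pulled far upward by the single outlier
median_income = median(incomes)  # average of the 5th and 6th sorted incomes
```

With these figures the mean exceeds $100 million while the median stays at $65,000, which is why the median is the safer summary here.
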
Effect of Changing Units
Sometimes, researchers change units (minutes to hours, feet to meters, etc.). Here is how measures of
central tendency are affected when we change units.
• If you add a constant to every value, the mean and median increase by the same constant. For
example, suppose you have a set of scores with a mean equal to 5 and a median equal to 6. If you
add 10 to every score, the new mean will be 5 + 10 = 15; and the new median will be 6 + 10 =
16.
• Suppose you multiply every value by a constant. Then, the mean and the median will also be
multiplied by that constant. For example, assume that a set of scores has a mean of 5 and a
median of 6. If you multiply each of these scores by 10, the new mean will be 5 * 10 = 50; and
the new median will be 6 * 10 = 60.
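
Both rules can be verified numerically. The scores below are a hypothetical set chosen so that the mean is 5 and the median is 6, matching the example.

```python
from statistics import mean, median

scores = [1, 2, 6, 7, 9]  # hypothetical scores: mean 5, median 6

shifted = [score + 10 for score in scores]  # add a constant to every value
scaled = [score * 10 for score in scores]   # multiply every value by a constant

shifted_summary = (mean(shifted), median(shifted))
scaled_summary = (mean(scaled), median(scaled))
```
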

Mode
The mode is the most frequent score in our data set. It corresponds to the highest bar in a bar chart or
histogram. You can, therefore, sometimes consider the mode as being the most popular option.
An example of a mode is presented below:
Normally, the mode is used for categorical data where we wish to know which is the most common
category, as illustrated below:

We can see above that the most common form of transport, in this particular data set, is the bus.
However, one of the problems with the mode is that it is not unique, so it leaves us with problems when
we have two or more values that share the highest frequency, such as below:
We are now stuck as to which mode best describes the central tendency of the data. This is particularly
problematic when we have continuous data, because we are less likely to have any one value that is
more frequent than the others. For example, consider measuring the weights of 30 people (to the nearest
0.1 kg). How likely is it that we will find two or more people with exactly the same weight (e.g., 67.4 kg)?
The answer is: very unlikely. Many people might be close, but with such a small sample (30
people) and a large range of possible weights, you are unlikely to find two people with exactly the same
weight to the nearest 0.1 kg. This is why the mode is very rarely used with continuous data.

Another problem with the mode is that it will not provide us with a very good measure of central
tendency when the most common mark is far away from the rest of the data in the data set, as depicted in
the diagram below:
In the above diagram the mode has a value of 2. We can clearly see, however, that the mode is not
representative of the data, which is mostly concentrated around the 20 to 30 value range. To use the
mode to describe the central tendency of this data set would be misleading.

Summary of when to use the mean, median and mode

Please use the following summary table to know what the best measure of central tendency is with
respect to the different types of variable.

Type of Variable               Best measure of central tendency
Nominal                        Mode
Ordinal                        Median
Interval/Ratio (not skewed)    Mean
Interval/Ratio (skewed)        Median


Populations and Samples
Watch video: http://stattrek.com/videos/ap/lessons/statistics/populations-and-samples/video.html

The study of statistics revolves around the study of data sets. This lesson describes two important types of data
sets - populations and samples. Along the way, we introduce simple random sampling, the main method used in
this tutorial to select samples.

Population vs Sample

The main difference between a population and sample has to do with how observations are assigned to the data
set.

• A population includes all of the elements from a set of data.

• A sample consists of one or more observations from the population.

Depending on the sampling method, a sample can have fewer observations than the population, the same
number of observations, or more observations. (More is possible only when sampling with replacement, where the
same element can be drawn repeatedly.) More than one sample can be derived from the same population.

Other differences have to do with nomenclature, notation, and computations. For example,

• A measurable characteristic of a population, such as a mean or standard deviation, is called a parameter;
but a measurable characteristic of a sample is called a statistic.

• We will see in future lessons that the mean of a population is denoted by the symbol μ; but the mean of
a sample is denoted by the symbol x̄.

• We will also learn in future lessons that the formula for the standard deviation of a population is different
from the formula for the standard deviation of a sample.
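
The notes defer the two standard deviation formulas to later lessons. As a brief preview under the usual convention, the population formula divides the sum of squared deviations by N, while the sample formula divides by n - 1. Python's `statistics` module exposes both; the data reuse the five weights from the earlier example purely for illustration.

```python
from statistics import pstdev, pvariance, stdev, variance

weights = [100, 100, 130, 140, 150]  # sum of squared deviations from the mean is 2120

population_variance = pvariance(weights)  # 2120 / N, treating the data as a whole population
sample_variance = variance(weights)       # 2120 / (n - 1), treating the data as a sample

population_sd = pstdev(weights)  # square root of the population variance
sample_sd = stdev(weights)       # square root of the sample variance
```

The sample version is always slightly larger, which compensates for a sample tending to understate the spread of its population.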

What is Simple Random Sampling?

A sampling method is a procedure for selecting sample elements from a population. Simple random
sampling refers to a sampling method that has the following properties.

• The population consists of N objects.
• The sample consists of n objects.
• All possible samples of n objects are equally likely to occur.

An important benefit of simple random sampling is that it allows researchers to use statistical methods to analyze
sample results. For example, given a simple random sample, researchers can use statistical methods to define
a confidence interval around a sample mean. Such inferential methods assume random selection, so they are
generally not appropriate when non-random sampling methods are used.
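
To make the defining property of simple random sampling concrete, the sketch below enumerates every possible sample for a tiny hypothetical population (N = 5, n = 2). It assumes sampling without replacement and ignores the order of selection.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

population = ["A", "B", "C", "D", "E"]  # hypothetical population, N = 5
n = 2                                    # sample size

# Every distinct sample of n objects (without replacement, order ignored).
all_samples = list(combinations(population, n))

# Under simple random sampling each of these samples is equally likely.
probability_of_each = Fraction(1, len(all_samples))
number_of_samples = comb(len(population), n)
```

There are 10 possible samples here, so each one has probability 1/10 of being the sample actually drawn.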

There are many ways to obtain a simple random sample. One way would be the lottery method. Each of
the N population members is assigned a unique number. The numbers are placed in a bowl and thoroughly
mixed. Then, a blind-folded researcher selects n numbers. Population members having the selected numbers are
included in the sample.
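
The lottery method maps directly onto `random.sample` in Python's standard library, which draws without replacement. In the sketch below the function name, the participant labels and the seed are hypothetical; the seed is there only to make the draw repeatable.

```python
import random


def lottery_sample(population, n, seed=None):
    """Number each population member, then draw n numbers without replacement."""
    rng = random.Random(seed)
    numbered = dict(enumerate(population, start=1))  # unique number -> member
    drawn_numbers = rng.sample(sorted(numbered), n)  # the blind draw from the bowl
    return [numbered[number] for number in drawn_numbers]


participants = [f"P{i:02d}" for i in range(1, 21)]  # hypothetical population, N = 20
sample = lottery_sample(participants, n=5, seed=42)
```
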

Random Number Generator

In practice, the lottery method described above can be cumbersome, particularly with large sample sizes. As an
alternative, use Stat Trek's Random Number Generator, a free tool that can select up to 1000 random numbers
quickly and easily. It is linked below and can also be found under the Stat Tools tab, which appears in the
header of every Stat Trek web page.

Random Number Generator: http://stattrek.com/Tables/Random.aspx

Sampling With Replacement and Without Replacement

Suppose we use the lottery method described above to select a simple random sample. After we pick a number
from the bowl, we can put the number aside or we can put it back into the bowl. If we put the number back in the
bowl, it may be selected more than once; if we put it aside, it can be selected only one time.

When a population element can be selected more than one time, we are sampling with replacement. When a
population element can be selected only one time, we are sampling without replacement.
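
The two schemes correspond to two different functions in Python's `random` module: `sample` never repeats an element, while `choices` puts each pick back before the next draw. The bowl contents and the seed below are hypothetical.

```python
import random

rng = random.Random(0)   # seeded only so the illustration is repeatable
bowl = [1, 2, 3, 4, 5]   # hypothetical numbered slips, N = 5

# Without replacement: each number can be selected only once.
without_replacement = rng.sample(bowl, k=3)

# With replacement: a number goes back in the bowl, so it can be drawn again.
# The sample can even be larger than the population, which forces repeats.
with_replacement = rng.choices(bowl, k=8)
```
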

Test Your Understanding

Problem 1

Which of the following statements are true?

I. The mean of a population is denoted by x̄.
II. Sample size is never bigger than population size.
III. The population mean is a statistic.

(A) I only.
(B) II only.
(C) III only.
(D) All of the above.
(E) None of the above.
