Data Analysis Using SPSS
Dr. Mark Williamson, PhD
(based on PDF of Andrew Garth, Sheffield Hallam
University)
Purpose
■ The intent of this presentation is to teach you to explore, analyze, and
understand data
■ The software used is SPSS (Statistical Package for the Social Sciences)
– commonly used in social sciences and health fields
– as opposed to other statistical software such as SAS or R, it requires
little to no coding background
■ This presentation is heavily indebted to the work of Andrew Garth (Sheffield
Hallam University) and his full document can be found at the link below:
https://students.shu.ac.uk/lits/it/documents/pdf/analysing_data_using_spss.pdf
■ All the data files used in this presentation can be found at the link below
(download the SPSSDATA.zip):
http://teaching.shu.ac.uk/hwb/ag/resources/resourceindex.html
Outline
■ First, we will look at the Big Picture
■ Next, we’ll define our terms
■ Then, we’ll get set up for working in SPSS
■ Only then will we get into the meat of things, which will
focus on aspects of data analysis
– Descriptive Statistics and Graphs (Exploring our Data)
– Inferential Statistics (Analyzing our Data, and
Interpreting our Results)
The Big Question
■ How should I analyze my data?
It depends on the nature of the data and what questions you want to answer.
To answer those questions, you need to explore your data and select the proper analysis.
Big Picture Steps in Statistical Analysis
1. Explore your data
   1. Look at data
   2. Identify data
   3. Graph/Describe data
   4. Formulate Question (Hypothesis)
■ It is good practice to have multiple copies of data (especially when working on original data)
Reminder: the data needed for the tasks to follow are at:
https://teaching.shu.ac.uk/hwb/ag/resources/resourceindex.html
3. To save graphs or analyses, we need to do an analysis first
   1. Click on the Analyze menu and choose Descriptive Statistics, then Descriptives.
   2. The button between the two windows lets you choose the variables to be analyzed. In our case the choice is simple: just click the center button to move the age variable over to the right, then click OK.
   3. SPSS should display the results in a separate window. You will see this appear in front of the Data Editor, and a new button will appear on the Windows task bar at the bottom of your screen. The new window has a title; have a look in its title bar at the top of its window.
   4. Look at the output. If you want to save results like this, you have to save them separately.
Starting in SPSS: Looking at data
■ Seeing what data looks like is the first step to data analysis
■ It gives a broad overview of what is going on
■ Again, each row is a different sample, while the columns show the value of different variables for that sample
■ Looking at the data tells you a lot of big-picture things
– How many samples there are
– How many variables there are
– The types of variables and their values
– If there is any missing data
■ We will examine some data collected by an Occupational Therapy student, looking at how age affected OT students’ participation in discussion in class.
■ She counted how many times each student contributed orally in a period totaling 12 hours of classes. The students were from the 1st and 2nd years of the course and were classed as young if under 21 and mature if 21 or over, making 4 groups altogether.
1. Open up Studentss in SPSS
   1. Choose the File menu and select Open->Data (you will need to search for wherever you downloaded the sample files)
2. Take a look at the data and answer the following questions.
   1. What is each column telling you?
   2. Which group is which?
   3. How many students were in each group?
   4. Do older students contribute more frequently in class discussion?
Starting in SPSS: Exploring the Data
■ When analyzing data, it is necessary to know what variable is what
■ Dependent variable:
– depends on the factor
– Is usually numerical
– In our case, it is ‘speaks’
1. Click on the Analyze menu->Descriptive Statistics->Explore.
2. Transfer the speaks variable to the Dependent list and the group variable to the Factor list and then click OK.
3. Take a look at the results.
Standard deviation
■ What is the Standard Deviation (S.D.) really measuring?
■ What can it tell us about our data?
■ Let’s take a look at some data
– The table below shows the German, Geography and IT results of a group of ten students.
1. Open the file std dev example in SPSS
2. Use the Descriptive Statistics->Descriptives to fill out the table below
        German    Geography    IT
MEAN
MAX
MIN
3. Which set(s) of figures has the largest range?
4. Which set(s) of figures has the largest number in it?
5. Which set(s) of figures contains the smallest number?
6. Which set of figures has the largest minimum?
Part A-4
Standard deviation 2
■ Given the figures for mean, maximum and minimum, it is hard to differentiate between the German and IT figures. The mean (arithmetic mean) of the figures is the numbers all added together, then divided by the number of numbers.
■ However, it gives no indication of the distribution of the marks within the sets of figures. To see this, we could graph the three sets of figures and see if that helps us (later we will create bar charts; for now just look at these).
■ Look at the three graphs above. Which two do you think are most similar?
■ Possibly Geography and IT, but it is rather subjective. They do seem to have less variation in the values than the German results.
Standard deviation 3
■ Question: How can we assess, in a fair, unambiguous way, which of the three has the least widely deviating set of numbers?
■ Answer: Use the Standard Deviation.
■ The standard deviation of a set of numbers is a measure of how widely values are dispersed from the mean value. It can be calculated manually, or SPSS can calculate it for you.
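For readers who want to see the manual calculation spelled out, here is a minimal Python sketch. The marks are hypothetical (they are not the values in the std dev example file), and `sample_sd` is an illustrative helper name. It uses the n - 1 (sample) form of the standard deviation, which is the form SPSS reports in Descriptives.

```python
import math
import statistics

# Hypothetical marks for ten students (NOT the values in the "std dev example" file).
marks = [52, 55, 58, 60, 60, 61, 63, 65, 68, 78]


def sample_sd(values):
    """Sample standard deviation: square root of summed squared deviations over n - 1."""
    mean = sum(values) / len(values)
    squared_deviations = [(v - mean) ** 2 for v in values]
    return math.sqrt(sum(squared_deviations) / (len(values) - 1))


manual_sd = sample_sd(marks)
library_sd = statistics.stdev(marks)  # the standard library uses the same n - 1 definition

print(f"mean = {statistics.mean(marks)}")
print(f"SD by hand = {manual_sd:.3f}, SD from statistics.stdev = {library_sd:.3f}")
```

The two SD values should agree, which is a quick check that the by-hand steps match the library definition.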
Standard deviation 4
■ Let’s work out the standard deviation of the numbers in each column from the std dev example
– Higher Standard Deviation values indicate a greater spread of values
– Lower Standard Deviation values indicate a tighter spread of values
1. Use Descriptive Statistics then Frequencies from the Analyze menu.
2. Select the three variables (get German, Geography and Information Technology (IT) from the left into the right pane).
3. Click the “Statistics” button and select the Standard deviation as well as mean, maximum and minimum, then click “Continue”.
4. Before pressing OK on the Frequencies dialog box, uncheck the option to display frequency tables then click OK.
5. Compare the standard deviations.
   1. Which set of figures, German, Geography or IT, is the least spread out?
   2. Of the two subjects with the same mean, and the same range, which varies least?
   3. Which of the three sets of figures, German, Geography or IT, varies most?
Summary: Range, IQR & SD are all measures of spread. Only the SD takes all the data values into account; however, this leaves it open to problems similar to the mean, i.e. a tendency to be swayed inordinately by extreme values. The range is extremely sensitive to outliers, since it is based only on the smallest and largest values. The Interquartile Range is again based on only two values, the upper and lower quartiles. These sit at each end of the middle half of the data and are therefore less affected by extremes.
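The summary's point about extreme values can be checked with a small sketch. The scores below are hypothetical, and `spread` is an illustrative helper. Note that `statistics.quantiles` uses one of several common quartile definitions, so the IQR it gives may differ slightly from the one SPSS prints.

```python
import statistics


def spread(values):
    """Return the three measures of spread discussed above."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "sd": statistics.stdev(values),
    }


# Hypothetical scores; the second list replaces the top score with an extreme value.
scores = [48, 50, 51, 52, 53, 54, 55, 56, 58, 60]
scores_with_outlier = scores[:-1] + [95]

before = spread(scores)
after = spread(scores_with_outlier)

for name in ("range", "iqr", "sd"):
    print(f"{name:>5}: {before[name]:.2f} -> {after[name]:.2f}")
```

With this data the range and the SD both grow when the single extreme value is introduced, while the IQR does not move, because the quartiles sit in the middle half of the data.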
Assessment 3
1. In the Exam Scores data, which subject had the highest average score?
2. In the Exam Scores data, which subject had the most variation in score? Which had the least?
3. What are the 4 rules for exploring data?
Exam Scores
Subject               N     Mean    Standard Deviation
Art                   10    95      3.3
Spelling              10    70      5.8
Math                  10    67      3.5
Science               10    84      12.3
Social Studies        10    89      2.1
Physical Education    10    98      1.2
Assessment 3 Answers
1. In the Exam Scores data, which subject had the highest average score?
   Physical Education
2. In the Exam Scores data, which subject had the most variation in score? Which had the least?
   Science, Physical Education
3. What are the 4 rules for exploring data?
   1. Look at the Data
   2. Describe Each Variable
   3. Graph/Stats each Comparison
   4. Write Research Question
Graphs
■ Graphs serve two purposes
– Quickly visualize data during data exploration
– Present results of significant statistical analyses
Types of Graphs to be covered
■ Histogram
– Data type: single numerical variable
– Usage: data exploration (determining normality)
– Basic example: heights of freshman students
– Another example: tooth number of apex-predator dinosaurs
■ Boxplot
– Data type: single numerical variable; single numerical variable + categorical variable
– Usage: data exploration, presenting non-parametric t-tests/ANOVA
– Basic example: heights of freshman students; heights of students by grade
– Another example: weights of apex-predator dinosaurs; weight of apex-predator dinosaurs by geological period
■ Bar Chart
– Data type: single numerical variable + categorical variable
– Usage: presenting parametric t-test/ANOVA results
– Basic example: heights of students by grade
– Another example: tooth number of sharks by species
■ Scatterplot
– Data type: two numerical variables
– Usage: data exploration, presenting correlation results
– Basic example: heights and weights of students
– Another example: weights and top swimming speed of sharks
■ Line Charts
– Data type: two numerical variables (one usually time)
– Usage: data exploration
– Basic example: heart rate over time
– Another example: ounces of coffee drunk by students over time
■ Multiple Line Charts
– Data type: three or more numerical variables (one usually time, rest on same scale)
– Usage: data exploration
– Basic example: various concentrations of nutrients in bloodstream over time
– Another example: ounces of various caffeinated beverages drunk by students over time
■ Pie graph
– Data type: single numerical variable (proportions) + categorical variable
– Usage: data exploration
– Basic example: percentage of students across grades
– Another example: percentage of different caffeinated beverages drunk in a month
Histogram and Normal Distribution
■ Histograms can be used to look at the distribution of data
■ This is important for determining if the data is parametric or not
Reminder: if data is parametric, it will approximate a normal distribution (bell curve) when viewed as a histogram. Many statistical tests can only be used if the data is parametric
1. Open the file Reconstructed male heights 1883 in SPSS.
2. This file contains data that is similar to that from which the table you have seen was derived. The file contains 8585 heights, measured in inches.
3. We are going to create a histogram from the values in the variable called hgtrein
4. From the menus choose Graph->Chartbuilder.
5. A dialog box will come up, choose OK.
6. In the bottom section choose Histogram and double click the first image
7. Drag the hgtrein (Heights in inches - reconstructed) variable over to the box representing the horizontal (X) axis of the graph.
8. Click OK and wait to see the graph in the output viewer. You should see a normal (bell shaped) pattern to the distribution of the data.
9. To see a normal curve superimposed on the graph go back to the Create Histogram dialog box (from the menus Graph, (Legacy,) Interactive, Histogram) then click on the Histogram tab and tick the "Normal curve" check box, then click OK.
10. Are these data Discrete or Continuous?
Histogram 2
Radiologist example:
■ The file Radiologist dose with and without lead combined.sav contains data gathered to assess the effect of a lead screen to reduce the radiation dose to radiologists' hands while carrying out procedures on patients being irradiated.
■ In the trials the lead screen was placed between the patient and the radiologist. The intended effect was to reduce the radiation dose to the radiologist; however, there were fears that working through the screen would lengthen the procedure. We want to answer two questions with this data, one about the hand dose and the other about the length of time the examination took.
1. Open Radiologist dose with and without lead combined file in SPSS
2. Look at the data. The variable called "screen" is the variable that lets you discriminate between procedures carried out with or without the lead screen. If there is a 1 in the screen variable column it means the procedure was carried out with the screen in place; if not, the value is 0.
3. We can use this discriminatory variable to create two histograms at once, by using it as a panel variable.
4. The variable we are interested in is the dose to the radiologists' left hand. The left hand would be nearest the patient, so we will concentrate on the left-hand dose variable.
5. Draw a histogram using the left-hand dose variable (lhdose)
6. Go to the Groups/Point ID tab and click the Rows panel variable
7. Drag the discriminatory variable (Lead or No Lead) as the panel variable.
8. What do the histograms show us about the data?
9. If you have time draw a similar histogram using the extimmin variable. Does this back up the fears about the increase in examination time?
Summary: Histograms are for displaying continuous data, e.g. height, age etc. The bars touch, signifying the continuous nature of the data. The area of the bars represents the number in each range; the bars are usually of equal widths but this need not always be the case. Histograms should be clearly labelled and the units of measure displayed. The use of Histograms compared to Bar Charts is summarized after the section on Bar Charts.
Drawing boxplots
1. Go back to studentsss
7. In the 11 years covered by the data do the numbers of girls and boys aged 1 to 4 looked after by Local Authorities in England appear to increase?
8. Are the number of boys and girls in the age group 1 to 4 staying in roughly the same proportion, i.e. do they seem to increase or decrease together?
9. Now plot the data for the 16 and over age group. Can you see any difference between the girls and boys?
Summary: Line graphs are ideal for showing the changes in a variable as another alters, e.g. changes over time. The independent variable goes on the x-axis and the dependent variable goes up the y-axis. More than one line is often shown on the chart, allowing comparisons. Line graphs should be clearly labelled and the units of measure displayed.
Part A-9
Assessment 4
1. In the boxplot to the right, label the letters with the appropriate term
a)
b)
c)
2. For the three histograms to the right, label them as parametric (normally distributed) or non-parametric
3. For the scatterplots to the right, label the correlation as:
a) Strong, Weak, or None
b) Positive, Negative, or None
Assessment 4 Answers
1. In the boxplot to the right, label the letters with the appropriate term
a) Interquartile Range
b) Median
c) Extreme Value / Outlier
2. For the three histograms to the right, label them as parametric (normally distributed) or non-parametric
parametric, non-parametric, non-parametric
Guidelines of tests
■ You ought to be interested in using statistics to make mathematical inferences about the complexities of reality that are as accurate as possible, in order to make the world a better place
■ The statistics only tell you as much as you put into them, and again, they are only mathematical representations
■ It is up to you to be as disciplined as possible in setting up your data and analyzing it in such a way as to best get at the truth of the world
■ The following are my strong suggestions of how to go about analyzing data. Think of them like football drills: you need to master the basics to be any good at answering questions with statistics
Part B-2
Rules of Analysis
A. Explore your data (outlined in first section)
   1. Look at data
   2. Identify data
   3. Graph/Describe Data
   4. Formulate Question
■ 2 numerical variables
– Correlation
■ Parametric: Pearson correlation (usually)
■ Non-parametric: Spearman rank-order correlation
■ 2 categorical variables
– Chi Square test
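The rules for picking a test, as they appear across these slides, can be collected into one small lookup. This is only a sketch: the function name `choose_test` and its parameters are hypothetical, and the one-sample non-parametric branch is an assumption, since the slides do not name that test clearly.

```python
def choose_test(numerical, categorical, groups=0, paired=False, parametric=True):
    """Suggest a test from the variable types, following the rules in these slides."""
    if numerical == 2 and categorical == 0:
        return "Pearson correlation" if parametric else "Spearman rank-order correlation"
    if numerical == 0 and categorical == 2:
        return "Chi-square test"
    if numerical == 1 and categorical == 1:
        if groups == 1:
            # Assumption: the slides are vague here; the signed-rank test is the usual choice.
            return "One-sample t-test" if parametric else "One-sample Wilcoxon signed-rank test"
        if groups == 2 and not paired:
            return "Independent-samples t-test" if parametric else "Mann-Whitney test"
        if groups == 2 and paired:
            return "Paired-samples t-test" if parametric else "Wilcoxon signed-rank test"
        if groups > 2 and not paired:
            return "One-way ANOVA" if parametric else "Kruskal-Wallis test"
        if groups > 2 and paired:
            return "Repeated measures ANOVA" if parametric else "Friedman test"
    raise ValueError("combination not covered by these rules")


# Example: one numerical outcome compared across three unpaired groups, not normal.
print(choose_test(numerical=1, categorical=1, groups=3, parametric=False))
```

A lookup like this is no substitute for exploring the data first; the `parametric` flag is only meaningful after normality has been checked.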
One Sample T-test
■ 1 categorical variable + 1 numerical variable
– Categorical variable is non-paired and group number is one
■ This is when you have a single numerical variable you are interested in and want to know if it is different from some value
– Is the average height of basketball players greater than 6.2 feet?
– Is the infant mortality rate in a certain county less than 2 deaths in 1000?
– Is the effectiveness of treatment of a new drug any different from zero?
1. Explore your data
   1. Easy, since all you have is one variable
   2. Histogram and maybe boxplot
2. Check normality
   1. Histogram, QQ-plot
3. Set up hypothesis
   1. Null: the variable is no different from a certain value
   2. Alternative: it is different
4. Select and run appropriate test
   1. Student’s T-test
   2. If non-parametric, use the one-sample Wilcoxon signed-rank test (the Mann-Whitney test needs two independent groups)
5. Interpret results
   1. Null rejected or failed to reject?
   2. What does it mean for your question
   3. Write it out
4. Write result
• Young women were significantly taller (mean=162.5) than the value of 155 cm (1-sample t-test, t=7.533, DF=29, p-value<0.0001).
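To show where a one-sample t statistic and its degrees of freedom come from, here is a minimal sketch. The ten heights are hypothetical (not the data behind the result above), so the t value it prints will not match t=7.533; `one_sample_t` is an illustrative helper. SPSS supplies the p-value, which is not computed here.

```python
import math
import statistics

# Hypothetical heights in cm (NOT the data behind the SPSS output above).
heights_cm = [158, 160, 162, 165, 167, 159, 163, 161, 166, 164]
test_value = 155  # the value the sample mean is compared against


def one_sample_t(values, mu0):
    """Return (t, df) where t = (mean - mu0) / (SD / sqrt(n)) and df = n - 1."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    t = (statistics.mean(values) - mu0) / standard_error
    return t, n - 1


t_stat, df = one_sample_t(heights_cm, test_value)
print(f"mean = {statistics.mean(heights_cm)}, t = {t_stat:.3f}, DF = {df}")
```

The degrees of freedom are simply the sample size minus one, which is why the result above reports DF=29 for what would be a sample of 30.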
T-test: Examples
■ Parametric Example: Women Height
1. Open waheig2S file in SPSS
4. Write result
• Younger women, age range of 20-24, are significantly taller than older women of an age range of 50-54 (2-tailed T-test, F=0.094, DF=58, P-value=0.016).
T-test: Examples 2
■ Non-Parametric Example: Student Contribution
– The file has all the numbers representing the number of times each student contributed in the variable called “speakn” and the age group in the variable called “grp”
– Each row of this data represents a student. The number in the “speakn” column is the amount they contributed and the number in the “grp” column tells us their age and year grouping.
– The middle column is just some text to help you see which group is which. If you go to variable view you will see the “grp” variable labels similar to the ones explained in the previous task
– Question: Do mature first year students contribute more than young first year students?
1. Open studentsss file in SPSS
2. Explore data and check normality
   1. Should not be normally distributed
3. Define null and alternative hypothesis to question.
   1. Fill out yourself
4. Run the non-parametric alternative to the t-test (Mann-Whitney)
   1. Analyze -> Nonparametric Tests -> Legacy Dialogs -> 2 independent samples
   2. Speakn goes in the Test Variable
   3. The age group variable (grp) goes in the Grouping Variable
      1. Need to define groups (1=Year1 young, 2=Year1 mature)
   4. Make sure the Mann-Whitney test is ticked (under Test Type)
5. Interpret Results
   1. See next page
   2. What is the test statistic, degrees of freedom, and p-value?
   3. Did you reject or fail to reject the null?
   4. What does it mean for the question?
   5. Are the two groups of students different? If so, how?
1. Find the Test Statistic, DF, and P-value
• U=23.500
• DF=n/a
• P-value=0.007 (Exact)
2. Determine if significant
• P-value < 0.05
• Significant
4. Write result
• Mature students (group 2) spoke significantly more than young students (Mann-Whitney Test, U=23.500, N=23, P-value=0.007 with exact significance).
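The U statistic is built from ranks rather than raw values, which is why no normality assumption is needed. The sketch below shows one common way to compute it; the contribution counts are hypothetical (not the studentsss data), the helper names are illustrative, and the p-value is left to SPSS.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return U, the smaller of the two U values for the two groups."""
    pooled = list(group_a) + list(group_b)
    ranks = average_ranks(pooled)
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# Hypothetical contribution counts (NOT the values in the studentsss file).
young = [2, 0, 5, 1, 3, 4]
mature = [6, 9, 4, 12, 7, 8]
print(f"U = {mann_whitney_u(young, mature)}")
```

A small U means the two groups barely overlap once ranked; U = 0 would mean every value in one group is below every value in the other.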
2. Determine if significant
• P-value < 0.05
• Significant
4. Write result
• Subjects had a tendency to complete more steps under group conditions than under individual conditions (Paired Samples T-test, t=3.503, DF=11, p-value=0.005).
5. Present appropriate plot
• N/A
1. Find the Test Statistic, DF, and P-value
• Z=-2.631
• DF=n/a
• P-value=0.009
2. Determine if significant
• P-value < 0.05
• Significant
4. Write result
• Subjects had a tendency to complete more steps under group conditions than under individual conditions (2-tailed Wilcoxon signed ranks test, Z=-2.631, n=24, p = 0.009).
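As a quick sanity check on a result like this, the two-tailed p-value that goes with a Z statistic can be recovered from the standard normal distribution. This sketch assumes the p-value is the asymptotic (normal approximation) one, not an exact one.

```python
from statistics import NormalDist

z = -2.631  # the Z statistic reported above

# Two-tailed p-value: probability of a value at least this far from 0 in either direction.
p_two_tailed = 2 * NormalDist().cdf(-abs(z))

print(f"Z = {z}, two-tailed p = {p_two_tailed:.4f}")
```

The value lands between 0.008 and 0.009, in line with the p = 0.009 given in the write-up and well below the 0.05 threshold.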
Correlation: Examples
■ Parametric Example: Women Height
– file contains data from a student project on the effect of heat on hip stretches.
– The first column gives the subject’s height, and the second column gives the increase in hip extension after stretching exercises.
– (Other columns relate to the discomfort experienced, and the stretch and discomfort when heat is used; for our purposes those are nuisance variables)
– This is paired data (measurements
1. Open Heathip file in SPSS
2. Explore data and check normality
   1. Determine which is dependent and which is independent
   2. Whether it is normally distributed or not
   3. Plot scatterplot (height on x-axis and stretch (without heat) on y-axis): Graphs->Interactive Scatterplot
3. Define null and alternative hypothesis to question.
4. Run Appropriate test (Try both)
   1. Parametric
      1. Analyze->Correlate->Bivariate
      2. Height and stretch go in Variables
      3. Make sure Pearson is checked under Correlation Coefficients
      4. Also check that Two-Tailed is selected and Flag significant correlations is ticked
   2. Non-Parametric: same thing but check “Spearman” instead of Pearson
2. Determine if significant
• P-value > 0.05
• Not Significant
4. Write result
• There was no significant correlation between Height and Stretch Increase in subjects.
Correlation Notes
■ Looking for correlation is different from looking for increases or decreases
■ Correlation does not necessarily mean a causal relationship. Just because two
values appear to go up and down together does not mean one is causing the other.
■ The Pearson’s coefficient is designed primarily for looking at linear relationships.
Two variables can be related, but if the relationship is not linear, Pearson’s
correlation coefficient is not an appropriate statistic for measuring their association.
■ The number of observations, as with other statistics, affects the significance.
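The note about linear relationships can be illustrated with made-up numbers. In this sketch (all data hypothetical, helper names illustrative), a perfectly U-shaped relationship gives a Pearson coefficient of zero, and a curved but steadily rising relationship gives a Spearman coefficient of 1 while Pearson falls short of 1.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def average_ranks(values):
    """Rank values from 1 upward, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: y depends completely on x, but along a U shape.
u_x = [-3, -2, -1, 0, 1, 2, 3]
u_y = [v ** 2 for v in u_x]

# Hypothetical data: y always rises with x, but along a curve.
curve_x = [1, 2, 3, 4, 5, 6, 7]
curve_y = [v ** 3 for v in curve_x]

print(f"U shape: Pearson r = {pearson(u_x, u_y):.3f}")
print(f"Curve:   Pearson r = {pearson(curve_x, curve_y):.3f}, "
      f"Spearman rho = {spearman(curve_x, curve_y):.3f}")
```

This is one reason to plot a scatterplot before trusting a coefficient: a value near zero says there is no linear association, not that the variables are unrelated.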
P-values a summary
– Question: Does each school in the SHU have a male/female ratio that reflects the overall ratio?
2. What is the test statistic, degrees of freedom, and p-value?
3. Did you reject or fail to reject the null?
4. What does it mean for the question?
1. Find the Test Statistic, DF, and P-value
• Chi-Square=635.561
• DF=8
• P-value<0.0001
2. Determine if significant
• P-value < 0.05
• Significant
4. Write result
• There is a significant difference in the representation of the sexes across the schools (2-tailed chi square test, chi-sq=635.561, df=8, p-value<0.0001).
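For a sense of how the chi-square statistic and its degrees of freedom arise from a table of counts, here is a minimal sketch. The counts are hypothetical (they are not the SHU enrolment figures) and `chi_square` is an illustrative helper. The p-value is left to SPSS.

```python
def chi_square(table):
    """Return (statistic, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical counts: rows are female/male, columns are three schools.
counts = [
    [120, 45, 80],
    [60, 95, 85],
]
statistic, df = chi_square(counts)
print(f"chi-square = {statistic:.3f}, DF = {df}")
```

Since DF is (rows - 1) x (columns - 1), a table of 2 sexes by 9 schools would give DF=8, which would match the value reported above, though the number of schools is not stated here.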
– Question: Is there a difference in scores between the three methods?
5. What does it mean for the question?
1. Find the Test Statistic, DF, and P-value
• F=6.053
• DF=23
• P-value=0.008
2. Determine if significant
• P-value < 0.05
• Significant
4. Write result
• There was a significant difference in teaching methods (1-way ANOVA, F=6.052, DF=23, p-value=0.008). Method 3 had the highest exam scores
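The F statistic in a one-way ANOVA compares the variation between group means with the variation inside the groups. The sketch below uses hypothetical scores (not the teaching-methods data) and an illustrative helper name; it returns the between-groups and within-groups degrees of freedom separately.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of the scores around their own group mean.
    ss_within = sum((s - m) ** 2 for g, m in zip(groups, group_means) for s in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical exam scores for three teaching methods.
method_scores = [
    [55, 60, 62, 58, 65],
    [61, 66, 64, 70, 68],
    [72, 69, 75, 78, 71],
]
f_stat, df_between, df_within = one_way_anova(method_scores)
print(f"F = {f_stat:.3f}, DF = {df_between}, {df_within}")
```

The two degrees of freedom add up to the total (number of scores minus one). The DF=23 reported above looks like that total, which for three groups would correspond to 2 between and 21 within.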
– The data are really three different sets of scores, one set for each group, so when we test them for normality, we need to remember this. If we treat them as one group then any differences between the groups might lead us to think that the data aren’t normally distributed when the data from each group is
– It is the normality of each group that matters
– Question: Is there a difference in scores between the three methods?
   1. Normality: Analyze->Descriptive Statistics->Explore
   2. Put Score in Dependent list box, then click on the Plots button
   3. Click to select Normality plots with tests (if p-value below 0.05 in any of the groups, then go non-parametric)
   4. Pretend that it was the case and try non-parametric (just less power)
3. Define null and alternative hypothesis to question (SAME AS BEFORE)
4. Run Appropriate test
   1. Analyze->Non-Parametric Tests->Legacy Dialogs -> K Independent Samples
   2. Score in Test Variable List
   3. Method in Grouping Variable (define groups to 3 using the Define Range button)
5. Interpret Results
   1. See next page
   2. What is the test statistic, degrees of freedom, and p-value?
   3. Did you reject or fail to reject the null?
   4. If you reject the null, run post hoc and determine the difference.
      1. Go back to One Way ANOVA dialog box
      2. Choose Post Hoc -> Tukey
2. Determine if significant
• P-value < 0.05
• Significant
4. Write result
• There was a significant difference in teaching methods (Kruskal Wallis Test, Chi-Square=8.077, DF=2, p-value=0.018). Method 3 had the highest exam scores.
5. Present appropriate plot
• Boxplot
■ Notice that the nonparametric test still says that there is a significant difference between the groups (p=0.018); however, it isn't quite as well convinced as the more sensitive ANOVA. This is a good illustration of the minor penalty that you pay for the more rugged nonparametric tests: they are less likely to catch a small effect that does exist, i.e. they are less powerful.
■ Run Post-Hoc test like before (Tukey)
■ So to recap: generally scores would be better treated by nonparametric methods. In this example we did find them to be normally distributed and used them as an example in applying a one way ANOVA and its nonparametric equivalent, the Kruskal-Wallis test. Finally, the two tests agreed but we noticed a slight difference in how certain they were.
Repeated Measures ANOVA
■ 1 categorical variable + 1 numerical variable
– Categorical variable is paired and group number is greater than two
■ This is when you have a single numerical variable you are interested in and want to know if it is different between multiple groups that are paired, usually the same subject
– Is there a difference in grade point average between students their Freshman, Sophomore, Junior, or Senior year?
– Is there a difference in soil moisture retention 1, 2, 3, 4, 5, or 6 years post treatment?
– Is there a difference in heartrate after 1, 2, and 3 cups of coffee?
■ Repeated Measures ANOVA is an extension of the paired t-test, just as 1-way ANOVA is an extension of the independent samples t-test
1. Explore your data
   1. Histogram of numerical variable
   2. Boxplot of numerical variable grouped by the categorical variable
2. Check normality
   1. Histogram, QQ-plot, Test for Normality
   2. Do these by group
3. Set up hypothesis
   1. Null: there is no difference between groups
   2. Alternative: there is a difference
4. Select and run appropriate test
   1. Parametric: Repeated Measures ANOVA
   2. Non-parametric: Friedman Test
5. Interpret results
   1. Null rejected or failed to reject?
   2. Post Hoc test
   3. What does it mean for your question
   4. Write it out
an experiment where subjects jumped three times.
– Each subject jumped three times and the height was recorded. The column labelled Jump1 has each subject’s first jump in it, the column labelled Jump2 has each subject’s second jump in it and so on.
– Question: Is there a difference in energy between the three jumps?
   2. Put the three variables containing the energies (jump 1-3) in Dependent Box
   3. Click to select Normality plots with tests (if p-value below 0.05 in any of the groups, then go non-parametric)
3. Define null and alternative hypothesis to question
   1. Fill out yourself
4. Run Appropriate test
   1. Analyze->General Linear Model->Repeated Measures
   2. The Repeated Measures Define Factors dialog should appear -> put 3 in number of levels, as there were three jumps, then click the Add button
   3. Click Define, highlight the three jump variables and send them into the box with the question marks in and click OK
5. Interpret Results
   1. See next page
   2. What is the test statistic, degrees of freedom, and p-value?
   3. Did you reject or fail to reject the null?
   4. If you reject the null, run post hoc and determine the difference.
      1. Analyze->Compare Means->Paired-Samples T Test (no more than three levels)
2. Determine if significant
• P-value < 0.05
• Significant
4. Write result
• There was a significant difference in energy used between the three jumps (Repeated Measures ANOVA, Sphericity Assumed, F=7.233, DF=2, p-value=0.002).
conditions; no crutches, elbow crutches and axillary crutches
– the energy used was measured (indirectly by looking at the oxygen used)
used between the three methods?
   2. Put the three variables containing the energies in Dependent Box
   3. Click to select Normality plots with tests (if p-value below 0.05 in any of the groups, then go non-parametric)
3. Define null and alternative hypothesis to question
   1. Fill out yourself
   2. What is the test statistic, degrees of freedom, and p-value?
   3. Did you reject or fail to reject the null?
   4. If you reject the null, run post hoc and determine the difference.
      1. NONE FOR NOW
2. Determine if significant
• P-value > 0.05
• Not Significant
4. Write result
• There was no significant difference in energy used between the three walking measures.
Mixed Designs
Reliability: Examples
■ Imagine that a student wants to find out if a certain exercise can improve performance.
■ To measure performance he decides to use a simple measured jump. However, to be sure that he can sensibly repeat the measures after the exercise regime has been completed, he wants to estimate the reliability of his measurement method.
■ To get round the problems of (XYZ), use Intraclass Correlation Coefficient
■ The coefficient will tell us how much agreement the two measurements have
■ Can also use Cronbach’s Alpha, another measure of reliability (Note that a reliability coefficient of .70 or higher is considered "acceptable" in most Social Science research situations using Cronbach's Alpha)
■ Alpha also works for more than 2 measures
1. Open ICC and Cronbachs alphs file in SPSS
2. Calculate the ICC
   1. Analyze->Scale->Reliability Analysis
   2. Put Jump 1 and 2 into the Items box and click Statistics
   3. Tick the Intraclass Correlation Coefficient
3. Calculate the Alpha
   1. Done at the same time by the analysis
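For readers curious what Cronbach's Alpha is doing, a minimal sketch follows. It uses the standard formula, alpha = k/(k-1) x (1 - sum of item variances / variance of the subject totals), on hypothetical jump measurements (not the values in the SPSS file); `cronbach_alpha` is an illustrative helper name.

```python
import statistics


def cronbach_alpha(items):
    """items: one list of scores per measure, all in the same subject order."""
    k = len(items)
    sum_of_item_variances = sum(statistics.variance(scores) for scores in items)
    # Total score for each subject across all measures.
    totals = [sum(subject_scores) for subject_scores in zip(*items)]
    total_variance = statistics.variance(totals)
    return k / (k - 1) * (1 - sum_of_item_variances / total_variance)


# Hypothetical jump measurements for six subjects, measured twice.
jump1 = [41, 45, 38, 50, 47, 43]
jump2 = [42, 44, 39, 51, 46, 44]

alpha = cronbach_alpha([jump1, jump2])
print(f"Cronbach's alpha = {alpha:.3f}")
```

Because the function takes a list of measures, the same call works with three or more columns, which is the sense in which Alpha extends beyond two measures.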
■ The Intraclass Correlation Coefficient (ICC) in this case is 0.962. We use the single measures because the figures we fed SPSS were raw measurements, not an average of several attempts. This value, 0.962, shows a considerable amount of agreement!
1. Find the Test Statistic, DF, and P-value
• F=55.606
• DF1=5
• DF2=5
• P-value<0.0001
2. Determine if significant
• P-value < 0.05
• Significant
3. State if null rejected or not
• Reject the Null
4. Write result
• There was a significant correlation between jump measurements (ICC=0.962, F=55.606, DF=5,5, p-value<0.0001).
2. Determine if significant
• P-value < 0.05
• Significant
4. Write result
• There was moderate agreement between raters (Kappa=0.473, N=85, p-value<0.0001).
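Kappa measures how much two raters agree beyond what chance alone would produce: kappa = (observed agreement - chance agreement) / (1 - chance agreement). The sketch below applies that formula to a hypothetical table of counts (not the data behind the result above); `cohens_kappa` is an illustrative helper name.

```python
def cohens_kappa(table):
    """table[i][j] = number of cases rater A put in category i and rater B in category j."""
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Proportion of cases where the two raters chose the same category.
    observed = sum(table[i][i] for i in range(len(table))) / total
    # Agreement expected by chance from each rater's category proportions.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical ratings: two raters, two categories, 50 cases.
ratings = [
    [20, 5],
    [10, 15],
]
kappa = cohens_kappa(ratings)
print(f"kappa = {kappa:.3f}")
```

In this made-up table the raters agree on 70% of cases, but half of that agreement would be expected by chance, so kappa comes out at 0.4.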