Psychological Assessment - Reliability & Validity

Reliability refers to the consistency of test scores. Early studies by Spearman found reliability could be estimated using test-retest, parallel forms, and split-half methods. Sources of measurement error include random error from factors like time sampling and systematic error. Reliability is calculated using formulas like Pearson's correlation coefficient, Spearman-Brown, and KR20 for dichotomous items. High reliability is important but must be balanced with other considerations.

Uploaded by Aly

Psychological Testing & Assessment
Chapter 4

RELIABILITY
Things to be discussed:
• Define what reliability is
• Discuss Spearman’s early studies
• Describe the conceptualization of error
• Know the different sources of error
• Explain the different methods of estimating reliability
• Know how to calculate the reliability of a test
• Discuss reliability in behavioral observation studies
• Answer the question “how reliable is reliable?”
• Discuss what to do about low reliability
What is reliability?

Reliability
It is the degree to which a measurement produces consistent results.

Reliability coefficient
A proportion that indicates the ratio between the true score variance and the observed score variance.
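The ratio in this definition can be illustrated with a few lines of Python. This is a minimal sketch, not part of the original slides: it assumes the classical test theory view that observed-score variance is the sum of true-score variance and error variance, and the function name and the variance values are hypothetical.

```python
def reliability_coefficient(true_variance: float, error_variance: float) -> float:
    """Ratio of true-score variance to observed-score variance."""
    # Assumption: error is uncorrelated with true scores, so the
    # observed variance is simply the sum of the two components.
    observed_variance = true_variance + error_variance
    return true_variance / observed_variance


# Hypothetical variances, chosen only for illustration.
print(reliability_coefficient(true_variance=8.0, error_variance=2.0))  # 0.8
```

With no error variance the ratio is 1; as error variance grows, the coefficient moves toward 0.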
Spearman’s Early Studies

Charles Spearman
(1863 – 1945)
Conceptualization of Error

Measurement error
refers to all the factors associated with the process of measuring some variable, other than the variable being measured.

Random error
An error caused by unpredictable fluctuations of other variables in the measurement process.

Systematic error
An error in measuring a variable that is typically constant or proportionate to what is presumed to be the true value of the variable being measured.
Sources of Error
Time Sampling
Item Sampling
Test Administration

Time Sampling
Carryover effect
• performance in one condition is affected by the condition that precedes it
Practice effect
• improvement due to repeated practice

Item Sampling
Variation among items within a test as well as variation among items between tests.

Test Administration
• Room temperature, level of lighting, ventilation, surrounding noise, etc.
• Test-taker variables
• Examiner-related variables
Methods of Estimating Reliability

Test-Retest Method

Parallel Forms Method

Split-Half Method
Test-Retest Method
an estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test.
Parallel Forms Method
an estimate of reliability that compares two equivalent forms of a test that measure the same attribute.
Split-Half Method
an estimate of reliability obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once (e.g., using the odd-even system).
Calculating the reliability of a test

Correlation coefficient
• a mathematical index that describes the direction and magnitude of a relationship.
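Pearson product-moment correlation
(used with the Test-Retest Method and the Parallel Forms Method)

$$ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} $$

Where:
n = number of pairs of scores
∑xy = sum of the products of paired scores
∑x = sum of x scores (scores on the 1st administration)
∑y = sum of y scores (scores on the 2nd administration)
∑x² = sum of squared x scores
∑y² = sum of squared y scores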
X (1st administration)   Y (2nd administration)   x²      y²      xy
43                       38                       1,849   1,444   1,634
32                       41                       1,024   1,681   1,312
41                       44                       1,681   1,936   1,804
46                       45                       2,116   2,025   2,070
45                       40                       2,025   1,600   1,800
32                       36                       1,024   1,296   1,152
48                       48                       2,304   2,304   2,304
36                       34                       1,296   1,156   1,224
40                       39                       1,600   1,521   1,560
50                       49                       2,500   2,401   2,450
∑x = 413   ∑y = 414   ∑x² = 17,419   ∑y² = 17,364   ∑xy = 17,310

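n = 10
∑x = 413   ∑y = 414   ∑x² = 17,419   ∑y² = 17,364   ∑xy = 17,310

$$ r = \frac{10(17{,}310) - (413)(414)}{\sqrt{\left[10(17{,}419) - (413)^2\right]\left[10(17{,}364) - (414)^2\right]}} $$

$$ r = \frac{173{,}100 - 170{,}982}{\sqrt{(174{,}190 - 170{,}569)(173{,}640 - 171{,}396)}} $$

$$ r = \frac{2{,}118}{\sqrt{(3{,}621)(2{,}244)}} = \frac{2{,}118}{\sqrt{8{,}125{,}524}} $$

$$ r \approx \frac{2{,}118}{2{,}850.53} \approx 0.74 $$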
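The test-retest calculation can be checked with a short script. The following is a minimal Python sketch that applies the raw-score Pearson formula to the ten pairs of scores in the table; the function and variable names are illustrative and not from the original slides.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation computed from raw-score sums."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Scores from the table: 1st and 2nd administration of the same test.
first_administration = [43, 32, 41, 46, 45, 32, 48, 36, 40, 50]
second_administration = [38, 41, 44, 45, 40, 36, 48, 34, 39, 49]

r = pearson_r(first_administration, second_administration)
print(round(r, 2))  # 0.74
```

The same function applies to the parallel forms method: pass the scores on Form A and Form B instead of two administrations of one form.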
Spearman-Brown Formula

Split-Half Method

$$ r_{SB} = \frac{2\,r_{hh}}{1 + r_{hh}} $$

Where:
rSB = reliability of the entire test
rhh = correlation between two halves
Student   Total Score   Odd (x)   Even (y)
1         42            20        22
2         33            15        18
3         44            21        23
4         45            24        21
5         30            16        14
6         26            11        15
7         45            24        21
8         35            18        17
9         40            23        17
10        32            16        16
11        34            15        19
12        45            22        23

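n = 12
∑x = 225   ∑y = 226   ∑x² = 4,413   ∑y² = 4,364   ∑xy = 4,334

Step 1: correlate the two halves (Pearson product-moment correlation).

$$ r_{hh} = \frac{12(4{,}334) - (225)(226)}{\sqrt{\left[12(4{,}413) - (225)^2\right]\left[12(4{,}364) - (226)^2\right]}} $$

$$ r_{hh} = \frac{52{,}008 - 50{,}850}{\sqrt{(52{,}956 - 50{,}625)(52{,}368 - 51{,}076)}} $$

$$ r_{hh} = \frac{1{,}158}{\sqrt{(2{,}331)(1{,}292)}} \approx \frac{1{,}158}{1{,}735.41} \approx 0.67 $$

Step 2: correct the half-test correlation to full test length.

$$ r_{SB} = \frac{2\,r_{hh}}{1 + r_{hh}} = \frac{2(0.67)}{1 + 0.67} = \frac{1.34}{1.67} \approx 0.80 $$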
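The split-half procedure can be sketched in Python as two steps: correlate the odd and even half scores, then apply the Spearman-Brown correction. This is an illustrative sketch using the twelve students in the table; the function names are assumptions, not from the original slides.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation computed from raw-score sums."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def spearman_brown(r_hh):
    """Estimate full-test reliability from the correlation between two halves."""
    return 2 * r_hh / (1 + r_hh)


# Half-test scores from the table (odd-numbered and even-numbered items).
odd = [20, 15, 21, 24, 16, 11, 24, 18, 23, 16, 15, 22]
even = [22, 18, 23, 21, 14, 15, 21, 17, 17, 16, 19, 23]

r_hh = pearson_r(odd, even)
r_sb = spearman_brown(r_hh)
print(round(r_hh, 2), round(r_sb, 2))  # 0.67 0.8
```

The corrected value is higher than the half-test correlation because each half contains only half as many items as the full test.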
 
KR20 Formula
Split-Half Method

$$ KR_{20} = \frac{N}{N-1}\left(\frac{S^2 - \sum pq}{S^2}\right) $$

Where:
KR20 = the reliability estimate (r)
N = the number of items on a test
S² = the variance of the total score
p = the proportion of the people getting each item correct
q = the proportion of the people getting each item incorrect
∑pq = the sum of the products of p and q for each item on a test
Polychotomous
has more than 2 possible outcomes
(Spearman-Brown Formula, Coefficient Alpha)

Dichotomous
has only 2 possible outcomes (KR20 Formula)
Math Problems (1 = correct, 0 = incorrect)
Student   1. 5+3   2. 7+2   3. 9+1   4. 6+3   5. 8+6   6. 7+5   7. 4+7   8. 9+2   9. 8+4   10. 5+6   Total Score
1         1        1        1        1        1        1        1        1        1        1         10
2         1        0        0        1        0        0        1        1        0        1         5
3         1        0        1        0        0        1        1        1        1        0         6
4         1        0        1        1        1        0        0        1        0        0         5
5         0        0        0        0        0        1        1        0        1        1         4
6         0        1        1        1        1        1        1        1        1        1         9
7         0        1        1        1        1        1        1        1        1        1         9
8         0        0        1        1        0        1        1        0        1        0         5
9         0        1        1        1        1        1        1        1        1        1         9
10        0        0        1        1        0        1        0        1        1        1         6
11        0        0        1        1        0        0        0        0        0        1         3
12        1        1        0        0        0        1        0        0        1        1         5
13        1        1        1        1        1        1        1        1        1        1         10
14        0        1        1        1        0        0        0        0        1        0         4
15        0        1        1        1        1        1        1        1        1        1         9
# of 1s   6        8        12       12       7        11       10       10       12       11
p         0.40     0.53     0.80     0.80     0.47     0.73     0.67     0.67     0.80     0.73
# of 0s   9        7        3        3        8        4        5        5        3        4
q         0.60     0.47     0.20     0.20     0.53     0.27     0.33     0.33     0.20     0.27
p × q     0.24     0.25     0.16     0.16     0.25     0.20     0.22     0.22     0.16     0.20

∑pq = 2.05
Sum of total scores = 99
X̄ = 99 / 15 = 6.6
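Calculate the variance of the total scores using Excel:
Total Score: 10, 5, 6, 5, 4, 9, 9, 5, 9, 6, 3, 5, 10, 4, 9
S² = 5.57

$$ KR_{20} = \frac{N}{N-1}\left(\frac{S^2 - \sum pq}{S^2}\right) $$

$$ KR_{20} = \frac{10}{10-1}\left(\frac{5.57 - 2.05}{5.57}\right) $$

$$ KR_{20} = 1.11\left(\frac{3.52}{5.57}\right) $$

$$ KR_{20} \approx (1.11)(0.632) \approx 0.70 $$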
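The KR20 computation can be reproduced from the item-response matrix. The sketch below is illustrative Python, not part of the original slides; it assumes the population variance (dividing by the number of test takers), since that is the choice that reproduces S² = 5.57 for these total scores.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for rows of 0/1 item scores (one row per test taker)."""
    n_people = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    # Assumption: population variance, which matches S^2 = 5.57 on the slide.
    total_variance = pvariance(totals)
    sum_pq = 0.0
    for item in range(n_items):
        p = sum(row[item] for row in responses) / n_people  # proportion correct
        sum_pq += p * (1 - p)                               # q = 1 - p
    return (n_items / (n_items - 1)) * ((total_variance - sum_pq) / total_variance)


# Responses of the 15 students to the 10 math problems (1 = correct).
responses = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 1, 0, 1],
    [1, 0, 1, 0, 0, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 0, 1, 1],
    [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 0, 1, 0, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 1, 0, 0, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
]

print(round(kr20(responses), 2))  # 0.7
```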
Coefficient Alpha
Split-Half Method

$$ r = \alpha = \frac{N}{N-1}\left(\frac{S^2 - \sum S_i^2}{S^2}\right) $$

Where:
r = the reliability estimate
N = the number of items on a test
S² = the variance of the total score
S²i = the variance of the individual items
Students   Item 1   Item 2   Item 3   Total
1          6        6        8        20
2          5        5        6        16
3          9        8        6        23
4          3        2        4        9
5          2        3        2        7
6          1        1        2        4
7          5        4        6        15
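Variance: S²i (Item 1) = 7.29   S²i (Item 2) = 5.81   S²i (Item 3) = 5.14   S² (Total) = 48.95

∑S²i = 7.29 + 5.81 + 5.14 = 18.24
S² = 48.95

$$ r = \frac{3}{3-1}\left(\frac{48.95 - 18.24}{48.95}\right) $$

$$ r = 1.5\left(\frac{30.71}{48.95}\right) $$

$$ r \approx 1.5(0.627) \approx 0.94 $$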
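Coefficient alpha for the three-item example can be computed directly from the score table. This is a minimal Python sketch, not from the original slides; it assumes the sample variance (dividing by n − 1), since that reproduces the item variances 7.29, 5.81, and 5.14 and the total variance 48.95.

```python
from statistics import variance


def coefficient_alpha(scores):
    """Coefficient alpha for rows of item scores (one row per test taker)."""
    n_items = len(scores[0])
    # Assumption: sample variance (n - 1), matching the values on the slide.
    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    total_variance = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (
        (total_variance - sum(item_variances)) / total_variance
    )


# Item scores of the 7 students from the table (Item 1, Item 2, Item 3).
scores = [
    [6, 6, 8],
    [5, 5, 6],
    [9, 8, 6],
    [3, 2, 4],
    [2, 3, 2],
    [1, 1, 2],
    [5, 4, 6],
]

print(round(coefficient_alpha(scores), 2))  # 0.94
```

Alpha comes out the same whether sample or population variance is used, provided the same choice is applied to both the items and the total score.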
RELIABILITY IN BEHAVIORAL OBSERVATION STUDIES

Source of error: Observer differences

Inter-scorer reliability
the degree of agreement or consistency between two or more observers regarding a particular measure

Kappa statistics
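The slides name kappa statistics without giving a formula. As one common instance, Cohen's kappa for two observers compares observed agreement with the agreement expected by chance. The Python sketch below is an assumption about which kappa is meant, and the observer ratings are hypothetical data made up for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(rater_a)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    # Agreement expected by chance from each rater's category proportions.
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    return (observed - expected) / (1 - expected)


# Hypothetical codes from two observers (1 = behavior present, 0 = absent).
observer_1 = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
observer_2 = [1, 0, 0, 1, 0, 1, 1, 1, 0, 1]

print(round(cohens_kappa(observer_1, observer_2), 2))  # 0.58
```

Here the observers agree on 8 of 10 cases (0.80), but because 0.52 agreement would be expected by chance, kappa is noticeably lower than the raw agreement rate.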
How reliable is reliable?

• Reliability estimates in the range of .70 to .80 are good enough for most purposes in basic research.
• In clinical settings, a test with a reliability of .90 might not be good enough; evaluators should attempt to find a test with a reliability greater than .95.
What to do about low reliability?

Increase the number of items

Factor analysis
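The effect of increasing the number of items can be estimated with the general Spearman-Brown prophecy formula, r_new = k·r / (1 + (k − 1)·r), where k is the factor by which the test is lengthened. The Python sketch below is illustrative and not from the original slides; the starting reliability of .70 is a hypothetical value, and the formula assumes the added items are comparable to the existing ones.

```python
def predicted_reliability(reliability, length_factor):
    """Spearman-Brown prophecy: reliability after lengthening a test.

    length_factor is the new number of items divided by the old number.
    Assumes the added items are comparable to the existing ones.
    """
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Hypothetical test with reliability .70, doubled in length.
print(round(predicted_reliability(0.70, 2), 2))  # 0.82
```

With a length factor of 2, this reduces to the split-half correction 2r / (1 + r).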
Chapter 5

VALIDITY
Things to be discussed:
• Define what validity is
• Discuss the booklet published by a joint committee
• Explain the 3 types of evidence
• Discuss the aspects of validity
• Discuss validity coefficient
• Discuss the relationship between reliability and validity
What is validity?

Validity
the agreement between a test score or measure and the quality it is believed to measure
Standards for Educational and Psychological Testing
A booklet published by a joint committee of the American Educational Research Association (AERA), the American Psychological Association (APA), and the National Council on Measurement in Education (NCME)
3 TYPES OF EVIDENCE
Content-Related Evidence for Validity
Criterion-Related Evidence for Validity
Construct-Related Evidence for Validity
Aspects of validity

Content-Related Evidence for Validity
• considers the adequacy of representation of the conceptual domain the test is designed to cover

Two concepts:

Construct underrepresentation
• describes the failure to capture important components of a construct

Construct-irrelevant variance
• occurs when scores are influenced by factors irrelevant to the construct
Aspects of validity

Criterion-Related Evidence for Validity
• tells us just how well a test corresponds with a particular criterion

Predictive validity
• is an index of the degree to which a test score predicts some criterion measure

Concurrent validity
• is an index of the degree to which a test score is related to some criterion measure obtained at the same time (concurrently)

Validity coefficient
• is a statistical index used to report evidence of validity for intended interpretations of test scores
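In practice a validity coefficient is usually reported as the correlation between test scores and a criterion measure. The Python sketch below illustrates that reading; the test scores, criterion ratings, and function name are hypothetical and not from the original slides.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation computed from deviations about the means."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    return sum_xy / sqrt(sum(a * a for a in dev_x) * sum(b * b for b in dev_y))


# Hypothetical data: selection-test scores and later performance ratings.
test_scores = [1, 2, 3, 4, 5]
criterion_ratings = [2, 1, 4, 3, 5]

validity_coefficient = pearson_r(test_scores, criterion_ratings)
print(round(validity_coefficient, 2))  # 0.8
```

If the criterion is collected later than the test, the coefficient is evidence of predictive validity; if both are collected at the same time, it is evidence of concurrent validity.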
Aspects of validity

Construct-Related Evidence for Validity
• demonstrates whether a test measures its intended construct

Donald Campbell & John Fiske

Convergent evidence
• It is obtained when a measure correlates well with other tests that are believed to measure the same construct

Discriminant evidence
• It is obtained when a measure does not correlate well with tests that measure other constructs
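Convergent and discriminant evidence can both be examined by correlating a new measure with other measures. The sketch below is a hypothetical Python illustration, not from the original slides: all three score lists are made-up data, with one established measure assumed to tap the same construct and one assumed to tap a different construct.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation computed from deviations about the means."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    return sum_xy / sqrt(sum(a * a for a in dev_x) * sum(b * b for b in dev_y))


# Hypothetical scores for five people on three measures.
new_scale = [1, 2, 3, 4, 5]
same_construct_scale = [2, 1, 4, 3, 5]   # assumed to measure the same construct
other_construct_scale = [3, 5, 1, 4, 2]  # assumed to measure a different construct

convergent = pearson_r(new_scale, same_construct_scale)
discriminant = pearson_r(new_scale, other_construct_scale)
print(round(convergent, 2), round(discriminant, 2))  # 0.8 -0.3
```

A pattern of this kind, a strong correlation with the same-construct measure and a weak one with the other-construct measure, is what convergent and discriminant evidence look like together.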
Relationship between
Reliability and Validity
• Reliability and validity are related concepts.
• Although different, they work together
fin
