Al2 Report

The document discusses the concepts of validity and reliability in test construction, detailing various types of validity (content, criterion-related, and construct) and methods to establish reliability (test-retest, equivalent forms, split-half, and Kuder-Richardson formulas). It emphasizes the importance of validity coefficients and reliability coefficients in assessing test quality, alongside factors affecting both validity and reliability. The document also includes examples and calculations to illustrate how to determine the validity and reliability of tests.


ESTABLISHING
VALIDITY AND
RELIABILITY OF
TESTS
Table of contents
 Validity of a Test
 Types of Validity
 Important Things to Remember about Validity
 Factors Affecting the Validity of a Test Item
 Reasons that Reduce the Validity of a Test Item
 Guide Questions to Improve Validity
 Validity Coefficient
 Reliability of a Test
 Factors Affecting the Reliability of a Test
 Four Methods of Establishing the Reliability of a Test
 Reliability Coefficient
 Description of the Reliability Coefficient
 Interpreting the Reliability Coefficient
Learning Outcomes
At the end of this chapter, the students should
be able to:
1. define the following terms: validity,
reliability, content validity, construct validity,
criterion-related validity, predictive validity,
concurrent validity, test-retest method,
equivalent/parallel method, split-half method,
Kuder-Richardson formula, validity coefficient,
reliability coefficient;
2. discuss the different approaches to validity;
3. present and discuss the different methods
of solving for the reliability of a test;
4. identify the different factors affecting the
validity of the test;
5. identify the factors affecting the reliability
of the test;
6. compute the validity coefficient and
reliability coefficient; and
7. interpret the reliability coefficient and
validity coefficient of the test.
INTRODUCTION
Test constructors believe that every
assessment tool should possess good qualities.
Most of the literature considers validity and
reliability the most common technical concepts
in assessment. Any type of assessment,
whether traditional or authentic, should be
carefully developed so that it serves its
intended purpose. In this chapter, we shall
discuss the different ways of establishing
validity and reliability.
VALIDITY OF A
TEST
Validity (Airasian, 2000) is concerned with
whether the information obtained from
an assessment permits the teacher to
make a correct decision about a
student's learning.
TYPES OF
VALIDITY
1. CONTENT
VALIDITY
A type of validation that
refers to the relationship
between a test and the
instructional objectives; it
establishes that the test
measures what it is
supposed to measure.
Things to remember about
validity:
a. The evidence of the content validity of
a test is found in the Table of
Specification.
b. This is the most important type of
validity for a classroom teacher.
c. There is no coefficient for content
validity. It is determined by experts
judgmentally, not empirically.
2. CRITERION-RELATED
VALIDITY
A type of validation that refers to the
extent to which scores from a test
relate to theoretically similar measures.
It is a measure of how accurately a
student's current test score can be
used to estimate a score on a criterion
measure.
a. Concurrent validity. The criterion
and the predictor data are collected at
the same time. This type of validity is
appropriate for tests designed to assess
a student's current criterion status or
when you want to diagnose a student's
status; it is a good diagnostic screening
test.
b. Predictive validity. A type of
validation that refers to a measure of
the extent to which a student's current
test result can be used to estimate
accurately the outcome of the student's
performance at a later time. It is
appropriate for tests designed to assess
a student's future status on a criterion.
Predictive validity is very important in psychological testing, for
example, when psychologists want to predict responses, behaviors,
outcomes, performances, and the like.
3. CONSTRUCT VALIDITY
A type of validation that refers
to the measure of the extent to
which a test measures
theoretical, unobservable
qualities such as intelligence,
math achievement, and
performance.
a. Convergent validity is a type of
construct validation wherein a test has a
high correlation with another test that
measures the same construct.
b. Divergent validity is a type of
construct validation wherein a test has a
low correlation with a test that
measures a different construct. In this
case, high validity occurs only when
there is a low correlation coefficient
between the tests that measure
different traits.
c. Factor analysis is another method
of assessing the construct validity of a
test using complex statistical
procedures.

There are other ways of assessing construct validity,
such as a test's internal consistency, developmental
change, and experimental intervention.
IMPORTANT
THINGS TO
REMEMBER
ABOUT
VALIDITY
1. Validity refers to the decisions we make, and
not to the test itself or to the measurement.
2. Like reliability, validity is not an all-or-nothing
concept; it is never totally absent or absolutely
perfect.
3. A validity estimate, called a validity
coefficient, refers to a specific type of validity.
It ranges from 0 to 1.
4. Validity can never be finally determined; it is
specific to each administration of the test.
FACTORS
AFFECTING THE
VALIDITY OF A
TEST ITEM
1. The test itself.
2. The administration and scoring of a
test.
3. Personal factors influencing how
students respond to the test.
4. Validity is always specific to a
particular group.
REASONS THAT
REDUCE THE VALIDITY
OF A TEST ITEM
1. Poorly constructed test items
2. Unclear directions
3. Ambiguous test items
4. Too difficult vocabulary
5. Complicated syntax
6. Inadequate time limit
7. Inappropriate level of difficulty
8. Unintended clues
9. Improper arrangement of test items
GUIDE
QUESTIONS TO
IMPROVE
VALIDITY
1. What is the purpose of the test?
2. How well do the instructional objectives
selected for the test represent the instructional
goals?
3.Which test item format will best measure the
achievement of each objective?
4. How many test items will be required to
measure the performance on each objective
adequately?
VALIDITY
COEFFICIENT
The validity coefficient is the computed
value of r_xy. In theory, the validity
coefficient, like the correlation coefficient,
ranges from 0 to 1. In practice, most validity
coefficients are small, usually ranging from
0.3 to 0.5; few exceed 0.6 to 0.7. Hence,
there is a lot of room for improvement in
most of our psychological measurements.
Another way of interpreting the findings
is to consider the squared correlation
coefficient, (r_xy)^2, which is called the
coefficient of determination. The
coefficient of determination indicates
how much of the variation in the
criterion can be accounted for by the
predictor (the teacher's test).
For example, if the computed value of
r_xy = 0.75, the coefficient of
determination is (0.75)^2 = 0.5625, so
56.25% of the variance in student
performance can be attributed to the
test, and 43.75% cannot be attributed
to the test results.
Example:
Teacher Benjamin James develops a
45-item test and wants to determine
if his test is valid. He takes another
test that is already acknowledged for
its validity and uses it as the
criterion. He administered the two
tests to his 15 students. The
following table shows the results of
the two tests. Is the test developed
by Mr. Benjamin James valid? Find the
validity coefficient using the Pearson
r and the coefficient of
determination.
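As an illustration, the computation can be sketched in Python. The scores below are hypothetical placeholders, not the actual scores of the 15 students in the example's table:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient r_xy."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical teacher-test and criterion scores for illustration only
teacher_test = [30, 35, 28, 40, 25, 38, 32, 41, 27, 36]
criterion    = [62, 70, 58, 80, 52, 75, 64, 83, 55, 72]

r_xy = pearson_r(teacher_test, criterion)
r_squared = r_xy ** 2  # coefficient of determination
print(f"validity coefficient = {r_xy:.2f}, r^2 = {r_squared:.2%}")
```

The same `pearson_r` function applies to any predictor/criterion pair of equal-length score lists.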
INTERPRETATION:
The correlation coefficient is 0.94, which means that the
validity of the test is high; 88.36% of the variance in the
students' performance can be attributed to the test.
RELIABILITY OF A
TEST
Reliability refers to the consistency with
which a test yields the same rank for individuals
who take the test more than once (Kubiszyn
and Borich, 2007).

The reliability of a test can be determined
by means of the Pearson product-moment
correlation coefficient, the Spearman-Brown
formula, and the Kuder-Richardson formulas.
FACTORS AFFECTING
RELIABILITY OF A TEST
1. Length of the test
2. Moderate item difficulty
3. Objective scoring
4. Heterogeneity of the student
group
5. Limited time
FOUR METHODS
OF ESTABLISHING
RELIABILITY OF A
TEST
1. TEST-RETEST METHOD
A type of reliability determined by administering the
same test twice to the same group of students, with a
time interval between the two administrations. The two
sets of test scores are correlated using the Pearson
product-moment correlation coefficient (r), and this
correlation coefficient provides a measure of stability. It
indicates how stable the test results are over a period of
time. The formula is
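The formula referred to here did not survive extraction; the standard raw-score form of the Pearson product-moment correlation, reproduced from common statistics references, is:

```latex
r = \frac{n\sum XY - \left(\sum X\right)\left(\sum Y\right)}
         {\sqrt{\left[n\sum X^{2} - \left(\sum X\right)^{2}\right]
                \left[n\sum Y^{2} - \left(\sum Y\right)^{2}\right]}}
```

where X is the set of first-administration scores, Y the second-administration scores, and n the number of students.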
2. EQUIVALENT FORM
A type of reliability determined by administering two
different but equivalent forms of the test (also called
parallel or alternate forms) to the same group of students
in close succession. The equivalent forms are
constructed from the same set of specifications, similar
in content, type of items, and difficulty. The results of
the two tests are correlated using the Pearson
product-moment correlation coefficient (r), and this
correlation coefficient provides a measure of the degree
to which generalization about the performance of
students from one assessment to another assessment is
justified. It measures the equivalence of the tests.
3. SPLIT-HALF
METHOD
Administer the test once and score two equivalent halves
of the test. To split the test into halves that are
equivalent, the usual procedure is to score the
even-numbered and the odd-numbered test items
separately. This provides two scores for each student.
The two sets of scores are correlated using the
Spearman-Brown formula, and this correlation coefficient
provides a measure of internal consistency. It indicates
the degree to which consistent results are obtained from
the two halves of the test. The formula is
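The formula referred to here did not survive extraction; the standard Spearman-Brown step-up formula, reproduced from common measurement references, is:

```latex
r_{t} = \frac{2\,r_{h}}{1 + r_{h}}
```

where r_h is the correlation between the two half-test scores and r_t is the estimated reliability of the whole test.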
4. KUDER-RICHARDSON
FORMULA
Administer the test once. Score the total test and apply the Kuder-
Richardson formula. The Kuder-Richardson 20 (KR-20) formula is applicable
only in situations where students' responses are scored
dichotomously, and therefore it is most useful with traditional test
items that are scored as right or wrong, true or false, or yes or no.
The KR-20 formula estimates the degree to which the items in the test
measure the same characteristic, under the assumption that all items
are of equal difficulty. (It is a statistical procedure used to estimate
coefficient alpha; a correlation coefficient is given.) Another formula for
estimating the internal consistency of a test is the KR-21 formula, a
simpler approximation that requires only the number of items, the test
mean, and the variance of the total scores.
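The two formulas, reproduced here from standard measurement references, are:

```latex
KR_{20} = \frac{k}{k-1}\left(1 - \frac{\sum pq}{\sigma^{2}}\right),
\qquad
KR_{21} = \frac{k}{k-1}\left(1 - \frac{\bar{X}\,(k - \bar{X})}{k\,\sigma^{2}}\right)
```

where k is the number of items, p the proportion of students answering an item correctly, q = 1 - p, X-bar the mean of the total scores, and sigma-squared the variance of the total scores.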
RELIABILITY
COEFFICIENT
Reliability coefficient is a
measure of the amount of
error associated with the
test scores.
DESCRIPTION OF RELIABILITY
COEFFICIENT
a. The range of the reliability coefficient is
from 0 to 1.0.
b. An acceptable value is 0.60 or higher.
c. The higher the value of the reliability
coefficient, the more reliable the overall test
scores.
d. Higher reliability indicates that the test
items measure the same thing.
1. PEARSON PRODUCT MOMENT
CORRELATION COEFFICIENT (rxy)
2. SPEARMAN-BROWN
FORMULA
3. KR-20 AND KR-21
FORMULAS
INTERPRETING
RELIABILITY
COEFFICIENT
1. Group variability affects the size of
the reliability coefficient. Higher coefficients
result from heterogeneous groups than from
homogeneous groups. As group variability
increases, reliability goes up.

2. Scoring reliability limits test score
reliability. If tests are scored unreliably, error
is introduced, which limits the reliability of
the test scores.

3. Test length affects test score
reliability. As test length increases, the
test's reliability tends to go up.

4. Item difficulty affects test score
reliability. As test items become very
easy or very hard, the test's reliability
goes down.
EXAMPLE 1
Prof. Henry Joel administered a test to
his 10 students in Elementary
Statistics class twice, with a one-day
interval. The test given after one day
is exactly the same test given the
first time. The scores below were
gathered in the first test (FT) and
second test (ST). Using the test-retest
method, is the test reliable? Show the
complete solution.
Analysis:
The reliability coefficient using the Pearson
r = 0.91 means that the test has very high
reliability. The scores of the 10 students,
tested twice with a one-day interval, are
consistent. Hence, the test has very high
reliability.

Note: Compute the reliability coefficient of
the same data using the Spearman rho formula.
Is the test reliable?
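The Spearman rho computation suggested in the note can be sketched in Python. The FT/ST scores below are hypothetical placeholders, not Prof. Joel's data:

```python
def ranks(values):
    """Ranks (1 = lowest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman rho via the difference-of-ranks formula (exact when there are no ties)."""
    n = len(x)
    d_sq = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_sq / (n * (n * n - 1))

# Hypothetical first-test (FT) and second-test (ST) scores for illustration
ft = [12, 20, 19, 17, 25, 22, 15, 16, 23, 21]
st = [14, 21, 20, 18, 24, 23, 15, 17, 25, 22]
print(f"rho = {spearman_rho(ft, st):.2f}")  # → rho = 0.99
```

With ties present, the difference-of-ranks formula is only approximate; correlating the average ranks with Pearson r gives the exact value.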
EXAMPLE 2
Prof. Vinci Glenn administered a test to his
10 students in his Biology class twice, with
a one-week interval. The test given after
one week is a parallel form of the test
given the first time. The scores below were
gathered in the first test (FT) and the
second, parallel test (PT). Using the
equivalent or parallel form method, is the
test reliable? Show the complete solution
using the Pearson r formula.
Student FT PT
1 12 20
2 20 22
3 19 23
4 17 30
5 25 25
6 22 20
7 15 19
8 16 18
9 23 25
10 21 24
Analysis:
The reliability coefficient using the Pearson
r = 0.76 means that the test has high reliability.
The scores of the 10 students, tested twice
with a one-week interval, are consistent.
Hence, the test has high reliability.

Note: Compute the reliability coefficient of
the same data using the Spearman rho formula.
Is the test reliable?
EXAMPLE 3

Prof. Glenn Lord administered a test to his
10 students in his Chemistry class. The test
was given only once. The scores of the
students on the odd items (O) and even
items (E) are given below. Using the
split-half method, is the test reliable?
Show the complete solution.
ANALYSIS

The reliability coefficient using the
Spearman-Brown formula is 0.50, which
indicates questionable reliability. Hence,
the test items should be revised.
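The split-half procedure can be sketched in Python: correlate the odd-item and even-item scores, then step the half-test correlation up with the Spearman-Brown formula. The odd/even scores below are hypothetical placeholders, not Prof. Lord's data:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

def split_half_reliability(odd_scores, even_scores):
    """Correlate the two halves, then apply the Spearman-Brown
    step-up formula to estimate full-test reliability."""
    r_half = pearson_r(odd_scores, even_scores)
    return 2 * r_half / (1 + r_half)

# Hypothetical odd-item (O) and even-item (E) scores for illustration
odd  = [10, 12, 9, 15, 11, 8, 14, 13, 10, 12]
even = [11, 10, 9, 14, 12, 9, 13, 14, 9, 11]
print(f"split-half reliability = {split_half_reliability(odd, even):.2f}")
```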
EXAMPLE 4

Ms. Gauat administered a 40-item test
in English for her Grade VI pupils in
Malanao Elementary School. Below
are the scores of 15 pupils. Find the
reliability using the Kuder-Richardson
formula.
ANALYSIS

The reliability coefficient using the KR-21
formula is 0.90, which means that the test
has very good reliability. This means the
test is very good for a classroom test.
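The KR-21 computation needs only the number of items and the mean and variance of the total scores, so it can be sketched compactly. The scores below are hypothetical placeholders, not Ms. Gauat's 15 pupils:

```python
def kr21(k, scores):
    """KR-21 reliability from the number of items k, the mean, and
    the (population) variance of the total scores. Assumes all
    items are of roughly equal difficulty."""
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((s - mean) ** 2 for s in scores) / n
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * variance))

# Hypothetical total scores on a 40-item test for illustration
scores = [35, 30, 25, 38, 20, 32, 28, 36, 22, 34]
print(f"KR-21 = {kr21(40, scores):.2f}")  # → KR-21 = 0.80
```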
Steps in Solving the Reliability
Coefficient Using KR-20
1. Solve the difficulty index of each item (p).
2. Solve the value of q in each item.
3. Find the product of the p and q columns.
4. Find the summation of pq.
5. Solve the variance of the scores.
6. Solve the reliability coefficient using the KR-20
formula.

The first thing to do is solve the difficulty
index of each item and the variance of the total
scores.
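The six steps can be sketched as a Python function. The response matrix below is a tiny hypothetical example, not the data from any of the worked examples:

```python
def kr20(item_matrix):
    """KR-20 from dichotomous (1 = right, 0 = wrong) item scores.
    item_matrix[s][i] is student s's score on item i."""
    n = len(item_matrix)         # number of students
    k = len(item_matrix[0])      # number of items
    # Steps 1-2: difficulty index p and q = 1 - p for each item
    p = [sum(row[i] for row in item_matrix) / n for i in range(k)]
    q = [1 - pi for pi in p]
    # Steps 3-4: products p*q and their summation
    sum_pq = sum(pi * qi for pi, qi in zip(p, q))
    # Step 5: variance of the total scores
    totals = [sum(row) for row in item_matrix]
    mean = sum(totals) / n
    variance = sum((t - mean) ** 2 for t in totals) / n
    # Step 6: the KR-20 formula
    return (k / (k - 1)) * (1 - sum_pq / variance)

# Tiny hypothetical 4-student, 4-item response matrix for illustration
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
print(f"KR-20 = {kr20(responses):.2f}")  # → KR-20 = 0.67
```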
EXAMPLE 5

Mr. Mark Anthony administered a 20-item
true or false test for his English IV class.
Below are the scores of 40 students. Find the
reliability coefficient using the KR-20 formula
and interpret the computed value. Solve also
for the coefficient of determination.
Interpretation:
The reliability coefficient using the KR-20 =
0.98 means that the test has very high or
excellent reliability.

Coefficient of determination = (0.98)^2
= 0.9604
= 96.04%

Interpretation:
96.04% of the variance in the students'
performance can be attributed to the test.
THANK
YOU
