
QUALITIES OF

EVALUATION
INSTRUMENTS
Mrs. Shiji Thomas
Professor
Caritas College of Nursing
Characteristics/Qualities of evaluation procedures

Essential qualities:
• Validity
• Reliability
• Objectivity
• Usability

Other qualities:
• Relevance
• Equilibrium
• Discrimination
1. Validity
• The extent to which the test really measures what it is intended to measure
• It refers to the appropriateness of the interpretations made from test scores and other evaluation results, with regard to a particular use
• Validity is always concerned with the specific use of the results and the soundness of our proposed interpretations
• Validity is relative and always specific to a particular test
• To be valid, the measuring instrument should be reliable and relevant
• High reliability is a necessary, but not sufficient, condition for high validity
• Validity of a test is the relevance of the test to its objective
• Validity pertains to the results of a test and not to the instrument itself
• Validity is always specific to some particular use; it is not a general quality of a test
Nature of validity
• Validity refers to the appropriateness of the interpretation of the results of a test or evaluation instrument for a given group of individuals, and not to the instrument itself
• Validity is a matter of degree (high validity, moderate validity and low validity); it does not exist on an all-or-none basis
• Validity is always specific to some particular use or interpretation
• Validity is a unitary concept
Approaches to test validation

Content-related evidence
Procedure: Compare the test tasks to the test specifications describing the task domain under consideration
Meaning: How well the sample of test tasks represents the domain of tasks to be measured

Criterion-related evidence
Procedure: Compare test scores with another measure of performance obtained at a later date (for prediction) or with another measure of performance obtained concurrently (for estimating the present status)
Meaning: How well test performance predicts future performance or estimates current performance on some valued measure other than the test itself (called a criterion)

Construct-related evidence
Procedure: Establish the meaning of the scores on the test by controlling the development of the test, evaluating the relationships of the scores with other relevant measures, and experimentally determining what factors influence test performance
Meaning: How well the test performance can be interpreted as a meaningful measure of some characteristic or quality
Content-related evidence
• Content validation is a process of determining the extent to which a set of test tasks provides a relevant and representative sample of the domain of tasks about which interpretations of test scores are made
Content validation in the testing of classroom achievement

Classroom instruction
Determines which intended learning outcomes (objectives) are to be achieved by pupils
↓
Achievement domain
Specifies and delimits a set of instructionally relevant learning tasks to be measured by a test
↓
Achievement test
Provides a set of relevant test items designed to measure a representative sample of the tasks in the achievement domain
Content validation and test development
• Identifying the learning outcomes to be measured
• Preparing a test plan that specifies the sample of items to be used
• Constructing a test that closely fits the set of test specifications
Table of specifications
• The content of a course or curriculum may be broadly defined to include both subject matter content and instructional objectives
• The former is concerned with the topics to be learned and the latter with the types of performance pupils are expected to demonstrate (e.g., knows, comprehends, applies)
Table of specifications showing the relative emphasis in percent to be given to the content areas and instructional objectives

Content area    Knows concepts    Comprehends concepts    Applies concepts    Total
Plants                8                    4                      4             16
Animals              10                    5                      5             20
Weather              12                    8                      8             28
Earth                12                    4                      2             18
Sky                   8                    4                      6             18
TOTAL                50                   25                     25            100
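Once a test length is chosen, the percentage weights in such a table convert directly into item counts per cell. A minimal Python sketch, assuming a 50-item test (the weights are the percentages from the table above; the test length is an assumption for illustration):

# Minimal sketch: turning the percentage weights of a table of
# specifications into item counts for an assumed 50-item test.

weights = {  # (content area, objective) -> percent of emphasis
    ("Plants", "Knows"): 8, ("Plants", "Comprehends"): 4, ("Plants", "Applies"): 4,
    ("Animals", "Knows"): 10, ("Animals", "Comprehends"): 5, ("Animals", "Applies"): 5,
    ("Weather", "Knows"): 12, ("Weather", "Comprehends"): 8, ("Weather", "Applies"): 8,
    ("Earth", "Knows"): 12, ("Earth", "Comprehends"): 4, ("Earth", "Applies"): 2,
    ("Sky", "Knows"): 8, ("Sky", "Comprehends"): 4, ("Sky", "Applies"): 6,
}
assert sum(weights.values()) == 100  # the emphases must cover the whole test

test_length = 50
items = {cell: round(pct * test_length / 100) for cell, pct in weights.items()}
print(items[("Weather", "Knows")])  # 12% of 50 items -> 6 items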
Criterion-related evidence
• Defined as the process of determining the extent to which test performance is related to some other valued measure of performance
• The second measure of performance (the criterion) may be obtained at some future date (when interested in predicting future performance) or concurrently (when interested in estimating present performance)
Predictive validation study
September 17: Scholastic aptitude scores (test performance)
December 10: Achievement test scores (criterion performance)

Concurrent validation study
September 17: Scholastic aptitude scores (test performance)
September 17: Achievement test scores (criterion performance)
• The key element in both types of criterion-related study is the degree of relationship between the two sets of measures: 1. the test scores and 2. the criterion to be predicted
• The relationship is expressed by means of a correlation coefficient or an expectancy table
• A correlation coefficient (r) indicates the degree of relationship between two sets of measures
• 1.00 = perfect positive correlation
• .00 = no relationship
• -1.00 = perfect negative correlation
• When a correlation coefficient is used to express the degree of relationship between a set of test scores and some criterion measure, it is called a validity coefficient
• Validity coefficients must be judged on a relative basis, the larger coefficients being favored
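To illustrate, a validity coefficient is simply a Pearson r computed between the test scores (predictor) and the criterion scores. A minimal Python sketch; the two score lists are invented for illustration:

# Minimal sketch: computing a validity coefficient as a Pearson r
# between test scores (predictor) and criterion scores.

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

aptitude = [55, 62, 70, 48, 66, 59, 74, 51]     # test performance (predictor)
achievement = [60, 65, 72, 50, 70, 58, 78, 54]  # criterion performance

print(f"validity coefficient r = {pearson_r(aptitude, achievement):.2f}")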
• An expectancy table is a simple and practical means of expressing criterion-related evidence of validity
• It is a twofold chart with the test scores (the predictor) arranged in categories down the left side of the table and the measure to be predicted (the criterion) arranged in categories across the top of the table
• For each category of scores on the predictor, the table indicates the percentage of individuals who fall within each category of the criterion
Expectancy table showing the relation between scholastic aptitude scores and course grades for 30 students in science

Grouped scholastic aptitude    Percentage in each score category receiving each grade
scores (stanines)                 E     D     C     B     A
Above average (7, 8, 9)           -     -    14    43    43
Average (4, 5, 6)                 -    19    37    25    19
Below average (1, 2, 3)          57    29    14     -     -
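The percentages in such a table can be tallied directly from paired records of each student's predictor category and criterion grade. A minimal Python sketch; the records below are invented and are not the 30 students above:

# Minimal sketch: building an expectancy table from paired
# (predictor category, criterion grade) records.
from collections import Counter, defaultdict

records = [  # (stanine band, course grade) for each student; invented data
    ("above average", "B"), ("above average", "A"), ("average", "C"),
    ("average", "B"), ("average", "D"), ("below average", "E"),
    ("below average", "D"), ("above average", "A"), ("average", "C"),
]

by_band = defaultdict(Counter)
for band, grade in records:
    by_band[band][grade] += 1

grades = ["E", "D", "C", "B", "A"]
for band, counts in by_band.items():
    total = sum(counts.values())
    # percentage of this predictor band receiving each grade
    row = {g: round(100 * counts[g] / total) for g in grades if counts[g]}
    print(band, row)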
Construct-related evidence/validity
• The construct-related category of evidence focuses on test performance as a basis for inferring the possession of certain psychological characteristics
• A construct is a psychological quality that we assume exists in order to explain some aspect of behavior
• E.g., mathematical reasoning, intelligence, creativity, honesty, anxiety, etc.
• Construct validation may be defined as a process of determining the extent to which test performance can be interpreted in terms of one or more psychological constructs
Process of construct validation
• Identifying and describing, by means of a theoretical framework, the
meaning of the construct to be measured
• Deriving hypotheses regarding test performance from the theory
underlying the construct and
• Verifying the hypotheses by logical and empirical means
Factors influencing validity
• Unclear directions
• Reading vocabulary and sentence structure too difficult
• Inappropriate level of difficulty of the test items
• Poorly constructed test items
• Ambiguity
• Test items inappropriate for the outcomes being measured
• Inadequate time limits
• Test too short
• Improper arrangement of items
• Identifiable patterns of answers
2. RELIABILITY
• Reliability refers to the consistency of measurement, that is, how consistent test scores or other evaluation results are from one measurement to another
• Reliability of test scores is typically reported by means of a reliability coefficient or the standard error of measurement

Reliability coefficient
• A correlation coefficient that indicates the degree of relationship between two sets of measures obtained from the same procedure
• We may administer the same test twice to a group, with a time interval in between (test-retest method); administer two equivalent forms of the test in close succession (equivalent-forms method); administer two equivalent forms of the test with a time interval in between (test-retest with equivalent forms method); or administer the test once and compute the consistency of responses within the test (internal-consistency method)
Methods of estimating reliability
• Test-retest method: the stability of test scores over a given period of time
• Equivalent-forms method: the consistency of test scores over different forms of the test (that is, different samples of items)
• Test-retest with equivalent forms: the consistency of test scores over both a time interval and different forms of the test
• Internal-consistency method: the consistency of test scores over different parts of the test
Test-retest method
• Requires administering the same form of the test to the same group after some time interval
• The length of the time interval should fit the type of interpretation to be made from the results
• Test-retest reliability coefficients are influenced both by errors within the measurement procedure and by the day-to-day stability of the students' responses
• Longer time periods between testings will result in lower reliability coefficients, due to greater changes in the students
• The report would read: "the stability of test scores obtained with the same form over a three-month period was .90"
Equivalent-forms method
• Two equivalent forms of a test (also called alternate forms or parallel forms) are administered to the same group during the same testing session
• The test forms are equivalent in the sense that they are constructed independently but built to measure the same abilities
• A high reliability coefficient indicates the adequacy of the test sample
• A high reliability coefficient would indicate that the two forms are measuring the same thing
Test-retest method with equivalent forms
• Combination of the previous two methods
• Two different forms of the same test are administered with time intervening
• This is the most demanding estimate of reliability, since it takes into account all possible sources of variation
• The reliability coefficient reflects errors within the testing procedure, consistency over different samples of items, and the day-to-day stability of the students' responses
Internal-consistency methods
• Require only a single administration of a test
• The split-half method involves scoring the odd items and the even items separately and correlating the two sets of scores
• This correlation coefficient indicates the degree to which the two arbitrarily selected halves of the test provide the same results
• The reliability coefficient for the total test is determined by applying the Spearman-Brown prophecy formula
Spearman-Brown prophecy formula
• Reliability of total test = (2 × reliability of ½ test) / (1 + reliability of ½ test)
• E.g., if we obtained a correlation coefficient of .60 for the two halves of a test, the reliability of the total test is computed as
• (2 × .60) / (1 + .60) = .75
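A minimal Python sketch of the split-half procedure with the Spearman-Brown correction; the 0/1 item responses are invented for illustration:

# Minimal sketch: split-half reliability with the Spearman-Brown
# correction. item_scores holds one row of 0/1 item responses per student.
from statistics import correlation  # Pearson r (Python 3.10+)

def split_half_reliability(item_scores):
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = correlation(odd_totals, even_totals)          # half-test correlation
    return (2 * r_half) / (1 + r_half)  # Spearman-Brown prophecy formula

item_scores = [  # invented data: 5 students, 8 items
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 0, 1],
    [0, 0, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 1, 1, 1],
]
print(f"split-half reliability = {split_half_reliability(item_scores):.2f}")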
Kuder-Richardson formula
• Used to estimate the reliability of test scores from a single administration of the test
• It requires three types of information:
1. The number of items in the test
2. The mean and
3. The standard deviation

Kuder-Richardson formula contd..
• Reliability estimate (KR-21) = 1 − M(K − M) / (K × s²)
• K = number of items in the test
• M = the mean of the test scores and s = SD of the test scores
• The reliability coefficients for classroom tests typically range between .60 and .80
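A minimal Python sketch of the KR-21 estimate exactly as given above; the total scores and the 50-item test length are invented for illustration:

# Minimal sketch of KR-21 = 1 - M(K - M) / (K * s^2).

def kr21(scores, k):
    """KR-21 reliability estimate; k = number of items in the test."""
    n = len(scores)
    m = sum(scores) / n                           # M: mean of the test scores
    var = sum((x - m) ** 2 for x in scores) / n   # s^2: variance of the scores
    return 1 - (m * (k - m)) / (k * var)

scores = [30, 45, 25, 48, 38, 20, 44, 34]  # invented total scores, 50-item test
print(f"KR-21 = {kr21(scores, k=50):.2f}")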
Factors that lower the reliability of test scores
• The test is based on too few items
• The range of scores is too limited
• Testing conditions are inadequate
• Scoring is subjective
Standard error of measurement
• An especially useful way of expressing test reliability, because it indicates the amount of error to allow for when interpreting individual test scores
• SEM = s × √(1 − r)
• s = standard deviation of the test scores
• r = reliability coefficient
• E.g., SEM = 4.5 × √(1 − .61) = 2.8, approximately 3
• The SEM shows how many points we must add to, and subtract from, an individual's test score in order to obtain "reasonable limits" for estimating that individual's true score
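A minimal Python sketch computing the SEM and the resulting "reasonable limits" band, using the figures from the example above (s = 4.5, r = .61); the observed score is invented for illustration:

# Minimal sketch: SEM = s * sqrt(1 - r), and the band of
# "reasonable limits" around one observed score.

def sem(s, r):
    """Standard error of measurement from SD and reliability coefficient."""
    return s * (1 - r) ** 0.5

s, r = 4.5, 0.61
e = sem(s, r)       # about 2.8, roughly 3 points
observed = 35       # invented observed score
print(f"SEM = {e:.1f}")
print(f"reasonable limits: {observed - e:.1f} to {observed + e:.1f}")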
3. OBJECTIVITY
• It means that an individual's score is the same, or essentially the same, regardless of who is doing the scoring
• Objectivity has two aspects: objectivity of the test and objectivity of scoring
• Objectivity of the test is defined as the extent to which a student's score is based on his actual answer or performance on the test and not on the opinion of the examiner
• Usually achieved by having precise questions and a predetermined scoring scheme
4. COMPREHENSIVENESS
• The test should have adequate sample of major lesson objectives to
provide a valid measure of student achievement
5. DISCRIMINATION
• The test should be constructed in such a manner that it will detect or measure small differences in achievement
• Essential if the test is used to rank students on the basis of individual achievement or to assign grades
• Can be determined by item analysis
6. USABILITY
• In selecting tests and other evaluation instruments, practical considerations should not be neglected
• Consider the expertise of teachers in measurement, the time available, the cost of testing, ease of interpretation, etc.
7. RELEVANCE
• The test should contain only relevant tasks
8. EQUILIBRIUM
• A balanced assessment sets targets in all domains of learning and all domains of intelligence
Miscellaneous qualities
• Fairness
• Administrability
• Scorability
• Practicality and efficiency