Qualities of An Evaluation Tool
EVALUATION
INSTRUMENTS
Mrs. Shiji Thomas
Professor
Caritas College of Nursing
Characteristics/Qualities of evaluation procedures

Essential qualities
• Validity
• Reliability
• Objectivity
• Usability

Other qualities
• Relevance
• Equilibrium
• Discrimination
1. Validity
• The extent to which the test really measures what it is intended to measure
• It refers to the appropriateness of the interpretations made from test scores and other evaluation results, with regard to a particular use
• Validity is always concerned with the specific use of the results and the soundness of our
proposed interpretations
• Validity is relative and always specific for a particular test
• To be valid, the measuring instrument should be reliable and relevant
• Reliability is a necessary condition for validity: an unreliable test cannot yield valid results, though a reliable test is not automatically valid
• Validity of a test is the relevance of a test to its objective
• Validity pertains to the results of a test and not to the instrument itself
• Validity is always specific to some particular use; it is not a general quality of a test
Nature of validity
• Validity refers to the appropriateness of the interpretation of the results of
a test or evaluation instrument for a given group of individuals, and not to
the instrument itself
• Validity is a matter of degree(high validity, moderate validity & low
validity); it does not exist on an all-or-none basis
• Validity is always specific to some particular use or interpretation
• Validity is a unitary concept
Approaches to test validation

Content related evidence
Procedure: Compare the test tasks to the test specifications describing the task domain under consideration
Meaning: How well the sample of test tasks represents the domain of tasks to be measured

Criterion related evidence
Procedure: Compare test scores with another measure of performance obtained at a later date (for prediction) or with another measure of performance obtained concurrently (for estimating present status)
Meaning: How well test performance predicts future performance or estimates current performance on some valued measure other than the test itself (called a criterion)

Construct related evidence
Procedure: Establish the meaning of the scores on the test by controlling the development of the test, evaluating the relationships of the scores with other relevant measures, and experimentally determining what factors influence test performance
Meaning: How well test performance can be interpreted as a meaningful measure of some characteristic or quality
Content related evidence
• Content validation is a process of determining the extent to which a set of test tasks provides a relevant and representative sample of the domain of tasks about which interpretations of test scores are made
Content validation in the testing of classroom
achievement
Classroom instruction
Determines which intended learning outcomes (objectives) are to be achieved by pupils
Achievement domain
Specifies and delimits a set of instructionally relevant learning tasks to be
measured by a test
Achievement test
Provides a set of relevant test items designed to measure a representative
sample of the tasks in the achievement domain
Content validation and test development
• Identifying the learning outcomes to be measured
• Preparing a test plan that specifies the sample of items to be used
• Constructing a test that closely fits the set of test specifications
Table of specifications
• The content of a course or curriculum may be broadly defined to include
both subject matter content and instructional objectives
• The former is concerned with the topics to be learned and the latter with the types of performance pupils are expected to demonstrate (e.g. knows, comprehends, applies)
Table of specifications showing the relative emphasis (in percent) to be given to each content area and instructional objective

Content area   Knows concepts   Comprehends concepts   Applies concepts   Total
Plants               8                  4                     4             16
Animals             10                  5                     5             20
Weather             12                  8                     8             28
Earth               12                  4                     2             18
Sky                  8                  4                     6             18
TOTAL               50                 25                    25            100
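As an arithmetic check, the percentages in a table of specifications must add up: each row to its content-area total, each column to the weight given to that objective, and the whole table to 100. A minimal Python sketch, using the figures from the sample table above (the dictionary layout is just one possible representation):

```python
# Table of specifications: percent of test items per content area,
# split across the objectives (knows, comprehends, applies).
table = {
    #  area      (knows, comprehends, applies)
    "Plants":  (8, 4, 4),
    "Animals": (10, 5, 5),
    "Weather": (12, 8, 8),
    "Earth":   (12, 4, 2),
    "Sky":     (8, 4, 6),
}

row_totals = {area: sum(cells) for area, cells in table.items()}
column_totals = [sum(col) for col in zip(*table.values())]
grand_total = sum(row_totals.values())

print(row_totals["Weather"])  # 28: weather gets the heaviest emphasis
print(column_totals)          # [50, 25, 25]: half the items test "knows"
print(grand_total)            # 100: the percentages cover the whole test
```

If the grand total is not 100, the relative emphasis has been mis-allocated and the test plan should be revised before items are written.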
Criterion related evidence
• Defined as the process of determining the extent to which test performance is related to some other valued measure of performance
• The second measure of performance (the criterion) may be obtained at some future date (when we are interested in predicting future performance) or concurrently (when we are interested in estimating present performance)
Predictive validation study
September 17: Scholastic aptitude scores (test performance) → December 10: Achievement test scores (criterion performance)

Concurrent validation study
September 17: Scholastic aptitude scores (test performance) → September 17: Achievement test scores (criterion performance)
• The key element in both types of criterion related study is the degree of relationship between the two sets of measures: 1. the test scores and 2. the criterion to be predicted
• The relationship is expressed by means of a correlation coefficient or an expectancy table
• A correlation coefficient (r) indicates the degree of relationship between two sets of measures
• 1.00 = perfect positive correlation
• .00 = no relationship
• -1.00= perfect negative relationship
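The coefficient described above can be computed directly from its definition. A minimal Python sketch; the aptitude and achievement score lists are made-up illustration data, not from the slides:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two paired score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical aptitude (predictor) and achievement (criterion) scores
aptitude    = [50, 55, 60, 65, 70]
achievement = [52, 58, 61, 68, 71]
print(round(pearson_r(aptitude, achievement), 2))  # 0.99
```

When the two lists are the test scores and a criterion measure, the value returned is the validity coefficient; values near 1.00 indicate strong positive prediction, 0.00 no relationship, and -1.00 a perfect inverse relationship.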
• When a correlation coefficient is used to express the degree of relationship between a set of test scores and some criterion measure, it is called a validity coefficient
• Validity coefficients must be judged on a relative basis, the larger
coefficients being favored
• An expectancy table is a simple and practical means of expressing criterion related evidence of validity
• It is a twofold chart with the test scores (the predictor) arranged in categories down the left side of the table and the measure to be predicted (the criterion) arranged in categories across the top of the table
• For each category of scores on the predictor, the table indicates the percentage of individuals who fall within each category of the criterion
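The construction just described is mechanical enough to sketch in code: tally pupils by predictor band and criterion grade, then convert each band's counts to percentages. A hypothetical Python helper (the function name and the sample score/grade pairs are illustrative, not from the slides):

```python
from collections import defaultdict

def expectancy_table(pairs, score_bands, grade_order):
    """pairs: (predictor_band, criterion_grade) for each pupil.
    Returns, for each band, the percentage of pupils receiving each grade."""
    counts = defaultdict(lambda: defaultdict(int))
    for band, grade in pairs:
        counts[band][grade] += 1
    table = {}
    for band in score_bands:
        n = sum(counts[band].values())
        table[band] = {g: round(100 * counts[band][g] / n) if n else 0
                       for g in grade_order}
    return table

# Hypothetical data: (aptitude band, course grade) for eight pupils
pairs = [("above", "A"), ("above", "B"), ("above", "A"),
         ("average", "C"), ("average", "B"), ("average", "C"),
         ("below", "D"), ("below", "E")]
table = expectancy_table(pairs, ["above", "average", "below"],
                         ["E", "D", "C", "B", "A"])
print(table["above"]["A"])  # 67: two of the three above-average pupils got an A
```

Each row of the resulting table sums to (approximately) 100, which is the property the sample table below also exhibits.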
Expectancy table showing the relation between scholastic aptitude scores and course grades for 30 students in science

Grouped scholastic          Percentage in each score category receiving each grade
aptitude scores (stanines)     E      D      C      B      A
Above average (7,8,9)                        14     43     43
Average (4,5,6)                       19     37     25     19
Below average (1,2,3)         57      29     14
Construct related evidence/validity
• The construct related category of evidence focuses on test performance as a
basis for inferring the possession of certain psychological characteristics
• A construct is a psychological quality that we assume exists in order to
explain some aspect of behavior
• E.g. mathematical reasoning, intelligence, creativity, honesty, anxiety, etc.
• Construct validation may be defined as a process of determining the extent to which test performance can be interpreted in terms of one or more psychological constructs
Process of construct validation
• Identifying and describing, by means of a theoretical framework, the
meaning of the construct to be measured
• Deriving hypotheses regarding test performance from the theory
underlying the construct and
• Verifying the hypotheses by logical and empirical means
Factors influencing validity
• Unclear directions
• Reading vocabulary and sentence structure too difficult
• Inappropriate level of difficulty of the test items
• Poorly constructed test items
• Ambiguity
• Test items inappropriate for the outcomes being measured
• Inadequate time limits
• Test too short
• Improper arrangement of items
• Identifiable patterns of answers
2. RELIABILITY
Reliability refers to the consistency of measurement, that is, how consistent test scores or other evaluation results are from one measurement to another
• Reliability of test scores is typically reported by means of a reliability coefficient or the standard error of measurement
Reliability coefficient