Basic Principles of Language Testing and Assessment
Reliability
Validity
Practicality
Washback
Usefulness
Transparency
Security
RELIABILITY
Reliability
/rɪˌlaɪ.əˈbɪl.ə.ti/
the quality of being able to
be trusted or believed because
of working or behaving well
The term reliability is used to
refer to the consistency of
test scores.
According to Brown (2010), a reliable test:
- Is consistent
- Gives clear direction for
scoring/evaluation
- Has uniform rubrics for scoring
- Contains items/tasks that are
unambiguous to the test-takers.
Reliability
- The degree or extent to which an
assessment tool produces stable and
consistent results.
- Consistency, stability, dependability and
accuracy of the test results. (McMillan,
2001)
Test-Retest Reliability
- The same test is re-administered to the
same people.
- The correlation between the scores from the
two administrations is expected to be high.
- Practice and memory effects may
influence the correlation value.
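The correlation mentioned above can be sketched in a few lines of Python. This is a minimal illustration, assuming the Pearson correlation coefficient is the statistic used (the slides do not name one); the function name `pearson_r` and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-variation of the two sittings around their own means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same five test-takers on two administrations
first_sitting = [55, 62, 70, 48, 81]
second_sitting = [58, 60, 74, 50, 85]

test_retest_r = pearson_r(first_sitting, second_sitting)
print(round(test_retest_r, 2))
```

A value close to 1 suggests the two administrations rank test-takers in much the same way. It does not rule out practice or memory effects, which can raise every score on the second sitting without lowering the correlation.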
Inter-Rater Reliability
- Two or more judges or raters are involved
in grading.
- A score is a more reliable and accurate
measure if two or more raters agree on it
or assign similar results.
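One common way to put a number on rater agreement is Cohen's kappa, which corrects simple percent agreement for the agreement two raters would reach by chance. The slides do not prescribe this statistic, so the sketch below is an assumption; the function name `cohen_kappa` and the band labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assign one category per script."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must grade the same non-empty set of scripts")
    n = len(rater_a)
    # Proportion of scripts on which the two raters gave the same category
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Agreement expected by chance, from each rater's category frequencies
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical proficiency bands awarded by two raters to six speaking samples
bands_rater_a = ["B1", "B1", "A2", "B2", "B1", "A2"]
bands_rater_b = ["B1", "A2", "A2", "B2", "B1", "A2"]

kappa = cohen_kappa(bands_rater_a, bands_rater_b)
print(round(kappa, 2))
```

Kappa treats the bands as unordered categories, so a one-band disagreement counts the same as a three-band disagreement. For ordered scales a weighted variant may be more appropriate.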
Intra-Rater Reliability
- The consistency of grading by a single
rater.
- When a rater grades tests at different
times, he/she may become inconsistent in
grading for various reasons.
Test Administration Reliability
- This involves the conditions in which the
test is administered.
- Unreliability may occur due to outside
interference such as noise, variations in
photocopying, and differences in light and
sound in different parts of the room.
Factors affecting test reliability
Test factor
Teacher and student factor
Environment factor
Test administration factor
Marking factor
1. Test factor
- Longer tests produce higher reliability
- Because short tests depend more on
coincidence and guessing, the scores will
be more accurate if the test is
longer.
- An objective test has higher consistency
compared to a subjective test.
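The link between test length and reliability stated above is often estimated with the Spearman-Brown prophecy formula. The slides do not mention this formula, so the sketch below is an assumption; it also assumes the added items are comparable in quality to the existing ones. The function name `spearman_brown` and the example values are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by `length_factor`.

    reliability   -- reliability of the current test, between 0 and 1
    length_factor -- new length divided by current length (2 = twice as long)
    """
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical case: a 20-item quiz with reliability 0.60 is doubled to 40 items
predicted = spearman_brown(0.60, 2)
print(round(predicted, 2))  # 0.75
```

The gain shrinks as the test grows: each further doubling adds less reliability than the one before it, which is one reason length has to be weighed against practicality.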
2. Teacher and student factor
- In most cases, teachers construct and
administer the tests their students take.
- The teacher-student relationship can affect
the consistency of test results.
- A teacher’s encouragement, a positive mental and
physical condition, and familiarity with the test
format can lead to higher consistency.
3. Environment Factor
- The examination environment
influences test-takers and their scores.
- A favorable environment will improve the
reliability of the test.
4. Test administration factor
- Students’ performance depends on
the way tests are administered (instructions,
time allowance, or careful monitoring of
tests).
5. Marking factor
- Human judges/raters have many
opportunities to introduce error in
scoring.
- Different raters may award different marks
for the same answer
VALIDITY
The term VALIDITY is used to
refer to whether the test is
actually measuring what it
claims to measure (Arshad,
2004)
Test scores reflect the achievement of
learning outcomes and the test-taker’s
ability.
Reliability            Validity
Inter-rater            Construct
Intra-rater            Face
Environment factors    Content
Consistency            Curriculum
Test results           Outcomes
PRACTICALITY
“The logistical, down-to-earth administrative issues
involved in designing, administering, and scoring.”
• …a speaking test that requires an individual 10-minute one-to-one talk for each of
50 test-takers, with only one scorer;
• …a test that takes students a few minutes to complete and several hours for the
examiner to prepare and/or correct;
• …a test which can be scored only by computer in a location without easy access to
computers or an internet connection.
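The first example above becomes concrete with some quick arithmetic. The sketch below assumes interviews run back to back with no breaks or changeover time; the function name `total_interview_time` and the `examiners` parameter are hypothetical additions for illustration.

```python
def total_interview_time(test_takers, minutes_each, examiners=1):
    """Return (hours, minutes) needed by the busiest examiner.

    Assumes interviews are shared as evenly as possible and run back to back.
    """
    if examiners < 1:
        raise ValueError("at least one examiner is required")
    # Ceiling division: an uneven split leaves one examiner with an extra interview
    interviews_per_examiner = -(-test_takers // examiners)
    total_minutes = interviews_per_examiner * minutes_each
    return divmod(total_minutes, 60)


# The scenario from the slide: 50 test-takers, 10 minutes each, one scorer
hours, minutes = total_interview_time(50, 10)
print(f"{hours} h {minutes} min")  # 8 h 20 min
```

More than eight hours of continuous interviewing by one person is hard to schedule and likely to strain intra-rater consistency. Under the same assumptions, adding examiners shortens the time each one needs, for example `total_interview_time(50, 10, examiners=5)` gives 1 h 40 min.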
AUTHENTICITY
“The degree of correspondence of the
characteristics of a given language test task to the
features of a target language task” (Bachman &
Palmer, 1996)
WASHBACK
On learners
• Provide a qualification
• Provide motivation
• Serve as a revision tool
• Provide feedback
On teachers
• Identify struggling learners in a class
• Diagnose common learner errors to modify instruction
On teaching institutions and schools
• Increase accountability of schools
• Identify weaknesses of a syllabus
• Encourage a balanced curriculum
POSSIBLE NEGATIVE WASHBACK
Purpose: The test is intended to be used as a speaking test in the national examination
for high school students. Throughout high school, students use the course book issued by the
Ministry of Education and Training.
Test administration: In each test location, there are about 500 Grade 12 students
whose expected level of proficiency is B1. About 30 examiners are invited to be
raters, and they are given a one-day training course on the assessment scale.
Two months prior to the test day, information about the test and its format is available on
the website of the Ministry of Education and Training. Information is also circulated to
high schools throughout the country.