Basic Principles of Language Testing and Assessment

The document outlines the basic principles of testing and assessment, including reliability, validity, practicality, authenticity, washback, transparency, and security. It emphasizes the importance of consistency in test scores, the need for tests to measure what they claim to measure, and the logistical considerations involved in test administration. Additionally, it discusses the impact of tests on teaching and learning, highlighting both positive and negative effects.

Basic principles of testing and assessment

- Reliability
- Validity
- Practicality
- Washback
- Usefulness
- Transparency
- Security
RELIABILITY
Reliability /rɪˌlaɪ.əˈbɪl.ə.ti/: the quality of being able to be trusted or believed because of working or behaving well.
The term reliability is used to refer to the consistency of test scores.
According to Brown (2010), a reliable test:
- Is consistent
- Gives clear directions for scoring/evaluation
- Has uniform rubrics for scoring
- Contains items/tasks that are unambiguous to the test-takers
Reliability
- The degree or extent to which an assessment tool produces stable and consistent results.
- Consistency, stability, dependability and accuracy of the test results (McMillan, 2001).
Test-Retest Reliability
- The same test is re-administered to the same people.
- The correlation between the two sets of scores is expected to be high.
- The effects of practice and memory may influence the correlation value.
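The correlation mentioned above can be sketched in a few lines of Python. This is only an illustration: the slides do not name a specific coefficient, so the Pearson correlation is assumed here, and the function name `pearson_r` and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(first_scores, second_scores):
    """Pearson correlation between two equally long lists of scores."""
    if len(first_scores) != len(second_scores) or len(first_scores) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(first_scores)
    mean_x = sum(first_scores) / n
    mean_y = sum(second_scores) / n
    # Sum of cross-products and sums of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(first_scores, second_scores))
    var_x = sum((x - mean_x) ** 2 for x in first_scores)
    var_y = sum((y - mean_y) ** 2 for y in second_scores)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores of five test-takers on two sittings of the same test
first_sitting = [55, 62, 70, 48, 81]
second_sitting = [58, 60, 73, 50, 84]
retest_r = pearson_r(first_sitting, second_sitting)
print(f"test-retest correlation: {retest_r:.2f}")
```

A value close to 1 suggests that the test ranks the same people in much the same way on both occasions. Practice and memory effects can still inflate this value.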
Inter-Rater Reliability
- Two or more judges or raters are involved in grading.
- The score is a more reliable and accurate measure if two or more raters agree on it or assign similar results.
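One way to put a number on rater agreement is sketched below. The slides do not prescribe a statistic, so both measures are assumptions: simple exact agreement, and Cohen's kappa, which corrects for agreement expected by chance. The function names and the band scores are hypothetical.

```python
from collections import Counter


def exact_agreement(rater_a, rater_b):
    """Proportion of scripts on which both raters gave the same band."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    observed = exact_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability that both raters pick the same band
    # if each rated independently according to their own band frequencies.
    expected = sum(counts_a[band] * counts_b[band] for band in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single band")
    return (observed - expected) / (1 - expected)


# Hypothetical bands (1 to 5) given by two raters to the same eight scripts
rater_a = [3, 4, 2, 5, 3, 4, 1, 3]
rater_b = [3, 4, 3, 5, 3, 3, 1, 3]
agreement = exact_agreement(rater_a, rater_b)
kappa = cohen_kappa(rater_a, rater_b)
print(f"exact agreement: {agreement:.2f}, kappa: {kappa:.2f}")
```

The same functions can be applied to intra-rater reliability by comparing one rater's marks for the same scripts on two different occasions.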
Intra-Rater Reliability
- The consistency of grading by a single rater.
- When a rater grades tests at different times, he/she may become inconsistent in grading for various reasons.
Test Administration Reliability
- This involves the conditions in which the test is administered.
- Unreliability may occur due to outside interference, including noise, variations in photocopying, and differences in light and sound in different parts of the room.
Factors affecting test reliability
- Test factor
- Teacher and student factor
- Environment factor
- Test administration factor
- Marking factor
1. Test factor
- Longer tests produce higher reliability.
- Because scores on a short test depend more on coincidence and guessing, the scores will be more accurate if the test is longer.
- An objective test has higher consistency compared to a subjective test.
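The claim that longer tests produce higher reliability can be illustrated with the Spearman-Brown prediction formula. The slides do not mention this formula; it is brought in here as an assumption, and it only holds if the added items are comparable to the existing ones. The function name and the example values are hypothetical.

```python
def spearman_brown(current_reliability, length_factor):
    """Predicted reliability after changing test length by `length_factor`.

    Assumes the new items behave like the existing ones (Spearman-Brown).
    A length_factor of 2 means the test is twice as long.
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    r = current_reliability
    return (length_factor * r) / (1 + (length_factor - 1) * r)


# Hypothetical test with reliability 0.60, then doubled and tripled in length
doubled = spearman_brown(0.60, 2)
tripled = spearman_brown(0.60, 3)
print(f"doubled: {doubled:.2f}, tripled: {tripled:.2f}")
```

Under these assumptions, each extension raises the predicted reliability, but by a smaller amount each time.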
2. Teacher and student factor
- In most cases, teachers construct and administer tests for their students.
- The teacher-student relationship can affect the consistency of test results.
- Teacher’s encouragement, positive mental and physical condition, and familiarity with the test formats could lead to higher consistency.
3. Environment factor
- An examination environment certainly influences test-takers and their scores.
- A favorable environment will improve the reliability of the test.
4. Test administration factor
- Students’ performance is dependent on the way tests are administered (instruction, time allowance, or careful monitoring of tests).
5. Marking factor
- Human judges/raters have many opportunities to introduce error in scoring.
- Different raters may award different marks for the same answer.
VALIDITY
The term validity is used to refer to whether the test is actually measuring what it claims to measure (Arshad, 2004).
Test scores reflect the achievement of learning outcomes and the test-taker’s ability.
The test is valid when it reflects what the learners can do in a language.
Types of validity:
- Face validity
- Content validity
- Construct validity
1. Face Validity
- A test looks like a test even at first impression.
- Mousavi (2009) refers to face validity as the degree to which a test looks right and appears to measure the knowledge and abilities it claims to measure.
2. Content Validity
- Assessment of course content with clear reference to goals and outcomes
- Use of formats and tasks familiar to students
3. Construct Validity
- Refers to whether the underlying theoretical constructs that the test measures are themselves valid.
- Proficiency, communicative competence, and fluency are examples of linguistic constructs; self-confidence and motivation are psychological constructs.

- Grammar and vocabulary – an essay or multiple-choice?
- Reading – reading aloud or texts and comprehension questions?
- Listening – a lecture or a series of dialogues?
- Writing ability – a dictation or a cover letter?
- Speaking – reading-aloud tasks or face-to-face interviews?

- Does the test assess the skill (construct) that you focus on in your class?
- Does the test cover the content that you have been teaching?
- Does the test look as if it is testing what it is supposed to be testing?
- Is it challenging / formal / adequate enough in the eyes of the test-takers?
Put the following words into the correct column:
Construct, Inter-rater, Face, Content, Environment factors, Intra-rater, Consistency, Curriculum, Outcomes, Test results

Reliability            Validity
Inter-rater            Construct
Intra-rater            Face
Environment factors    Content
Consistency            Curriculum
Test results           Outcomes
PRACTICALITY
“The logistical, down-to-earth administrative issues involved in designing, administering, and scoring.”
These include “costs and amount of time it takes to construct and to administer, the ease of scoring, and ease of reporting/interpreting results” (Mousavi, 2009).
A PRACTICAL TEST
- Stays within budgetary limits
- Can be completed by test-takers within the appropriate time constraints
- Has clear directions for administration
- Appropriately utilises the available human resources
- Does not exceed available material resources
- Considers the time and effort involved in both designing and scoring
IMPRACTICAL!!!
• … a test which is prohibitively expensive
• … a test of language proficiency that would take students 10 hours to complete
• … a speaking test that requires an individual 10-minute one-to-one talk with each of 50 test-takers when there is only one scorer
• … a test that takes students a few minutes to complete and several hours for the examiner to prepare and/or correct
• … a test which can be scored only by computer in a location without easy access to computers and an internet connection
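The speaking-test example above becomes concrete with a little arithmetic. The sketch below assumes the talks run one after another with no breaks and are shared equally among scorers; the function name `total_testing_minutes` is hypothetical.

```python
def total_testing_minutes(test_takers, minutes_per_talk, scorers):
    """Minutes the busiest scorer needs when talks are shared equally."""
    if scorers < 1:
        raise ValueError("at least one scorer is required")
    # Ceiling division: some scorers take one extra talk if it does not divide evenly
    talks_per_scorer = -(-test_takers // scorers)
    return talks_per_scorer * minutes_per_talk


# Figures from the slide: 50 test-takers, 10 minutes each, one scorer
minutes = total_testing_minutes(50, 10, 1)
hours, remainder = divmod(minutes, 60)
print(f"{hours} h {remainder} min of testing for one scorer")
# prints "8 h 20 min of testing for one scorer"
```

Raising the `scorers` argument shows how quickly additional raters bring the session down to a manageable length, at the cost of more human resources.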
AUTHENTICITY
“The degree of correspondence of the characteristics of a given language test task to the features of a target language task” (Bachman & Palmer, 1996).
Language learners are more motivated to perform when they are faced with tasks that reflect real-world situations and contexts.
AN AUTHENTIC TEST
- Contains language that is as natural as possible
- Has items that are contextualized rather than isolated
- Includes meaningful, relevant and interesting topics
- Provides some thematic organization to items, such as through a story line or episode
- Offers tasks that replicate real-world tasks
Let’s think about a listening test! How can the test be made more authentic?
- Different accents
- Hesitations and pauses
- Background noises
- Monologue – Dialogue
- Interesting topics
- Interruptions
WASHBACK
WASHBACK EFFECT
- “Washback” or “backwash” (Hughes, 2003) refers to the impact that tests have on teaching and learning.
- It can have a positive or negative impact on the teaching and learning process.
POSITIVE WASHBACK
On learners:
• Provide a qualification
• Provide motivation
• Serve as a revision tool
• Provide feedback
On teachers:
• Identify struggling learners in a class
• Diagnose common learner errors to modify instruction
On teaching institutions and schools:
• Increase accountability of school
• Identify weaknesses of a syllabus
• Encourage a balanced curriculum
POSSIBLE NEGATIVE WASHBACK
- Preparation for a test may take up teaching time.
- A test can be used as a way for teachers to exert their authority.
- Learners only practice the things that they know will be in the test, and ignore everything else.
- Learners feel stressed or nervous about the test conditions, the results and their image.
- Learners feel demotivated either by the prospect of revising for the test or at the thought of getting low marks.
- The way the test is marked may penalize errors rather than give credit for what the learner has done correctly.
- Test results may cause a feeling of division within the class.
- Improving test results can seem more important than learning; this often means that the range of skills taught becomes narrower.
TRANSPARENCY
- Availability of information about assessment
- Information should include:
  - what students have to do to succeed, outcomes
  - expected content and format
  - time allocated for task, deadlines
  - weighting of items or sections
  - grading criteria
  - useful feedback for improvement
SECURITY
Students:
Cheating, “collaborative” test-taking, plagiarism or any other kind of intellectual dishonesty is forbidden.
Staff:
There are clear security guidelines for all stages of assessment that must be followed.
There are severe consequences for breaches of security.
PRACTICE
Handout: In the handout, you will find a description of the Preliminary English Test (PET - Level B1) for speaking skills, the test procedures and guidelines, the sample speaking test, and the speaking assessment scale.

Purpose: The test is intended to be used as a speaking test in the national examination for high school students. Throughout high school, students use the course book issued by the Ministry of Education and Training.

Test administration: In each test location, there are about 500 Grade 12 students whose expected level of proficiency is B1. About 30 examiners are invited to be raters, and they are given a one-day training course on the assessment scale.

Two months prior to the test day, information about the test and its format is available on the website of the Ministry of Education and Training. Information is also circulated to high schools throughout the country.
