U3 - Characteristics of a Good Test

The document discusses the characteristics of a good test, including validity, reliability, and the relationship between reliability and validity. It defines different types of validity such as face validity, content validity, construct validity, and criterion-related validity. It also discusses how to ensure reliability in testing and the importance of both reliability and validity for accurate measurement.


10/5/2021

Chapter 3: Characteristics of a good test


1. Validity
1.1. Face validity
1.2. Content validity
1.3. Construct validity
1.4. Criterion-related validity
1.5. Validity in scoring
1.6. How to make tests more valid?
2. Reliability
3. Reliability and Validity

1. Validity
A test is said to be valid if it measures accurately
what it is intended to measure

Types of validity:
• Face
• Content
• Construct
• Criterion-related
• Validity in scoring

1.1. Face validity

• A test is said to have face validity if it looks as if it measures what it is intended to measure.
• e.g., A reading test looks like a grammar and vocabulary test → no face validity


1.2. Content validity

 A test is said to have content validity if its content contains a representative sample of the language skills, structures, etc., with which it is meant to be concerned.
E.g., An achievement test for intermediate learners contains a
set of lexis and structures for the intermediate level, rather
than for the advanced level.

 In order to judge whether a test has content validity, we need a specification of the skills or structures, etc., that it is meant to cover.

1.2. Content validity


e.g., SPECIFICATION
READING SUBSKILLS
1. Skimming/reading for gist
• Identifying text topic 20%
• Identifying text purpose 20%
2. Scanning/reading for detail
• Finding specific details (e.g. figures, dates etc.) 20%
3. Understanding the text
• Locating, identifying, understanding and comparing facts 10%
• Understanding relationships among ideas in a text 10%
• Making inferences 10%
4. Understanding lexis
• Predicting the meaning of words from the context 10%
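A specification like the one above can also be checked mechanically when drafting a test: the subskill weights should cover the whole test, and the marks actually allocated to items can be tallied against the intended weights. The sketch below uses shortened subskill names taken from the sample specification; the "marks actually written" figures are hypothetical.

```python
# Sketch: checking a draft reading test against its content specification.
# Subskill names/weights mirror the sample specification above; the item
# tallies for the draft test are hypothetical.

spec = {  # subskill -> intended weight (% of total marks)
    "identify topic": 20,
    "identify purpose": 20,
    "find specific details": 20,
    "compare facts": 10,
    "relationships among ideas": 10,
    "make inferences": 10,
    "meaning from context": 10,
}

items_written = {  # subskill -> marks actually allocated on the draft test
    "identify topic": 20,
    "identify purpose": 10,       # under-represented
    "find specific details": 30,  # over-represented
    "compare facts": 10,
    "relationships among ideas": 10,
    "make inferences": 10,
    "meaning from context": 10,
}

# The specification should account for the entire test.
assert sum(spec.values()) == 100, "specification weights must cover the whole test"

# Flag subskills whose mark allocation drifts from the specification.
drift = {s: items_written[s] - w for s, w in spec.items() if items_written[s] != w}
print(drift)  # {'identify purpose': -10, 'find specific details': 10}
```

Any non-empty `drift` signals a threat to content validity: some specified subskills are under-sampled while others crowd them out.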

1.2. Content validity


• The greater a test’s content validity, the more likely it is to be an accurate measure of what it is supposed to measure (i.e., construct validity).
• Without content validity, areas that are not tested are likely to become areas ignored in teaching & learning (i.e., negative backwash).


1.3. Construct validity


• A test is said to have construct validity if it can be
demonstrated that it measures just the ability which it
is supposed to measure.
• This type of validity assumes the existence of certain
learning theories or constructs underlying the
acquisition of abilities and skills.

What is speaking?
Non-verbal ideas → Grammar & Vocabulary & Pronunciation → Verbal production → Self-monitoring
Task A
Read aloud the following sentences
1. I admired Mr. Jones because he was a hero to us.
2. He was in the national water polo team.
3. He encouraged us to do our best in sports.
Task B (measures the speaking skill, not the pronunciation ability in
Task A or the ability to generate ideas before speaking in Task C)
Talk about a person from your childhood whom you admired. You
should mention
• Your relationship to him or her
• What he or she did
• What you admired about this person
Task C
• Talk about a person from your childhood whom you admired.

1.4. Criterion-related validity


• A test is said to have criterion validity if its results
agree with the results of some dependable
assessment/criterion.
• E.g.
Students who have an IELTS band score of 4.0 are
likely to pass the national graduation English exam
(Graduation L2 exams use IELTS 4.0 as a criterion).
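Criterion-related validity is often quantified as the rate of agreement between the decision the test implies and the decision the criterion measure gives. The sketch below, with entirely hypothetical student data, checks how often an IELTS-4.0 cutoff agrees with actual graduation-exam results.

```python
# Sketch: agreement between a test-based decision (IELTS band >= 4.0) and a
# dependable criterion (passing the graduation exam). All data hypothetical.

students = [
    # (IELTS band, passed graduation exam?)
    (4.5, True), (4.0, True), (3.5, False), (5.0, True),
    (3.0, False), (4.0, False), (3.5, False), (4.5, True),
]

CUTOFF = 4.0
agree = sum((band >= CUTOFF) == passed for band, passed in students)
agreement_rate = agree / len(students)
print(f"{agreement_rate:.2f}")  # 0.88 -> the cutoff agrees with the criterion 7 times out of 8
```

A high agreement rate supports the claim that the test has criterion validity; in practice a correlation coefficient between test scores and criterion scores is also commonly reported.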


1.5. Validity in scoring


• The test should be marked in accordance with what it
is intended to test.
• E.g.,
Listening test
Answer key: went to supermarket (0.5 point)
Student A’s answer: went to suppermarket.
→ How would you score student A’s answer?

1.6. How to make tests more valid


1. Write explicit specifications for a test to ensure its
content validity.
2. Use direct testing. If indirect testing is used, make
reference to the research literature to confirm the
relevant underlying constructs of the testing
techniques used in the test.
3. Validate the test against some criterion.
4. Ensure that the scoring of responses relates directly
to what is being tested.

2. Reliability
• The extent to which a test is consistent; i.e., under the same conditions and with the same performance of students, our assessment produces the same or at least similar results.

Same test + same students + different times → same (or similar) results


2. Reliability
Example
Scores on test A 1st time 2nd time
Mary 68 82
Bill 46 28
Ann 19 34

Scores on test B 1st time 2nd time
Mary 65 69
Bill 46 50
Ann 27 25

→ Test B is more reliable
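Test-retest consistency like this is commonly quantified with a correlation coefficient between the two administrations: the closer it is to 1, the more reliable the test. The sketch below computes a Pearson correlation by hand for the score tables above.

```python
# Sketch: quantifying test-retest reliability as the Pearson correlation
# between first- and second-administration scores (data from the tables above).
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Scores for Mary, Bill, Ann on each administration.
r_a = pearson([68, 46, 19], [82, 28, 34])  # test A: 1st time vs 2nd time
r_b = pearson([65, 46, 27], [69, 50, 25])  # test B: 1st time vs 2nd time

print(round(r_a, 2), round(r_b, 2))  # r_b is much closer to 1 -> test B is more reliable
```

With only three students this is purely illustrative; real reliability estimates need a much larger sample of test-takers.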

2. Reliability
Scorer reliability:
• The level of agreement given by the same or
different scorers/raters on different occasions.
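The simplest index of scorer reliability is the proportion of candidates to whom two scorers assign the same score. The sketch below uses hypothetical ratings from two raters on an 8-candidate speaking test.

```python
# Sketch: a simple index of scorer (inter-rater) reliability -- the proportion
# of candidates who receive the same score from two raters. Ratings hypothetical.

rater_1 = [5, 4, 3, 5, 2, 4, 3, 5]  # rater 1's scores for 8 candidates
rater_2 = [5, 4, 2, 5, 2, 3, 3, 5]  # rater 2's scores for the same candidates

matches = sum(a == b for a, b in zip(rater_1, rater_2))
agreement = matches / len(rater_1)
print(agreement)  # 0.75 -> the raters agree on 6 of 8 candidates
```

More refined indices (e.g., Cohen's kappa) additionally correct for the agreement two raters would reach by chance alone.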

2. Reliability
How to make tests more reliable: [you need to be able to provide an
explanation for any of the following statements when asked ([1]: 36-42)]
1. Provide uniform and non-distracting conditions of administration.
2. Make students familiar with format and testing techniques.
3. Ensure that tests are well laid out and perfectly legible.
4. Provide clear and explicit instructions.
5. Write unambiguous items.
6. Take enough samples of behavior.
7. Do not allow candidates too much freedom.
8. Use items that permit scoring which is as objective as possible.
9. Exclude items which do not discriminate well between weaker and stronger
students.
10. Train scorers.
11. Identify candidates by number, not name.
12. Employ independent scoring.
13. Provide a detailed scoring key.
14. Agree acceptable responses.


3. Reliability and Validity


• If a test is not reliable → inconsistent results → unable to accurately measure what it is supposed to measure → the test is not valid.
• If a test is reliable, it may still be valid or not valid (e.g., it may lack content or construct validity).

Study guide
1. What is validity? How many types of validity are there? What are they? Give examples. How can we make tests more valid?
2. What is reliability? Give examples.
What is scorer reliability? How can we make tests
more reliable?
3. What is the relationship between reliability and
validity?
