Chapter 4-6 Notes
Overview
For much of the history of language teaching and assessment, it has been common to distinguish between
"general" and "specific purpose" language courses and tests. General purpose courses and tests refer to
situations in which the purposes for learning cannot be specified with much certainty, while specific
purpose courses and tests are designed for situations where the purposes can be clearly defined.
General Purpose Language Tests
These tests are often used in contexts where the exact use of the language cannot be predetermined. For
example, learning French for cultural purposes or Spanish for general conversation are scenarios where
the specific needs of the learner are not narrowly defined. In these cases, the focus is on broad language
skills that are applicable in a wide range of contexts.
Specific Purpose Language Tests
These tests are designed for particular contexts where the language needs are clearly defined. Examples
include English for academic purposes, Business German, or Chinese for health workers. These tests are
tailored to assess the language skills necessary for specific tasks or professional situations.
Blurring Lines
In recent years, the distinction between general and specific purpose language tests has become less clear.
The process for developing a language test, which includes defining the purpose of the test, conducting a
preliminary investigation, collecting primary and secondary language data, and analyzing the target
communicative tasks and language, is applicable to both general and specific purpose tests.
Test Development Procedure
The procedure for test development involves several steps:
1. Defining the Purpose of the Test: Understanding what the test aims to measure and why (for
example, deciding whether to admit prospective students to a program).
2. Preliminary Investigation: Gathering information about the context and needs (for example,
conducting a survey to identify test-takers' background knowledge and requirements).
3. Collecting Language Data: Obtaining primary and secondary examples of the language actually
used in the target situation.
4. Analyzing Target Tasks: Examining the collected data to identify the specific communicative
tasks that test-takers will need to perform.
5. Developing Test Tasks: Creating test items and tasks that reflect the real-world tasks and
language use identified in the analysis.
This approach ensures that the test is relevant and accurately measures the language skills needed for the
specified purposes.
Making Inferences
Language test results are used to make inferences about a person's language abilities concerning some
purpose, whether it's assigning grades, determining proficiency for study abroad programs, or assessing
job readiness. This highlights the importance of aligning the test with the intended use of the results.
Evolving Perspectives
The theoretical distinction between specific purpose language teaching as "training" (focused on a
restricted linguistic code for a specific context) and general purpose language teaching as "education"
(aimed at providing the ability to respond to various communicative situations) is no longer seen as
viable. Both approaches aim to equip learners with the language skills necessary to meet their specific or
general needs effectively.
Discrete-Point Tests
These tests focus on one small piece of language at a time, such as a single vocabulary
word or a specific grammar rule.
Example:
Imagine a quiz where you are asked to fill in the blank in the sentence: "She ___ to the
store yesterday." Options might be "go," "goes," "went," and "going." The correct answer
is "went," and this question is only testing your knowledge of the past tense form of the
verb "to go."
They help pinpoint exactly where a learner might be struggling with specific parts of the
language.
Integrative Tests
These tests require you to use multiple language skills at the same time. They often
involve more complex tasks that mirror real-life language use, such as writing an essay or
understanding a long reading passage.
Example:
Imagine you read a story and then answer questions about it, like "What is the main idea
of the story?" This type of test checks if you can understand the overall meaning, connect
different ideas, and use your vocabulary and grammar knowledge together.
They give a better picture of how well you can use the language in real-life situations,
where you need to combine different skills.
Quick Comparison
Discrete-Point Tests: Focus on one small part of language at a time (like a quiz on
vocabulary or grammar).
Integrative Tests: Require using multiple language skills together (like reading a passage
and answering questions about it).
Examples:
Vocabulary question: "What does 'happy' mean?" (a. sad, b. joyful, c. angry, d. tired)
Reading and writing task: "Read a short article about healthy eating. Then, write a
summary of the article and explain why healthy eating is important."
Both types of tests are useful in different ways. Discrete-point tests help identify specific areas
that need improvement, while integrative tests show how well you can use the language in real-
world contexts.
Formative Assessment
What is it?
Formative assessment is like a check-up during the learning process to see how well
students are understanding the material.
Examples:
Quizzes
Class discussions
Homework assignments
Observations
In-class activities
Purpose:
To give feedback to students so they know what they need to work on.
To help teachers adjust their teaching methods if needed.
It's like a coach giving tips during practice to help players improve before the big game.
Summative Assessment
What is it?
Summative assessment is like a final check at the end of a learning period to see how
much students have learned.
Examples:
Final exams
End-of-term projects
Standardized tests
Final papers
Purpose:
To measure how much students have learned at the end of a unit, term, or course, often for
assigning grades or certifying achievement.
It's like the final game that shows how well the players perform after all their practice.
Quick Comparison
Formative Assessment:
o Ongoing checks during learning.
o Helps improve learning and teaching.
o Like practice and feedback.
Summative Assessment:
o Final check at the end.
o Measures overall learning.
o Like a final game or test.
Examples:
A teacher gives a short quiz halfway through a unit to see if students understand the key
concepts. Based on the results, the teacher might review some topics again or provide
extra help to those who need it.
At the end of the semester, students take a final exam covering everything they learned.
The exam results show how well each student has understood the entire course material.
Both formative and summative assessments are important. Formative assessments help guide
learning and make adjustments along the way, while summative assessments evaluate the final
learning outcomes.
4.6.1 Conference Assessments
Conference assessments involve meetings between the teacher and the student to discuss the
student's progress, performance, and areas for improvement. This approach is more interactive
and provides an opportunity for personalized feedback. It helps in understanding the student's
needs better and planning future learning strategies.
Portfolio assessment is the evaluation of a student's collection of work over a period of time.
This can include drafts, completed tasks, reflections, and any other evidence of learning. It
provides a comprehensive view of the student's progress and achievements, emphasizing growth
and development rather than a single test score.
Self-assessments allow students to evaluate their own performance and learning process. This
encourages self-reflection and helps students take responsibility for their learning.
Peer-assessments involve students evaluating each other’s work. This can promote collaborative
learning and provide diverse feedback. Both methods help develop critical thinking and
evaluative skills.
Task-based assessments involve students completing tasks that reflect real-world language use.
These tasks can range from writing an email to conducting an interview.
Performance assessments require students to perform specific tasks, demonstrating their ability
to use language effectively in practical situations. This approach is highly practical and often
considered more valid in assessing language proficiency.
Dynamic assessment integrates teaching and assessment. It focuses on the learning process and
helps identify the student's potential through interactive and adaptive testing. This method
emphasizes the development of skills and knowledge through guided interaction, allowing for a
more individualized assessment approach.
Chapter 5
Average or Mean:
Calculating the mean is fairly straightforward: we simply add up all the scores on the test and divide the
total by the number of test takers.
Question: A student recorded the following scores in their math quizzes: 78, 85, 62, 90, 56, 92, 88, 72,
75, 84, 91, 69, 74, 82, 79, 64, 81, 77, 66, 87, 93, 68, 83, 86, 73, 89, 70, 71, 65, 80. Calculate the mean
score of the student.
Solution: To find the mean score, sum all the scores and then divide by the number of scores.
1. Sum of all scores:
78 + 85 + 62 + 90 + 56 + 92 + 88 + 72 + 75 + 84 + 91 + 69 + 74 + 82 + 79 + 64 + 81 + 77 + 66 +
87 + 93 + 68 + 83 + 86 + 73 + 89 + 70 + 71 + 65 + 80 = 2330
2. Number of scores: 30
3. Mean score: 2330 / 30 ≈ 77.67
So, the mean score is approximately 77.67.
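As a quick check, the same calculation can be done in a few lines of Python; the score list below is simply the one from the question above.
```python
# Quiz scores from the question above
scores = [78, 85, 62, 90, 56, 92, 88, 72, 75, 84, 91, 69, 74, 82, 79,
          64, 81, 77, 66, 87, 93, 68, 83, 86, 73, 89, 70, 71, 65, 80]

# Mean = sum of all scores divided by the number of scores
mean_score = sum(scores) / len(scores)
print(round(mean_score, 2))  # 77.67
```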
Standard Deviation
Since the standard deviation is essentially an average amount of deviation from the mean, its calculation
is in principle no more complicated than calculating any other average: subtract the mean from each
score, square those differences, average the squared differences, and take the square root of that average
(a short Python sketch follows the steps below).
Steps to Solve
1. Calculate the Mean (Average)
2. Calculate Each Score's Deviation from the Mean
3. Square Each Deviation
4. Calculate the Mean of the Squared Deviations (Variance)
5. Take the Square Root of the Variance (Standard Deviation)
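A minimal Python sketch of these five steps, reusing the quiz scores from the mean example above (numpy is assumed to be available):
```python
import numpy as np

scores = np.array([78, 85, 62, 90, 56, 92, 88, 72, 75, 84, 91, 69, 74, 82, 79,
                   64, 81, 77, 66, 87, 93, 68, 83, 86, 73, 89, 70, 71, 65, 80])

mean = scores.mean()           # Step 1: calculate the mean
deviations = scores - mean     # Step 2: each score's deviation from the mean
squared = deviations ** 2      # Step 3: square each deviation
variance = squared.mean()      # Step 4: mean of the squared deviations (variance)
std_dev = np.sqrt(variance)    # Step 5: square root of the variance
print(round(std_dev, 2))
```
Note that np.std(scores) performs the same population calculation in one call; passing ddof=1 gives the sample standard deviation, as in the KR-21 code later in these notes.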
Standard Deviation as a Unit
To apply this concept to a practical situation, let us consider performance on the International English
Language Testing System (IELTS) Academic Reading module and the TOEFL Reading section. The
IELTS reading scores are reported on a nine band scale (from Band 1 Non User to Band 9 Expert User),
while the TOEFL reading scores are reported on a 30-point scale. The IELTS reading mean is Band 6,
with a standard deviation of 1, while the TOEFL reading mean is 18, with a standard deviation of 8.
Suppose a student scored at Band 7 on the IELTS reading test and got a 22 on the TOEFL reading test.
Which score is better? By using the standard deviations as guides, we can say that the TOEFL score is a
half standard deviation above the TOEFL mean (i.e. the score is four points above the mean, which is half
the standard deviation of 8), while the IELTS score is one standard deviation above the mean on that test
(i.e. Band 7 is one point above the IELTS mean, or one standard deviation), so the IELTS performance is
better than that on the TOEFL, with respect to the people who took each test.
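The same comparison can be expressed as standard scores (z-scores), i.e. how many standard deviations each result lies above its test's mean; the means and standard deviations below are the ones quoted in the example.
```python
# Means and standard deviations quoted in the example above
ielts_mean, ielts_sd = 6, 1
toefl_mean, toefl_sd = 18, 8

# The student's scores
ielts_score, toefl_score = 7, 22

# z-score: number of standard deviations above the mean
ielts_z = (ielts_score - ielts_mean) / ielts_sd   # 1.0 SD above the IELTS mean
toefl_z = (toefl_score - toefl_mean) / toefl_sd   # 0.5 SD above the TOEFL mean
print(ielts_z, toefl_z)
```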
Correlation
Fundamentally, correlation refers to an association between two events or facts, and specifically to the
strength of that relationship. In language or test performance, correlation is based on the assumption
that when people perform similarly on two different tasks, similar abilities must be required for the
performances. In other words, the idea is that there is some overlap in the abilities required for
performance on two tasks, and correlation is really just a number indicating the amount of overlap in
performances due to similar underlying abilities (shown as Figure 5.4 in the source text).
Spearman rank-order correlation
The result, .84, indicates the degree of overlap in the abilities required to perform on the two reading
tests. A value of 1.0 would indicate a perfect relationship between performance on the two tests; a value
of 0.0 would indicate no relationship at all. Our result of .84 suggests a fairly strong relationship. A result
of .90 would be stronger still, while .80 would be weaker.
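A sketch of how such a coefficient can be computed with scipy; the two score lists below are made-up placeholder data, not the scores behind the .84 result in the notes.
```python
from scipy.stats import spearmanr

# Hypothetical scores for the same ten test takers on two reading tests
reading_test_1 = [18, 22, 15, 30, 27, 12, 25, 20, 17, 28]
reading_test_2 = [5.5, 6.0, 5.0, 7.5, 7.0, 4.5, 6.5, 6.0, 5.5, 7.0]

# Spearman rank-order correlation compares the rankings of the two score sets
rho, p_value = spearmanr(reading_test_1, reading_test_2)
print(round(rho, 2))
```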
The T-Test of Averages Between Two Tests
Let's tackle each of the problems one by one. A t-test compares the average scores from two tests (or
from two groups of test takers) to see whether the difference between the means is large enough to be
unlikely to have arisen by chance; a short sketch follows.
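A minimal Python sketch of an independent-samples t-test using scipy; the two score lists are made-up placeholder data, since the notes do not include the actual scores for this problem.
```python
from scipy.stats import ttest_ind

# Hypothetical scores from two different tests (or two groups of test takers)
test_a = [25, 24, 21, 20, 19, 18, 16, 15, 13, 12]
test_b = [22, 21, 19, 18, 17, 15, 14, 12, 11, 10]

# Independent-samples t-test of the difference between the two means
t_stat, p_value = ttest_ind(test_a, test_b)
print(round(t_stat, 2), round(p_value, 3))
```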
KR-21 Reliability
The KR-21 formula gives a quick estimate of a test's internal-consistency reliability from the number of
items, the mean score, and the variance of the scores.
Scores:
```
25, 25, 24, 21, 21, 21, 20, 20, 19, 18, 16, 16, 15, 13, 13, 12, 12, 11, 11, 9
```
The KR-21 formula is

\[ \mathrm{KR\text{-}21} = \frac{k}{k-1}\left(1 - \frac{M(k - M)}{k\,s^2}\right) \]

where:
- \( k \) = number of items (25 in this case)
- \( M \) = mean score
- \( s^2 \) = variance of the scores
Calculations
```python
import numpy as np

# Scores on the 25-item test (from the list above)
scores = np.array([25, 25, 24, 21, 21, 21, 20, 20, 19, 18,
                   16, 16, 15, 13, 13, 12, 12, 11, 11, 9])
mean_score = np.mean(scores)      # mean score (M)
std_dev = np.std(scores, ddof=1)  # ddof=1 for the sample standard deviation
std_dev
```
```python
# Number of items (k)
k = 25
# Variance (s^2)
variance = std_dev**2
# Calculate KR-21 reliability
KR_21 = (k / (k - 1)) * (1 - (mean_score * (k - mean_score)) / (k * variance))
KR_21
```