Lesson 5 Construction of Written Tests
LEARNING OUTCOMES
1. What are the general guidelines in choosing the appropriate test format?
2. What are the major categories and formats of traditional tests?
3. What are the general guidelines in writing multiple-choice test items?
4. What are the general guidelines in writing matching-type items?
5. What are the general guidelines in writing true or false items?
6. What are the general guidelines in writing short-answer test items?
7. What are the general guidelines in writing essay tests?
8. What are the general guidelines in problem-solving test items?
KEY CONCEPT
To learn or enhance your skills in developing good and effective test items for a particular test
format, you need to go back and review your prior knowledge on different test formats; how and
when to choose a particular test format that is the most appropriate measure of the identified
learning objectives and desired learning outcomes of your subject; and how to construct good
and effective items for each format.
To guide you in choosing the appropriate test format and designing fair and appropriate yet
challenging tests, you should ask the following important questions:
3. Is the test matched or aligned with the course’s desired learning outcomes (DLOs) and the
course contents or learning activities?
It is important that you are clear about what DLOs are to be addressed by your
test and what course activities or tasks are to be implemented to achieve the
DLOs.
For example, if you want learners to articulate and justify their stand on ethical
decision-making and social responsibility practices in business (DLO), then an
essay test and class debate are appropriate measures and tasks for this
learning outcome.
A multiple choice test may be used but only if you intend to assess learners’
ability to recognize what is ethical versus unethical decision-making practice.
In the same manner, matching-type items may be appropriate if you want to
know whether your students can differentiate and match the different approaches
or terms to their definitions.
1) Selected-Response Type – learners select the correct response from the given
options
2) Constructed-Response Type – learners are asked to formulate their own answers.
The cognitive capabilities required to answer selected-response items are different from those
required by constructed-response items, regardless of content.
Writing multiple-choice test items requires content mastery, writing skills, and time. Only
good and effective items should be included in the test. Poorly written test items could be
confusing and frustrating to learners and yield test scores that are not appropriate to evaluate
their learning achievement.
The following are the general guidelines in writing good multiple-choice items. They are
classified in terms of content, stem, and options.
Content:
1. Write items that reflect only one specific content and cognitive processing skill.
2. Do not lift and use statements from the textbook or other learning materials as test
questions.
3. Keep the vocabulary simple and understandable based on the level of
learners/examinees.
4. Edit and proofread the items for grammatical and spelling errors before administering them to
the learners.
Stem
Faulty: Read each question and indicate your answer by shading the circle
corresponding to your answer.
Good: This test consists of two parts. Part A is a reading comprehension test
and Part B is a grammar/language test. Each question is a
multiple choice test item with five (5) options. You are to answer
each question but will not be penalized for a wrong answer or for guessing.
You can go back and review your answers during the time allotted.
2. Write stems that are consistent in form and structure, that is, present all items either in
question form or in descriptive or declarative form.
Faulty 1) Who was the Philippine president during Martial Law?
2) The first president of the Commonwealth of the Philippines was
_______.
Good
1) Who was the Philippine president during Martial Law?
2) Who was the first president of the Commonwealth of the Philippines?
3. Word the stem positively and avoid negatives such as NOT and EXCEPT, and especially
double negatives. If a negative word is necessary, underline or capitalize it for emphasis.
Faulty
Which of the following is not a measure of variability?
Good
Which of the following is NOT a measure of variability?
4. Refrain from making the stem too wordy or containing too much information unless the
problem/question requires the facts presented to solve the problem.
Faulty
What does DNA stand for, and what is the organic chemical of complex molecular
structure found in all cells and viruses and codes genetic information for the
transmission of inherited traits?
Good
As a chemical compound, what does DNA stand for?
Options
1. Provide three (3) to five (5) options per item, with only one being the correct or best
answer/alternative.
2. Write options that are parallel or similar in form and length to avoid giving clues about
the correct answer.
5. Use None-of-the-above carefully and only when there is one absolutely correct
answer, such as in spelling or math items.
Faulty: Which of the following is a nonparametric statistic?
a. ANOVA d. t-test
b. ANCOVA e. None of the above
c. Correlation
Good: Which of the following is a nonparametric statistic?
a. ANCOVA d. Mann-Whitney U
b. ANOVA e. t-test
c. Correlation
6. Avoid All-of-the-above as an option, especially if it is intended to be the correct
answer.
Faulty: Who among the following has become the President of the Philippine
Senate?
a. Ferdinand Marcos d. Quintin Paredes
b. Manuel Quezon e. All of the above
c. Manuel Roxas
Good: Who was the first ever President of the Philippine Senate?
a. Eulogio Rodriguez d. Manuel Roxas
b. Ferdinand Marcos e. Quintin Paredes
c. Manuel Quezon
The matching test item format requires learners to match a word, sentence, or phrase in
one column (premise) to a corresponding word, sentence, or phrase in a second column
(response). It is most appropriate when you need to measure the learners’ ability to identify the
relationship or association between similar items. It works best when the course content has
many parallel concepts. While the matching-type test format is generally used for simple recall of
information, you can find ways to make it applicable or useful in assessing higher levels of
thinking such as applying and analyzing.
The following are the general guidelines in writing good and effective matching-type
tests:
1. Clearly state in the directions the basis for matching the stimuli with the responses.
Item # 1’s instruction is less preferred as it does not detail the basis for matching
the stem and the response options.
2. Ensure that the stimuli are longer and the responses are shorter.
A B
_____ Bangladesh A. Green background with red circle in the center
_____ Indonesia B. One red strip on top and white strip at the bottom
_____ Japan C. Red background with white five-petal flower in the
center
_____ Singapore D. Red background with large yellow circle in the center
_____ Thailand E. Red background with large yellow pointed star in the
center
F. White background with large red circle in the center
Item # 2 is a better version because the descriptions are presented in the first column
while the response options are in the second column. The stems are also longer than the
options.
3. For each item, include only topics that are related to one another and share the same
foundation of information.
Good: On the line to the left of each country in Column I, write the letter of the
Item # 1 is considered an unacceptable item because its response options are not
parallel and include different kinds of information that can provide clues to the
correct/wrong answers. On the other hand, Item # 2 details the basis for matching,
and its response options include only related concepts.
4. Make the response options short, homogeneous and arranged in logical order.
A B
_______ Gold A. Au
_______ Hydrogen B. Magnetic metal used in steel
_______ Iron C. Hg
_______ Potassium D. K
_______ Sodium E. With lowest density
F. Na
A B
_______ Gold A. Au
_______ Hydrogen B. Fe
_______ Iron C. Hg
_______ Potassium D. K
_______ Sodium E. H
F. Na
In Item # 1, the response options are not parallel in content and length. They are
also not arranged alphabetically.
5. Include response options that are reasonable and realistic and similar in length and
grammatical form
A B
______ History A. Studies the production and distribution of goods/services
______ Political Science B. Study of politics and power
______ Psychology C. Study of society
______ Sociology D. Understands role of mental functions in social behavior
E. Uses narratives to examine and analyze past events
Good: Match the subjects with their course description.
A B
1. Study of living things A. Biology
2. Study of mind and behavior B. History
3. Study of politics and power C. Political Science
4. Study of recorded events in the past D. Psychology
5. Study of society E. Sociology
F. Zoology
Item # 1 is less preferred because the response options are not consistent in terms
of their length and grammatical form.
A B
_________ ¼ A. 0.25
_________ 5/4 B. 0.28
_________ 7/25 C. 0.90
_________ 9/10 D. 1.25
A B
_________ ¼ A. 0.09
_________ 5/4 B. 0.25
_________ 7/25 C. 0.28
_________ 9/10 D. 0.90
E. 1.25
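As a rough illustration, the answer key for the second fraction-to-decimal item above can be checked programmatically. The sketch below is not part of the lesson; it assumes the premises and options are typed in as plain strings (with "1/4" standing in for ¼) and uses hypothetical names such as `answer_key` and `distractors`.

```python
from fractions import Fraction

# Premises (Column A) and response options (Column B) from the second item.
premises = ["1/4", "5/4", "7/25", "9/10"]
options = {"A": "0.09", "B": "0.25", "C": "0.28", "D": "0.90", "E": "1.25"}

answer_key = {}
for premise in premises:
    # Compare exact rational values so 0.90 and 9/10 match without float error.
    matches = [letter for letter, decimal_text in options.items()
               if Fraction(decimal_text) == Fraction(premise)]
    if len(matches) != 1:
        raise ValueError(f"{premise} should match exactly one option")
    answer_key[premise] = matches[0]

# Options that match no premise serve as distractors.
distractors = sorted(set(options) - set(answer_key.values()))

print(answer_key)
print(distractors)
```

Each premise matches exactly one option, and one option is left over as a distractor, so learners cannot get the last match by elimination.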
True or false items are used to measure learners’ ability to identify whether a statement
or proposition is correct/true or incorrect/false. They are best used when learners’ ability to
judge or evaluate is one of the desired learning outcomes of the course.
There are different variations of the true or false items. These include the following:
In this format, the statement is presented with a key word or phrase that is underlined,
and the learner has to supply the correct word or phrase.
Multiple-Choice test is authentic.
2. Yes – No Variation
In this format, the learner has to choose yes or no rather than true or false.
e.g. The following are kinds of test. Circle Yes if it is an authentic test and No if not.
3. A – B Variation
In this format, the learner has to choose A or B, rather than true or false.
e.g. Indicate which of the following are traditional or authentic tests by circling A if it is a
traditional test and B if it is authentic.
Traditional Authentic
Multiple-Choice A B
Debates A B
End-of-the-Term Project A B
True or False Test A B
Because true or false items are prone to guessing, as learners are asked to choose between
two options, utmost care should be exercised in writing true or false items. The following are the
general guidelines in writing true or false items:
Good: The presidential system, where the president is only the head of state or
government, is adopted by Chile.
Item # 1 is of poor quality because while the description is right, the countries given are
not all correct. While South Korea has a presidential system of government, it also has a
prime minister who governs alongside the president.
Faulty:
Education is a continuous process of higher adjustment for human beings who have
evolved physically and mentally, which is free and conscious of God, as manifested
in nature around the intellectual, emotional, and humanity of man.
Good:
Education is the process of facilitating learning or the acquisition of knowledge, skills,
values, beliefs, and habits.
Item # 1 is somewhat confusing, especially for younger learners because there are many
ideas in one statement.
Faulty: There is nothing illegal about buying goods through the internet.
Good: It is legal to buy things or goods through the internet.
Double negatives are sometimes confusing and could result in wrong answers, not
because the learner does not know the answer but because of how the test items are
presented.
Faulty: The news and information posted on the CNN website is always accurate.
Good: The news and information posted on the CNN website is usually accurate.
Absolute words such as “always” and “never” restrict possibilities and make a
statement true 100 percent of the time. They are also a hint for a “false” answer.
Students may have a difficult time understanding the statement, especially if the word
“esprit de corps” has not been discussed in the class. Using unfamiliar words would
likely lead to guessing.
7. Avoid lifting statements from the textbook and other learning materials.
In Item # 1, the word “core” is not the significant word. The item is also prone to
many and varied interpretations, resulting in many possible answers.
2. Do not omit too many words from the statement such that the intended meaning is lost.
Faulty: ______ is to Spain as the ______ is to the United States and as ______ is to
Germany.
Good: Madrid is to Spain as the _______ is to France.
Item # 1 is prone to many and varied answers. For example, a student may answer
the question based on the capitals of these countries or based on the continent where
they are located.
Item # 2 is preferred because it is more specific and requires only one correct
answer.
Item # 1 already gives a clue that Ferdinand Marcos was the president during this
time because only the president of a country can declare martial law.
Item # 1 has many possible answers because the statement is very general (e.g.
wind, solar, biomass, geothermal, and hydroelectric)
Item # 2 is more specific and only requires one correct answer. (wind)
The word “an” in item # 1 provides a clue that the correct answer starts with a vowel.
6. If possible, put the blank at the end of a statement rather than at the beginning.
Faulty: _____ is the basic building block of matter.
Good: The basic building block of matter is ___________.
In Item # 1, learners may need to read the sentence until the end before they can
recognize the problem, and then re-read it before answering the question.
In Item # 2, learners can already identify the context of the problem by reading
through the sentence only once and without having to go back and re-read the
sentence.
Teachers generally choose and employ essay tests over other forms of assessment
because essay tests require learners to create a response rather than to simply select a
response from among alternatives.
They prefer essays if they want to measure learners’ higher-order thinking skills,
particularly their ability to reason, analyze, synthesize, and evaluate.
Essays also assess learners’ writing abilities.
Essay tests are most appropriate for assessing learners’:
1) Understanding of the subject-matter content
2) Ability to reason with their knowledge of the subject
3) Problem-solving and decision-making skills, because the items or situations presented in the
test are authentic or close to real-life experiences.
There are two types of essay tests:
1) Extended-response essay
2) Restricted-response essay
The following are the general guidelines in constructing good essay questions:
1. Clearly define the intended learning outcome to be assessed by the essay test.
Identify and specify the intended learning outcomes
Appropriate directive verbs that most closely match the ability that learners should
demonstrate must be used in the prompts.
Preferably use verbs such as compose, analyze, interpret, explain, justify, etc.
2. Refrain from using essay test for intended learning outcomes that are better
assessed by other kinds of assessment
Some intended learning outcomes can be efficiently and reliably assessed by
selected-response tests rather than by essay tests.
There are intended learning outcomes that are better assessed by other
authentic assessments, such as performance tests, rather than by essay tests.
3. Clearly define and situate the task within a problem situation as well as the type of
thinking required to answer the test.
Essay questions or prompts should provide clear and well-defined tasks to the
learners.
It is important to carefully choose the directive verb, to state clearly the object
or focus of the directive verb, and to delimit the scope of the task to avoid
responses that contain unrelated or irrelevant ideas, are too long, or focus
only on some part of the task.
Emphasizing the type of thinking required to answer the question will also guide
students on the extent to which they should be creative, deep, complex, and
analytical in addressing and responding to the questions.
4. Present tasks that are fair, reasonable, and realistic to the students.
Essay questions should contain tasks or questions that students will be able to
do or address.
These include those that are within the level of instruction/training, expertise, and
experience of the students.
5. Be specific in the prompts about the time allotment and criteria for grading the
response
Essay prompts and directions should indicate the approximate time given to the
students to answer the essay questions to guide them on how much time they
should allocate for each item, especially if several essay questions are
presented.
How the responses are to be graded or rated should also be clarified to guide the
students on what to include in their responses.
Problem-solving test items are used to measure learners’ ability to solve problems that
require quantitative knowledge and competencies and/or critical thinking skills. These items
present a problem situation or task that will require learners to demonstrate a work procedure or
come up with a correct solution. Full or partial credit can be assigned to the answers, depending
on the answers or solutions required.
There are different variations of the quantitative problem-solving items which include the
following:
This type of question has four or five options, and students are required to choose all of
the options that are correct.
Example: Consider the following score distribution: 12, 14, 14, 14, 17, 24, 27,
28, 30. Which of the following is/are the correct measure/s of central tendency? Indicate
all possible answers.
A. Mean = 20 D. Median = 17
B. Mean = 22 E. Mode = 14
C. Mean = 16
Options A, D, and E are all correct answers.
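As a quick check of the keyed answers, the three measures of central tendency for this score distribution can be computed directly. This is only a sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the lesson.

```python
from statistics import mean, median, mode

# Score distribution from the example item.
scores = [12, 14, 14, 14, 17, 24, 27, 28, 30]

score_mean = mean(scores)      # arithmetic average of the nine scores
score_median = median(scores)  # middle (5th) value of the sorted scores
score_mode = mode(scores)      # most frequently occurring score

print(score_mean, score_median, score_mode)
```

The values 20, 17, and 14 are the mean, median, and mode respectively, which is why three of the options are keyed as correct.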
3. Type-in Answer
This type of question does not provide options to choose from. Instead, the
learners are asked to supply the correct answer.
The teacher should inform the learners at the start how their answers will be
rated.
For example, the teacher may require just the correct answer or may require
learners to present the step-by-step procedures in coming up with their answers.
For non-mathematical problem-solving, such as a case study, the teacher may
present a rubric showing how their answers will be rated.
Example: Compute the mean of the following score distribution: 32, 44, 56, 69,
75, 77, 95, 96. Indicate your answer in the blank provided.
In this case, the learners will only need to give the correct answer without having
to show the procedures for computation.
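For the teacher's answer key, the expected value for this item can be worked out as below. This is a minimal sketch with illustrative variable names, showing the sum and count separately in case partial credit is given for the procedure.

```python
# Score distribution from the type-in item.
scores = [32, 44, 56, 69, 75, 77, 95, 96]

total = sum(scores)     # sum of all scores
count = len(scores)     # number of scores
answer = total / count  # mean = sum divided by count

print(total, count, answer)
```

The sum is 544 over 8 scores, so the keyed answer is a mean of 68.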
The following are some of the general guidelines in constructing good problem-solving test
items:
Faulty: Tricia was 135.6 lbs. when she started with her zumba/aerobics exercises. After
three months of attending the sessions three times a week, her weight
was down to 122.8 lbs. About how many lbs. did she lose after three
months? Write your final answer in the space provided and show your
computations.
(this question asks “about how many” and does not indicate whether learners need
to give the exact weight or whether they need to round off their answer and to what
extent)
Good: Tricia was 135.6 lbs. when she started with her zumba/aerobics exercises. After
three months of attending the sessions three times a week, her weight
was down to 122.8 lbs. How many lbs. did she lose after three months? Write
your final answer in the space provided and show your computations. Write
the exact weight; do not round off.
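Because the good version asks for the exact weight, the answer key should itself be exact. One way to check it, sketched here under the assumption that the key is prepared in Python, is to use the standard `decimal` module, since ordinary binary floating-point subtraction of values like these may leave a small rounding artifact. The variable names are illustrative.

```python
from decimal import Decimal

# Weights from the item, entered as strings so they stay exact.
start_weight = Decimal("135.6")
end_weight = Decimal("122.8")

# Exact difference in pounds, with no rounding applied.
weight_lost = start_weight - end_weight

print(weight_lost)
```

The exact answer is 12.8 lbs., which is what a learner following the "do not round off" instruction should write.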
2. Be specific and clear about the type of response required from the students.
Faulty: ASEANA Bottlers, Inc. has been producing and selling Tutti Fruity juice in the
Philippines, aside from their Singapore market. The sales for the juice in
the Singapore market were $5 million more than those of their
Philippine market in 2016, $3 million more in 2017, and $4.5 million more in
2018. If the sales in the Philippine market in 2018 were Php 35 million, what
were the sales in the Singapore market during that year?
[This is a faulty question because it does not specify in what currency the
answer should be presented.]
Good: ASEANA Bottlers, Inc. has been producing and selling Tutti Fruity juice in the
Philippines, aside from their Singapore market. The sales for the juice in
the Singapore market were $5 million more than those of their
Philippine market in 2016, $3 million more in 2017, and $4.5 million more in
2018. If the sales in the Philippine market in 2018 were Php 35 million, what
were the sales in the Singapore market during that year? Provide the answer in Singapore
dollars ($1 = Php 36.50).
[This is a better item because it specifies in what currency the answer should be
presented, and the exchange rate is given.]
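Stating the currency and exchange rate makes the item answerable with a single definite value. The sketch below works it out under two assumptions that are not spelled out in the lesson: that the item asks for the 2018 Singapore sales, and that those sales were $4.5 million more than the 2018 Philippine sales. All variable names are illustrative.

```python
# Values taken from the item.
PHP_PER_DOLLAR = 36.50             # exchange rate given in the item
philippine_sales_php = 35_000_000  # 2018 Philippine sales in pesos

# Assumption: 2018 Singapore sales exceed Philippine sales by $4.5 million.
singapore_premium_dollars = 4_500_000

# Convert the Philippine sales to dollars, then add the stated difference.
philippine_sales_dollars = philippine_sales_php / PHP_PER_DOLLAR
singapore_sales_dollars = philippine_sales_dollars + singapore_premium_dollars

print(f"{singapore_sales_dollars:,.2f}")
```

Under these assumptions the Singapore sales come to roughly $5.46 million. Without the exchange rate, learners could not convert the peso figure and the item would have no single correct answer.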
Faulty: VCV Consultancy Firm was commissioned to conduct a survey on the voters’
preferences in Visayas and Mindanao for the upcoming presidential election. In Visayas,
65% are for the Liberal Party (LP) candidate, while 35% are for the Nacionalista Party (NP)
candidate. In Mindanao, 70% of the voters are Nacionalistas, while 30% are LP supporters. A
survey was conducted among 200 voters for each region. What is the probability that the
survey will show a greater percentage of Liberal Party supporters in Mindanao than in the
Visayas region?
[This question is undesirable because it does not specify the basis for grading the
answer]
Good: VCV Consultancy Firm was commissioned to conduct a survey on the voters’
preferences in Visayas and Mindanao for the upcoming presidential election. In Visayas,
65% are for the Liberal Party (LP) candidate, while 35% are for the Nacionalista Party (NP)
candidate. In Mindanao, 70% of the voters are Nacionalistas, while 30% are LP supporters. A
survey was conducted among 200 voters for each region.
What is the probability that the survey will show a greater percentage of Liberal Party
supporters in Mindanao than in the Visayas region? Please show your solutions to support
your answer. Your answer will be graded as follows:
SCQ
Answer the following
questions
Multiple-choice test
Matching-type test
True or false test
Short-answer test
Essay test
Problem-solving test