Development of Varied Assessment Tools
LEARNING OUTCOME
Introduction
Development of paper-and-pencil tests requires careful planning and expertise in actual test
construction. The more seasoned teachers can produce true-false items that test even higher order
thinking skills and not just rote memory learning. Essays are easier to construct than objective tests,
but the difficulty of deriving grades from essay examinations often discourages teachers from using
this form of examination in actual practice.
For a lesson on "Subject-Verb Agreement in English" for a Grade V class, the following are typical objectives.
Knowledge / Remembering. The students must be able to identify subject and verb in a given
sentence.
Comprehension / Understanding. The students must be able to determine the appropriate form
of a verb to be used given the subject of a sentence.
Application / Applying. The students must be able to write sentences observing rules on subject-
verb agreement.
Analysis / Analyzing. The students must be able to break down a given sentence into its subject
and predicate.
Evaluation / Evaluating. The students must be able to evaluate whether or not a sentence
observes rules on subject-verb agreement.
Synthesis / Creating. The students must be able to formulate rules to be followed regarding
subject-verb agreement.
Deciding on the type of objective test. The test objectives guide the kind of objective test that will
be designed and constructed by the teacher. For instance, for the first four (4) levels, we may want to
construct a multiple-choice type of test, while for application and judgement, we may opt to give an
essay test or a modified essay test.
Preparing a table of specifications (TOS). A Table of Specifications or TOS is a test map that
guides a teacher in constructing a test. The TOS ensures that there is a balance between items that test
lower level thinking skills and those which test higher order thinking skills (or alternatively, a balance
between easy and difficult items) in the test. The simplest TOS consists of four (4) columns: (a) level of
objective to be tested, (b) statement of objective, (c) item numbers where such an objective is being
tested, and (d) number of items and percentage out of the total for that particular objective. A prototype
table is shown below.
LEVEL              OBJECTIVE                                      ITEM NUMBERS       NO.       %
1. Knowledge       Identify subject-verb                          1,3,5,7,9          5         14.29%
2. Comprehension   Form appropriate verb forms                    2,4,6,8,10         5         14.29%
3. Application     Write sentences observing rules on             11,13,15,17,19     5         14.29%
                   subject-verb agreement
4. Analysis        Determine subject and predicate                12,15,18,21,23     5         14.29%
5. Evaluation      Evaluate whether or not a sentence observes    13,16,19,22,24     5         14.29%
                   rules on subject-verb agreement
6. Synthesis       Formulate rules on subject-verb agreement      Part II            10 pts    28.57%
TOTAL                                                                                35        100%
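The percentage column of the prototype table can be checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary name tos_points and the function tos_percentages are hypothetical, and the point values are copied from the table (five levels with 5 items each, plus a 10-point essay).

```python
# Point values per level, copied from the prototype TOS
# (five 5-item levels plus a 10-point essay for synthesis).
tos_points = {
    "Knowledge": 5,
    "Comprehension": 5,
    "Application": 5,
    "Analysis": 5,
    "Evaluation": 5,
    "Synthesis": 10,
}


def tos_percentages(points):
    """Return each level's share of the total points, rounded to 2 decimals."""
    total = sum(points.values())
    return {level: round(100 * n / total, 2) for level, n in points.items()}


percentages = tos_percentages(tos_points)
for level, pct in percentages.items():
    print(f"{level:<15}{tos_points[level]:>3}{pct:>8.2f}%")
```

Each 5-item level works out to 5/35 of the test and the essay to 10/35, which matches the 14.29% and 28.57% entries in the table.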
In the Table of Specifications, we see that there are five items that deal with knowledge, and these
items are 1, 3, 5, 7, 9. Similarly, from the same table we see that five items represent analysis, namely:
12, 15, 18, 21, 23. The first five levels of Bloom's taxonomy are equally represented in the test, while
synthesis (tested through essay) is weighted equivalent to ten (10) points, or double the weight given to
any of the first five levels. The Table of Specifications guides the teacher in formulating the test. As we
can see, the TOS also ensures that each of the objectives in the hierarchy of educational objectives is well
represented in the test. As such, the resulting test that will be constructed by the teacher will be more or
less comprehensive. Without the Table of Specifications, the tendency is for the test maker to focus
too much on facts and concepts at the knowledge level.
Constructing the test items. The actual construction of the test items follows the TOS. As a
general rule, it is advised that the actual number of items to be constructed in the draft should be double
the desired number of items. For instance, if there are five (5) knowledge level items to be included in the
final test form, then at least ten (10) knowledge level items should be included in the draft. The
subsequent test try-out and item analysis will most likely eliminate many of the constructed items in the
draft (because they are too difficult, too easy or non-discriminating); hence, it will be necessary to
construct more items than will actually be included in the final test form.
Most often, however, the try-out is not done due to lack of time.
Item analysis and try-out. The test draft is tried out on a group of pupils or students. The purpose
of this try-out is to determine (a) the item characteristics through item analysis, and (b) the
characteristics of the test itself: validity, reliability, and practicality.
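The text does not give formulas for item analysis, so the following is only a sketch under common assumptions: the difficulty index is taken as the proportion of examinees who answered an item correctly, and the discrimination index as the difference in correct answers between an upper and a lower scoring group, divided by the group size. The function name item_analysis, the 27% group fraction, and the sample try-out data are all hypothetical.

```python
def item_analysis(responses, group_fraction=0.27):
    """Compute (difficulty, discrimination) for every item.

    responses: one list per student of 0/1 item scores (1 = correct).
    group_fraction: assumed share of students in the upper and lower groups.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    # Rank students by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    k = max(1, round(group_fraction * n_students))
    upper, lower = ranked[:k], ranked[-k:]

    results = []
    for i in range(n_items):
        # Difficulty: proportion of all students who got item i right.
        difficulty = sum(r[i] for r in responses) / n_students
        # Discrimination: upper-group correct minus lower-group correct, per student.
        discrimination = (sum(r[i] for r in upper) - sum(r[i] for r in lower)) / k
        results.append((difficulty, discrimination))
    return results


# Hypothetical try-out of a 3-item draft with 8 students.
tryout = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
]
results = item_analysis(tryout)
for number, (p, d) in enumerate(results, start=1):
    print(f"Item {number}: difficulty={p:.3f}, discrimination={d:.2f}")
```

In this made-up data, item 1 is answered correctly by everyone, so it is too easy and does not discriminate, which is the kind of item the try-out is meant to weed out.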
Example: The Philippines gained its independence in 1898 and therefore celebrated its centennial year in
2000. _____
Obviously, the answer is FALSE because 100 years from 1898 is not 2000 but 1998.
Rule 2. Avoid using the words “always”, “never”, “often” and other words that tend to make a statement
either always true or always false.
Statements that use the word “always” are almost always false. A test-wise student can easily guess
his or her way through such a test and get high scores even without knowing the subject matter.
Rule 3. Avoid long sentences as these tend to be “true”. Keep sentences short.
Example: Tests need to be valid, reliable and useful, although it would require a great amount of time
and effort to ensure that tests possess these characteristics. ________
Notice that the statement is true. However, we are also not sure which part of the sentence is
deemed true by the student. It is just fortunate that in this case, all parts of the sentence are true and
hence, the entire sentence is true. The following example illustrates what can go wrong in long sentences:
Example: Tests need to be valid, reliable and useful since it takes very little amount of time, money and
effort to construct a test with these characteristics.
The first part of the sentence is true but the second part is debatable and may, in fact, be false.
Thus a “true" response is correct and also, a “false" response is correct.
Rule 4. Avoid trick statements with some minor misleading word or spelling anomaly, misplaced
phrases, etc. A wise student who does not know the subject matter may detect this strategy and thus get
the answer right.
Rule 5. Avoid quoting verbatim from reference materials or textbooks. This practice sends the wrong
signal to the students that it is necessary to memorize the textbook word for word, and thus acquisition
of higher level thinking skills is not given due importance.
Rule 6. Avoid specific determiners or give-away qualifiers. Students quickly learn that strongly worded
statements are more likely to be false than true, for example, statements with “never”, “no”, “all” or
“always”. Moderately worded statements are more likely to be true than false. Moderately worded
statements use “many”, “often”, “sometimes”, “generally”, “frequently” or “some”; these qualifiers
usually should be avoided. Example: Executives usually suffer from hyperacidity. The statement tends to
be correct. The word “usually” leads to the answer.
Rule 7. With true or false questions, avoid a grossly disproportionate number of either true or false
statements or even patterns in the occurrence of true and false statements.
1.T 1.T
2.F 2.F
3.F 3.T
4.F 4.F
5.F or 5.T
6.F 6.F
7.F 7.T
8.F 8.F
9.F 9.T
10.F 10.F
For ease of correction, teachers sometimes create a pattern of True or False answers. Students
will sense it and may arrive at a correct answer not because they really know the answer but because
they sense the pattern.
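Rule 7 can be checked mechanically before a test is printed. The following is a minimal sketch, not a standard procedure: the function name check_answer_key and the three measures it reports are assumptions chosen for illustration, and the two sample keys are the two patterns shown above.

```python
def check_answer_key(key):
    """Summarize a true-false answer key given as a string of 'T' and 'F'."""
    n = len(key)
    # Share of items keyed True; values far from 0.5 are disproportionate.
    true_share = key.count("T") / n
    # Strict alternation: every answer differs from the one before it.
    alternating = n > 1 and all(a != b for a, b in zip(key, key[1:]))
    # Longest run of identical consecutive answers.
    longest_run, run = 1, 1
    for prev, cur in zip(key, key[1:]):
        run = run + 1 if cur == prev else 1
        longest_run = max(longest_run, run)
    return {
        "true_share": true_share,
        "alternating": alternating,
        "longest_run": longest_run,
    }


lopsided = check_answer_key("TFFFFFFFFF")      # first pattern shown above
alternating = check_answer_key("TFTFTFTFTF")   # second pattern shown above
print(lopsided)
print(alternating)
```

The first key is flagged by its low share of true statements and its long run of F answers, while the second is balanced but strictly alternating. Both are patterns a test-wise student could exploit.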
2) Do not use modifiers that are vague and whose meaning can differ from one person to the next, such
as: much, often, usually, etc.
Example:
Much of the process of photosynthesis takes place in the:
a. Bark
b. Leaf
c. Stem
The qualifier “much” is vague and could have been replaced by a more specific qualifier like
“90% of the photosynthetic process” or some similar phrase that would be more precise.
3) Avoid complex or awkward word arrangements. Also, avoid use of negative in the stem as this may
add unnecessary comprehension difficulties.
Example:
(Poor) As President of the Republic of the Philippines, Corazon Cojuangco Aquino would stand
next to which President of the Philippine Republic subsequent to the 1986 EDSA Revolution?
(Better) Who was the President of the Philippines after Corazon C. Aquino?
4) Do not use negatives or double negatives as such statements tend to be confusing. It is best to use
simpler sentences rather than sentences that would require expertise in grammatical construction.
Example:
(Poor) Which of the following will not cause inflation in the Philippine economy?
(Better) Which of the following will cause inflation in the Philippine economy?
Poor: What does the statement “Development patterns acquired during the formative years are NOT
unchangeable” imply?
a.
b.
c.
d.
Better: What does the statement “Development patterns acquired during the formative years are
changeable” imply?
a.
b.
c.
d.
5) Each item stem should be as short as possible; otherwise you risk testing more for reading and
comprehension skills.
6) Distracters should be equally plausible and attractive.
Example:
The short story “May Day Eve” was written by which Filipino author?
a. Jose Garcia Villa
b. Nick Joaquin
c. Genoveva Edrosa Matute
d. Robert Frost
e. Edgar Allan Poe
If distracters had all been Filipino authors, the value of the item would be greatly increased. In
this particular instance, only the first three carry the burden of the entire item since the last two can be
essentially disregarded by the students.
7) All multiple choice options should be grammatically consistent with the stem.
Example:
As compared to the autos of the 1960s, autos in the 1980s ______.
a. traveling slower
b. bigger interiors
c. to use less fuels
d. contain more safety measures
8) The length, explicitness, or degree of technicality of alternatives should not be determinants of the
correctness of the answer. The following example illustrates this rule.
Example:
If the three angles of two triangles are congruent then the triangles are:
a. congruent whenever one of the sides of triangles are congruent
b. similar
c. equiangular and therefore, must also be congruent
d. equilateral if they are equiangular
The correct choice, “b," may be obvious from its length and explicitness alone. The other choices
are long and tend to explain why they must be the correct choices forcing the students to think that they
are, in fact, not the correct answers!
B. Which group will most strongly focus its teaching on the interest of the child?
a. Progressivist c. Perennialist
b. Essentialist d. Reconstructionist
One may arrive at a correct answer (letter b) by looking at item a, which gives the answer to b.
10) Avoid alternatives that are synonymous with others or those that include or overlap others.
Example:
What causes ice to transform from solid state to liquid state?
A. Change in temperature
B. Change in pressure
C. Change in the chemical composition
D. Change in heat levels
Options a and d are essentially the same. Thus, a student who spots these identical choices
would right away narrow down the field of choices to a, b, and c. The last distracter would play no
significant role in increasing the value of the item.
If this happens then the item has two answers, which is not acceptable.
11) Avoid presenting sequenced items in the same order as in the text.
12) Avoid use of assumed qualifiers that many examinees may not be aware of.
13) Avoid use of unnecessary words or phrases which are not relevant to the problem at hand (unless
such discriminating ability is the primary intent of the evaluation). The item's value is particularly
damaged if the unnecessary material is designed to distract or mislead. Such items test the student's
reading comprehension rather than knowledge of the subject matter.
Example:
The side opposite the thirty-degree angle in a right triangle is equal to half the length of the
hypotenuse. If the sine of a 30-degree angle is 0.5 and the hypotenuse is 5, what is the length of the side
opposite the 30-degree angle?
A. 2.5
B. 3.5
C. 5.5
D. 1.5
The sine of the 30-degree angle is really quite unnecessary since the first sentence already gives
the method for finding the length of the side opposite the thirty-degree angle. This is a case of a teacher
who wants to make sure that no student in his class gets the wrong answer.
14) Avoid use of non-relevant sources of difficulty such as requiring a complex calculation when only
knowledge of a principle is being tested.
Note in the previous example, knowledge of the sine of the 30-degree angle would have led some
students to use the sine formula for calculation even if a simpler approach would have sufficed.
15) Pack the question in the stem. Here is an example of an item which has no question. Avoid it by all
means.
Example:
The Roman Empire ________
A. Had no central government
B. Had no definite territory
C. Had no heroes
D. Had no common religion
16) Use the “none of the above" option only when the keyed answer is totally correct. When choice of the
“best" response is intended, “none of the above" is not appropriate, since the implication has already
been made that the correct response may be partially inaccurate.
17) Note that use of “all of the above” may allow credit for partial knowledge. In a multiple option item
(allowing only one option choice), if a student only knew that two (2) options were correct, he could then
deduce the correctness of “all of the above”. This assumes you are allowed only one correct answer.
18) Better still, use “none of the above” and “all of the above” sparingly. It is best not to use them at all.
19) Having compound response choices may purposefully increase difficulty of an item.
The difficulty of a multiple choice item may be controlled by varying the homogeneity or degree of
similarity of responses. The more homogeneous, the more difficult the item because they all look like the
correct answer.
Example:
(Less Homogeneous)
Thailand is located in:
A. Southeast Asia
B. Eastern Europe
C. South America
D. East Africa
E. Central America
(More Homogeneous)
Example: Match the items in column A with the items in column B.
A                                          B
___1. First President of the Republic      A. Magellan
___2. National Hero                        B. Mabini
___3. Discovered the Philippines           C. Rizal
___4. Brain of Katipunan                   D. Lapu-Lapu
___5. The great painter                    E. Aguinaldo
___6. Defended Limasawa island             F. Juan Luna
                                           G. Antonio Luna
2. The stem (longer in construction than the options) must be in the first column while the
options (usually shorter) must be in the second column.
3. The options must be more in number than the stems to prevent the student from arriving at the
answer by mere process of elimination.
4. To help the examinee find the answer easier, arrange the options alphabetically or
chronologically.
5. Like any other test, the direction of the test must be given. The examinees must know exactly
what to do.
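Guidelines 3 and 4 lend themselves to a quick automated check. The sketch below is an illustration only: the function name prepare_matching is hypothetical, and the stems and options are taken from the matching example above. It refuses a test whose options do not outnumber the stems and returns the options in alphabetical order with fresh letter labels.

```python
def prepare_matching(stems, options):
    """Check guideline 3 and apply guideline 4 to a matching-type test."""
    # Guideline 3: there must be more options than stems.
    if len(options) <= len(stems):
        raise ValueError("options should outnumber stems")
    # Guideline 4: arrange the options alphabetically (case-insensitive).
    ordered = sorted(options, key=str.lower)
    labels = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    return [f"{labels[i]}. {option}" for i, option in enumerate(ordered)]


stems = [
    "First President of the Republic",
    "National Hero",
    "Discovered the Philippines",
    "Brain of Katipunan",
    "The great painter",
    "Defended Limasawa island",
]
options = ["Magellan", "Mabini", "Rizal", "Lapu-Lapu",
           "Aguinaldo", "Juan Luna", "Antonio Luna"]

column_b = prepare_matching(stems, options)
for line in column_b:
    print(line)
```

With six stems and seven options the check passes, and the relabeled column B begins with Aguinaldo and ends with Rizal.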
Mental Exercise
Analyze the matching type of test below. Is this perfect (an answer may not be repeated) matching type
of test written in accordance with the guidelines given?
Exercise - Matching Type of Test
Column A                     Column B
1. Poly                      A. Sides
2. Triangle                  B. Eight-sided polygon
3. Pentagon                  C. Ten-sided polygon
4. Square                    D. Closed plane figure
5. Decagon                   E. Irving
6. Hexagon                   F. James
7. Isosceles triangle        G. Melville
8. Octagon                   H. Mark Twain (Clemens)
9. Gons                      I. Wharton
10. Circle                   J. Many
Matching type items, unfortunately, often test lower order thinking skills (knowledge level) and
are unable to test higher order thinking skills such as application and judgement skills.
In column 1 are works and writings in American literature and in column 2 are their authors. In
some cases, an answer may be repeated.
Column A                                     Column B
1. The Alhambra                              A. Cooper
2. The Pioneers                              B. Dana
3. The Guardian Angel                        C. Emerson
4. Two Years Before the Mast                 D. Holmes
5. Moby Dick                                 E. Irving
6. The World in a Man of War                 F. James
7. The Last of the Mohicans                  G. Melville
8. The American Scholar                      H. Mark Twain
9. The Autocrat of the Breakfast Table       I. Wharton
10. Tom Sawyer
If you intend to make use of this imperfect type of matching test, make sure you indicate so in the
“Direction" to caution the students who usually think an answer may not be repeated.
There are other significant items to ask about other than specific birthdates.
5. The length of the blanks must not suggest the answer. So it is better to make the blanks uniform in
size.
A part of speech that names persons, places or things is ______.
A word used to connect clauses or sentences or to coordinate words in the same clause is called
_________.
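Making the blanks uniform is easy to automate when the items are typed. This is a minimal sketch under stated assumptions: the function name normalize_blanks and the default width of 10 underscores are hypothetical choices, and any run of three or more underscores is treated as a blank. The two sample items are the ones shown above.

```python
import re


def normalize_blanks(item, width=10):
    """Replace every run of 3 or more underscores with a blank of fixed width."""
    return re.sub(r"_{3,}", "_" * width, item)


items = [
    "A part of speech that names persons, places or things is ______.",
    "A word used to connect clauses or sentences or to coordinate words "
    "in the same clause is called _________.",
]
uniform_items = [normalize_blanks(item) for item in items]
for item in uniform_items:
    print(item)
```

After normalization both items end with a blank of the same length, so the size of the blank no longer hints at the length of the expected answer.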
5.6.2 Essays
Essays, classified as non-objective tests, allow for the assessment of higher order thinking skills.
Such tests require students to organize their thoughts on a subject matter in coherent sentences in order to
inform an audience. In essay tests, students are required to write one or more paragraphs on a specific
topic.
Essay questions can be used to measure attainment of a variety of objectives.
1. Comparing
- Describe the similarities and differences between...
- Compare the following methods for...
2. Relating cause and effect
- What are the major causes of ...
- What would be the most likely effects of...
3. Justifying
- Which of the following alternatives would you favor and why?
- Explain why you agree or disagree with the following statement.
4. Summarizing
- State the points included in...
- Briefly summarize the contents of...
5. Generalizing
- Formulate several valid generalizations from the following data.
- State a set of principles that can explain the following events.
6. Inferring
- In the light of the facts presented, what is most likely to happen when...
- How would Senator X be most likely to react to the bomb explosion after the bar examination
last September?
7. Classifying
- Group the following items according to...
- What do the following items have in common?
8. Applying
- Using the principles of ___ as guide, describe how you would solve the following problem
situation.
- Describe a situation that illustrates the principle of ____.
9. Analyzing
- Describe the reasoning errors in the following paragraphs.
- List and describe the main characteristics of....
10. Evaluating
- Describe the strengths and weaknesses of the following...
- Using the criteria developed in class, write an evaluation of...
11. Creating
- Make up a story describing what would happen if...
- Design a plan to prove that...
- Write a well-organized report that shows...
5.6.2.1 Types of Essay
Restricted Essay
It is also referred to as short focused response. Examples are asking students to “write an
example," “list three reasons," or “compare and contrast two techniques."
Part A Identify at least how other actions that would make Robert's demonstration
better.
Note that all these involve higher-level skills mentioned in Bloom's taxonomy.
The following are rules of thumb which facilitate the scoring of essays:
Rule 1: Phrase the direction in such a way that students are guided on the key concepts to be included.
Specify how the students should respond.
Example
Using details and information from the article (Hundred Islands), summarize the main points of
the article. For a complete and correct response, consider these points:
Its history (10 pts)
Its interesting features (10 pts)
Why is it a landmark (5 pts)
Non-example
Using details and information from the article (Hundred Islands), summarize the main points of the
article.
Rule 2: Inform the students of the criteria to be used for grading their essays. This rule
allows the students to focus on relevant and substantive materials rather than on peripheral and
unnecessary facts and bits of information.
Example: Write an essay about the topic “Plant Photosynthesis” using the keywords indicated. You will
be graded according to the following criteria: (a) coherence, (b) accuracy of the statements, (c) use of
keywords, (d) clarity and (e) extra points for innovative presentation of ideas.
Rule 3: Put a time limit on the essay test.
Rule 4: Decide on your essay grading system prior to getting the essays of your students.
Rule 5: Evaluate all of the students' answers to one question before proceeding to the next question.
Scoring or grading essay tests question by question, rather than student by student, makes it
possible to maintain a more uniform standard for judging the answers to each question. This procedure
also helps offset the halo effect in grading. When all of the answers on a paper are read together, the
grader's impression of the paper as a whole is apt to influence the grades he assigns to the individual
answers. Grading question by question, of course, prevents the formation of this overall impression of the
student's paper. Each answer is more apt to be judged on its own merits when it is read and compared
with other answers to the same question than when it is read and compared with other answers by the
same student.
Rule 6: Evaluate answers to essay questions without knowing the identity of the writer. This is another
attempt to control personal bias during scoring. Answers to essay questions should be evaluated in terms of
what is written, not in terms of what is known about the writers from other contacts with them. The best
way to prevent our prior knowledge from influencing our judgement is to evaluate each answer without
knowing the identity of the writer. This can be done by having the students write their names on the back
of the paper or by using code numbers in place of names.
Rule 7: Whenever possible, have two or more persons grade each answer. The best way to check on the
reliability of the scoring of essay answers is to obtain two or more independent judgements. Although this
may not be a feasible practice for routine classroom testing, it might be done periodically with a fellow
teacher (one who is equally competent in the area). Obtaining two or more independent ratings is especially
vital where the results are to be used for important and irreversible decisions, such as in the selection of
students for further training or for special awards. Here the pooled ratings of several competent persons
may be needed to attain a level of reliability that is commensurate with the significance of the decision
being made.
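The text does not say how independent ratings should be pooled, so the following is a sketch under one simple assumption: the pooled rating is the mean of the independent scores, and an answer is set aside for a second look when the raters differ by more than a chosen gap. The function name pool_ratings, the max_gap threshold, the student codes, and the scores are all hypothetical.

```python
from statistics import mean


def pool_ratings(ratings, max_gap=2):
    """Average independent ratings and flag answers where raters disagree widely.

    ratings: dict mapping a student code number to a list of independent scores.
    max_gap: assumed largest acceptable difference between any two raters.
    """
    pooled = {}
    needs_review = []
    for code, scores in ratings.items():
        pooled[code] = mean(scores)
        if max(scores) - min(scores) > max_gap:
            needs_review.append(code)
    return pooled, needs_review


# Hypothetical scores from two teachers, keyed by code number rather than name.
ratings = {"S01": [8, 9], "S02": [5, 9], "S03": [7, 7]}
pooled, needs_review = pool_ratings(ratings)
print(pooled)
print(needs_review)
```

Keying the ratings by code number rather than by name also keeps the pooling consistent with Rule 6.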
Some teachers use the cumulative criteria, i.e., adding the weights given to each criterion, as the
basis for grading while others use the reverse. In the latter method, each student begins with a score of
100. Points are then deducted every time the teacher encounters a mistake or when a criterion is missed
by the student in his essay.
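The two grading approaches can be contrasted in a short sketch. The function names, the capping of each criterion at its weight, the floor of zero, and the sample scores are assumptions made for illustration. The rubric weights are the ones from the Hundred Islands example (10, 10 and 5 points).

```python
def cumulative_score(earned, rubric):
    """Add up the points earned per criterion, capped at each criterion's weight."""
    return sum(min(earned.get(criterion, 0), weight)
               for criterion, weight in rubric.items())


def deduction_score(deductions, start=100):
    """Start from a full score and subtract points for each mistake or missed criterion."""
    # The floor of zero is an assumption; the text does not say what happens below it.
    return max(0, start - sum(deductions))


# Weights from the Hundred Islands example.
rubric = {"history": 10, "interesting features": 10, "why it is a landmark": 5}

# Hypothetical points one student earned on each criterion.
earned = {"history": 8, "interesting features": 10, "why it is a landmark": 3}
cumulative = cumulative_score(earned, rubric)

# Hypothetical deductions for three mistakes in another essay.
reverse = deduction_score([5, 3, 2])

print(cumulative, "out of", sum(rubric.values()))
print(reverse, "out of 100")
```

Whichever method is used, Rule 4 still applies: the weights or deductions should be fixed before the essays are collected.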
Rule 8: Do not provide optional questions. It is difficult to construct questions of equal difficulty, and
so the teacher cannot make a valid comparison of students' achievement.
Rule 9: Provide information about the value/weight of the question and how it will be scored.
Rule 10: Emphasize higher level thinking skills.
Example:
Scientists have found that oceans can influence the temperature of nearby landmasses. Coastal
landmasses tend to have more moderate temperatures in summer and winter than inland landmasses of
the same latitude.
Non Example:
Considering the influence of ocean temperatures, explain why inland landmass temperatures
vary in summer and winter to a greater degree than coastal temperatures. List three coastal landmasses
Answer Sheets
Name:
Year and Section
Date
Exercise Number
5.7 Exercises
Let's have some mental exercise to test your understanding.
EXERCISE I
A. Give non-examples of each of the following rules of thumb in the construction of a true-false test.
Improve on the non-examples for them to become good examples of tests.
Example:
- Jean Piaget made some revolutionary discoveries about child behavior during the
nineteenth century
- Answer: False. Although Piaget did make discoveries about child behavior, he did so
during the twentieth century.
2. Avoid using the words “always”, “never” and other such adverbs which tend to be always
true or always false.
Example:
- All types of cars have some type of engine.
- Answer: True. Even though the absolute term “all” could tend to make this question false,
the qualifier “some” makes the question more general and allows for possibilities (“some
type of engine”: doesn’t have to be the familiar gasoline- driven engine).
5. Avoid ambiguous sentences which can be interpreted as true and at the same time false.
B. Give non-examples of each of the following rules of thumb in the construction of a multiple choice test.
Improve on the non-examples for them to become good examples of tests.
1. Phrase the stem to allow for only one correct or best answer.
2. Avoid giving away the answer in the stem.
3. Choose distracters appropriately.
4. Choose distracters so that they are all equally plausible and attractive.
5. Phrase questions so that they will test higher order thinking skills.
6. Do not ask subjective questions or options for which there are no right or wrong answers.
EXERCISE II
A. Construct a 10-item matching type to test this competency: Identify the computer system - i.e.
parts, other components.
B. Construct a 10-item supply type test to assess this competency: Identify farm tools according to
use (Grade 7-8 Curriculum Guide; Agriculture Fishery).
Plant Photosynthesis
Nature has its own way of ensuring the balance between food producers and consumers. Plants are
considered producers of food for animals. Plants produce food for animals through a process called
photosynthesis. It is a complex process that combines various natural elements on earth into the final
product which animals can consume in order to survive. Naturally, we all need to protect plants so that
we will continue to have food on our table, they cannot perform photosynthesis and animals will also
perish.
G. Give an example of a supply type of test that will measure higher order thinking skills (beyond mere
recall of facts and information.)
H. In what sense is a matching type test a variant of a multiple choice type of test? Justify your answer.
I. In what sense is a supply type of test considered a variant of a multiple choice type of test?
(Hint: In the supply type, the choices are not explicitly given.) Does this make the supply type of
test more difficult than the closed multiple choice type of test? How?
J. Choose learning competencies from the K to 12 Curriculum Guide. Construct an aligned
paper-and-pencil test observing the guidelines in test construction.