AoL1 Module 1

DepEd (2015) defined assessment as “a process to keep track of learners’ progress in relation to learning standards and in the development of 21st century skills; to promote self-reflection and personal accountability among students about their own learning; and to provide bases for the profiling of student performance on the learning competencies and standards of the curriculum.”

Meanwhile, Mentz & Lubbe (2021) defined assessment as “the process of gathering information” wherein the assessor’s intention directly affects the tools, methods, and strategies incorporated in said process. Further, Popham (2017) said that assessment is “a formal attempt to determine students’ status in educational variables of interest”. These definitions clearly show that assessment must be done with clearly stated aims and objectives.
Here are other definitions of assessment:

▪ “a process for documenting, in measurable terms, the knowledge, skills, attitudes, and beliefs of the learner” (Delclos et al., 1992; Poehner, 2007)
▪ “the collection of relevant information that may be relied on for making decisions” (Fenton, 1996)
▪ “a related series of measures used to determine complex attribute of an individual or group of individuals. It is the process of observing and measuring learning” (Oosterhof, 2001)


Saint Thomas University (2018) defined measurement as “determining the attributes or dimensions of an object, skill or knowledge.” As discussed earlier, measurement tools in education consist of tests conducted during the learning process. Examples of measures include the average or mean grade, percentile rank, etc. More of these will be discussed in later modules.
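To make these measures concrete, here is a minimal illustrative sketch in Python (not part of the original module; the sample scores and the particular definition of percentile rank used are assumptions for illustration only):

# Illustrative only: two common educational measures.
def mean(scores):
    """Mean (average) grade of a group of scores."""
    return sum(scores) / len(scores)

def percentile_rank(scores, score):
    """Percentage of scores in the group falling below a given score
    (one common definition; textbooks vary on how ties are handled)."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

class_scores = [72, 85, 90, 65, 78, 88, 95, 70, 82, 75]  # made-up data
print(mean(class_scores))                 # 80.0
print(percentile_rank(class_scores, 85))  # 60.0 -> higher than 60% of the class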
Measurement is synonymous with testing. Put briefly, a test is “used to examine someone's knowledge of something to determine what he or she knows or has learned” (Penn State University, 2017).

McMillan (2017) defined it as “a systematic process of assigning numbers to behavior or performance and is used to determine how much of a trait, attribute, or characteristic an individual possesses.”

Testing also includes all the physical procedures done in the administration of the tests.
Penn State University (2017) states that evaluation is “the process of making
judgments based on criteria and evidence.”

This is supported by St. Thomas University’s (2018) definition of evaluation as “the process of using the measurements gathered in the assessments”. They further added that information gathered in assessments is used to link “what was intended” to “what was learned” through the measurements stated.

The process of evaluation also allows teachers to determine “what students know and understand, how far they have progressed and how fast, and how their scores and progress compare to those of other students.” As much as getting information from the learners, interpreting that information is equally important in the learning process.
According to Popham (2017), a standardized test “is a test, designed to yield either norm-referenced or criterion-referenced inferences, that is administered, scored, and interpreted in a standard, predetermined manner.” Examples of standardized tests include the National Achievement Test, the UPCAT, and other large-scale national tests.

Oxford Bibliographies (2023) defines a test as high-stakes if “the outcomes are used to make decisions about promotion, admissions, graduation, and salaries.”

This is in line with an earlier definition by Madaus & Keillor (1988): the use of any test in making critical decisions affecting an individual or a group of individuals. Results of such a test can lead to “punishments, rewards, or advancement of individuals or programs”.
Federation University Australia (2023) stated four functions of assessment in education:
1. Certification – Marking and certifying students’ achievement through a collection of activities
such as assignments, exams, and performance tasks.
2. Quality assurance – Ensuring institutional and academic standards are scrutinized through careful
study of students’ output.
3. Learning – Completion of assessment tasks engages students in the learning process and provides them with “formative and diagnostic functions” relevant to their progress.
4. Lifelong learning – “Developing students' ability to self-assess and self-regulate their learning
beyond formal requirements.”
In a lecture by Choo (2014), seven basic principles of assessment were discussed, which are as follows:
• Reliability
• Validity
• Practicality
• Authenticity
• Objectivity
• Interpretability
• Washback Effect
Brown (2010, cited in Choo, 2014) states that an assessment tool is reliable if it:

▪ is “consistent in its conditions across two or more administrations”
▪ “gives clear directions for scoring or evaluation”
▪ “has uniform rubrics (criteria for pointing) for scoring/evaluation”
▪ “lends itself to consistent application of those rubrics by the scorer”
▪ “contains items/tasks that are unambiguous to the test-taker”

Choo (2014) further added that reliable tools should be able to produce stable and consistent results, while McMillan (2001, cited in Choo, 2014) added that “reliability also denotes consistency, stability, dependability, and accuracy of assessment results”.

It is important to note that scores change depending on the test-takers, scorers (usually teachers), and other factors, which makes reliability a critical factor in properly assessing learning.
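As a hedged illustration (the module defers formal reliability methods to later modules), one common way to quantify the consistency of scores is to correlate results from two administrations of the same test; the scores below are made-up sample data:

# Illustrative only: test-retest consistency via Pearson correlation.
first  = [80, 72, 90, 65, 85]   # made-up scores, first administration
second = [78, 75, 92, 60, 88]   # same five students, second administration

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

print(round(pearson(first, second), 2))  # ~0.97 -> stable, consistent results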
There are different ways to establish the reliability of an assessment tool; these
ways will be discussed in later modules. However, several factors directly affect
the reliability of any assessment tool. Choo (2014) has enumerated them as
follows.
▪ Test factors
▪ Test administration factors
▪ Marking factors
▪ Teacher and student factors
▪ Environment factors
Marking Factors
▪ Longer tests are generally more reliable.
▪ Some test-takers may depend on guessing answers. Having more items will not reduce the tendency to guess, but it will result in a more accurate score since one usually does not guess through an entire exam or task.
▪ Further, objective tests (tests with items having an objectively correct answer) provide consistency in scoring, as correct answers are not open to interpretation.
▪ Not all tests that are reliable may be valid, as consistent scores may not necessarily measure what the tester aims to measure.
▪ Test items created from the same subject matter still have variability even if the tests are “equivalent”.


In the educational setting, teachers normally construct tests and administer them to their students. This connection, as argued by Choo (2014), increases the consistency of test results. Choo (2014) also added that “teachers’ encouragement, positive mental and physical condition, familiarity with the test formats, and perseverance and motivation” contribute to increasing reliability.
One can hypothesize that environments with poor ventilation, excessive noise, rickety chairs and desks, and a lack of light sources will adversely affect the reliability of tests and students’ general performance in them.
Choo (2014) states that “students’ grades are dependent on the way tests are being administered”; hence, test administrators must give concise instructions, allot appropriate test time, and monitor tests critically to increase their reliability.
Humans introduce variability and error when scoring essays and similar types of tests (Linn & Gronlund, 2000; Weigle, 2002), resulting in marker bias. Markers also tend to give different scores to the same answers even if a rubric is already in place. One way to reduce the effect of marking factors on test reliability is to use objective tests.
This principle is defined by McMillan (2017) as “a characteristic that refers to
the appropriateness of the inferences, uses, and consequences that result from
the assessment and is concerned with the soundness, trustworthiness, or
legitimacy of the claims or inferences that are made based on obtained scores.”
HBA Learning Centre (2021) says that validity ascertains that the assessment
measures an intended outcome rather than anything else.
For example, if one wants to measure a student’s ability to add fractions, one should construct test items on adding fractions, not items asking students to state what a fraction is. In other words, one should ask “What is ½ plus 1/3?” rather than “What is the numerator of ¾?”.
The different types of validity will be established in later modules.
Bachman & Palmer (1996, cited in Choo, 2014) state that authenticity is “the degree of correspondence of the characteristics of a given task to the features of the target task”.

Messier (2022) defines an assessment as authentic if it “involves the application of knowledge and skills in real-world situations, scenarios, or problems; and creates a student-centered learning experience by providing students opportunities to problem-solve, inquire, and create new knowledge and meaning.”
Choo (2014) asserts that a test is authentic if it:

▪ contains “language that is natural” or normal
▪ is composed of items contextualized in real-life scenarios
▪ includes “meaningful, relevant, and interesting topics”
▪ “provides thematic organization to items such as through a story line or episode”
▪ offers tasks that replicate real-world tasks
Choo (2014) defines objectivity as the extent to which “equally competent scorers obtain the same result”. Tests such as multiple-choice tests have high objectivity because each item has a single correct answer and scoring does not rely on the scorer’s skills or emotions. Standardized tests such as the National Achievement Test and wide-scale tests used in college admissions are also examples of tests with high objectivity.

Alternatively, tests such as essays have low objectivity, as scores in these tests depend on different factors such as examiners’ skills, rubrics, and other biases. It should be noted that teachers must not rely only on purely objective tests and should, instead, base the type of assessment tool on the aims of the assessor.
This principle is defined by Choo (2014) as “the logistical, down-to-earth, administrative issues involved in making, giving, and scoring an assessment instrument.” Another definition is given by Mousavi (2009) as “costs, the amount of time it takes to construct and to administer, ease of scoring, and ease of interpreting/reporting the results.” Teachers must note that both they and the test-takers allot certain resources to administer or take any given assessment tool.
Choo (2014) states that assessment is practical if it:

▪ stays within budgetary limits
▪ can be completed by the test-taker within an appropriate time frame
▪ has clear directions for administration
▪ appropriately utilizes available human resources
▪ does not exceed available material resources
▪ considers the time and effort involved for both design and scoring
Interpreting scores is the process of attaching meaning to them. Is a score of
7/12 a “good” score for a student?

Doing this requires the interpreter to have “knowledge about the test, which can be obtained by studying the manual or related tools along with current research literature with respect to its use.”

It is imperative to interpret scores only once this prerequisite knowledge is acquired.
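To make this concrete, here is a minimal illustrative sketch in Python (not from the module; the passing mark and group scores are made-up assumptions): the same raw score of 7/12 can be read against a criterion (a fixed standard) or against a norm (the rest of the group), and the two readings can disagree.

# Illustrative only: two ways to attach meaning to the same raw score.
raw, items = 7, 12
percent = 100 * raw / items        # ~58.3%

# Criterion-referenced reading: compare against a fixed standard.
passing_mark = 60                  # assumed cut-off, for illustration
print(percent >= passing_mark)     # False -> falls short of the criterion

# Norm-referenced reading: compare against the group's raw scores.
group = [4, 5, 5, 6, 6, 7, 8]      # made-up scores of classmates
below = sum(1 for s in group if s < raw)
print(100 * below / len(group))    # ~71.4 -> higher than most of the group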
Hughes (2003) defines ‘washback’ as the impact tests have on teaching and learning. This impact may not be entirely positive or negative. It should be noted that tests have the capacity to facilitate change: a bad test results in negative backwash while a good test results in positive backwash (Alderson, 1986; Pearson, 1988).
Brown (2010, cited in Choo, 2014) listed several factors that provide positive washback in a test:

▪ positively influence what and how teachers teach and learners learn
▪ offer learners a chance to adequately prepare
▪ give learners feedback that enhances their development
▪ provide conditions for peak performance of learners
▪ focus on formative rather than summative assessment

Washback in large-scale assessments refers to the effects tests have on instruction regarding students’ preparation and requirements for subsequent tests (Choo, 2014). For example, review centers often test their enrollees to determine their weakest points. In the education sector, the challenge lies in how teachers can create classrooms that produce positive washback.
On a separate note, Gravells (2014) laid out two sets of assessment principles, VACSR and SMART, with the former used as an aid in planning and assessing “in a way that should meet the expected requirements” and the latter used in setting clear expectations on what your learners need to achieve and in communicating such requirements to them.
Valid – the assessment process is appropriate to the subject or qualification, assesses only
what is meant to be assessed, and the learners’ work is relevant to the assessment requirements.
Authentic – the work has been produced solely by the learner.
Current – the work is still relevant at the time of assessment.
Sufficient – the work covers all assessment requirements.
Reliable – the work is consistent across all learners, over time and at the required level, i.e., if
the assessment was carried out again with similar learners, similar results would be achieved.
Specific – the activity relates only to what is being assessed and is clearly stated.
Measurable – the activity can be measured against the assessment requirements,
allowing any gaps to be identified.
Achievable – the activity can be achieved at the right level by the learner.
Relevant – the activity is realistic and will give consistent, valid and reliable results.
Time-bound – target dates and times can be achieved.
In DepEd Order 8, series of 2015, two types of assessment are identified: formative assessment and summative assessment. Earlier, Gravells (2014) identified three more types aside from those released by DepEd: initial, diagnostic, and holistic assessment. Further, assessments may be formal or informal: formal assessment results count toward the completion of a requirement, while informal assessment checks learner progression at any given point in the lesson.

Initial assessment aims to give teachers relevant information about learners, such as their specific assessment requirements or needs. This assessment also ensures that the learners are in the proper program and is, thus, done before any learning begins (Gravells, 2014). Initial assessments also ensure that any entry requirements to a program are satisfied. A simple question such as “What are your perceptions of Filipino Literature?” is an example of how initial assessment is conducted.
Gravells (2014) states that initial assessment can:

▪ allow for differentiation and individual requirements to be met
▪ ascertain why the learner wants to take the program along with their capability to achieve
▪ find out the expectations and motivations of your learner
▪ give your learner the confidence to negotiate suitable targets
▪ identify any information which needs to be shared with colleagues
▪ identify any specific additional support needs

Diagnostic assessment allows teachers to identify a learner’s current skills, knowledge and understanding of a particular subject area through demonstration, discussion, or performance. Gravells (2014) further argued that “the [diagnostic assessment] results will give a thorough indication of not only the level at which your learner needs to be placed for their subject but also which specific aspects they need to improve on”.
Gravells (2014) stated that diagnostic assessment can:

▪ ascertain learning preferences
▪ enable learners to demonstrate their current level of skills, knowledge and understanding
▪ ensure learners can access support such as study skills
▪ identify an appropriate starting point and level for each learner
▪ identify gaps in skills, knowledge and understanding to highlight areas to work on
▪ identify previous experience, knowledge, achievements and transferable skills
▪ identify specific requirements: for example, English, math and ICT skills

DepEd (2015) considers formative assessment to be assessment for learning, where teachers adjust the learning process based on the results, and assessment as learning, where students reflect on their own learning through feedback on assessment results. DepEd (2015) added that this assessment is “characteristically informal and is intended to help students identify strengths and weaknesses to learn from the assessment experience”.
Formative assessment can be performed at any stage of the teaching process and gives teachers an idea of the effectiveness of their instruction. This assessment is done through interaction with students and by delivering feedback that allows them to improve their learning. Further, formative assessments are to be recorded as evidence of learning, which is later analyzed in a developmental sense to better monitor student progress throughout the learning proper.

Formative assessments also instill in learners accountability and responsibility for their own learning. Teachers are tasked to give recommendations on how to improve learners’ performance based on the results of formative assessment; the students then decide whether to deliberately act on them. However, results of formative assessments are neither graded nor included as part of summative assessment.
Earl (2013) states that assessment for learning (AfL) practices aim to “close the gap between existing and anticipated learning”; hence, AfL’s main goal is to improve learning throughout the learning episodes. Assessment as learning (AaL), meanwhile, focuses on self-assessment, self-regulation, self-monitoring, and metacognition, further emphasizing the active role of learners in improving their own learning (Mentz & Lubbe, 2021).
[Figure extracted from Mentz & Lubbe (2021)]
DepEd (2015) describes summative assessment as assessment of learning, which happens at the end of a unit, chapter, program, or any period of learning to determine if the objectives of the respective parts have been achieved by the learners.

DepEd added that this assessment occurs to make “appropriate decisions about future learning or job suitability”.
Meanwhile, the UNESCO Program on Teaching and Learning for a Sustainable Future (UNESCO-TLSF, cited in DepEd, 2015) states that summative assessment is for the “benefit of the people rather than of the learner”, as this assessment measures whether a test-taker can demonstrate or show certain skills, knowledge, or capabilities.
Gravells (2014) remarks that summative assessments have “a tendency to teach purely what is required to achieve a pass which does not maximize a learner’s ability or potential; and may not help them in life or in work as they are not able to put theories into practice”, which emphasizes the need for authenticity in summative assessments.
Gravells (2014) defined holistic assessment as “a method of assessing several aspects of a qualification, unit, program or job specification at the same time.” It is also characterized by “a more efficient and quicker system as one piece of good quality evidence or a carefully planned observation.” Further, this assessment allows learners to integrate knowledge and skills through demonstration, aided or supplemented by questioning for further scrutiny.
[Figure extracted from McMillan (2017)]
At this juncture, McMillan’s (2017) enumeration of the different types of assessment methods is used; these methods are:

▪ Selected-response
▪ Constructed-response
▪ Performance-based
▪ Essay
▪ Oral questioning
▪ Teacher observations
▪ Student self-assessment
Selected-response tests show students a question or item and a set of responses they may choose from. Tests using this method include multiple-choice tests, binary-choice tests (otherwise known as true or false), identification, and matching type. They are usually objective – items have only one correct or best answer – which makes them easier to score because scoring is independent of judges’ biases and reduces to simply counting the “correct” answers selected by the learners.
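As a small illustration (not part of the module; the answer key and responses are made-up), objective scoring of a selected-response test reduces to counting matches against an answer key:

# Illustrative only: objective scoring of a selected-response test.
answer_key = ["B", "D", "A", "A", "C"]   # made-up key
responses  = ["B", "D", "C", "A", "C"]   # one student's made-up answers

# Count items where the selected answer matches the key;
# no scorer judgment is involved.
score = sum(1 for key, ans in zip(answer_key, responses) if key == ans)
print(f"{score}/{len(answer_key)}")      # 4/5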
This assessment method requires learners to construct or produce their own answers or outputs to a question or item. Some constructed-response tests are brief and require students to write short, concise answers to items. Examples of these are fill-in-the-blanks, short-answer tests, or simple mathematical problem-solving. Brief constructed-response items are also objective, as there is most often a single correct answer or idea that needs to appear in the learners’ responses.
Performance tasks are a type of constructed-response test that requires learners to make an “extensive and elaborate response”. These assessments are well-defined and, according to McMillan (2017), ask students to “create, produce, or do something, often in settings that involve real-world application of knowledge and skills through which students show their proficiency.” Responses in this method can be a performance or a product. Examples of performances include dances, speeches, recitals, etc., while examples of products include portfolios, paintings, posters, or models.
These constructed-response items require students to write extended, comprehensive responses that range from a few paragraphs (restricted-response) to multiple pages (extended-response). McMillan (2017) asserts that “restricted-response essay items include limits to the content and nature of the answer, whereas extended-response items allow greater freedom in response”.
In the classroom, teachers informally ask learners questions for formative purposes. In a more formalized format, questioning is used “to test or to determine student understanding through interviews or conferences” (McMillan, 2017).
Naturally, teachers observe their students whenever a learning episode is occurring; this is apparent enough to be considered a natural action in class. However, teachers must note that non-verbal cues such as “squinting, inattention, looks of frustration, and other cues” are more useful than verbal ones (McMillan, 2017). Further, teacher observations are also useful in assessing classroom conditions and instruction.
This method of assessment has students judge their own performance against established standards. Meanwhile, self-report inventories are forms or questionnaires asking about students’ attitudes and beliefs about others or themselves (McMillan, 2017). Peer assessment, where students evaluate their classmates, may also be used, though it is prone to its own problems.
Different institutions use varied methods of assessment in accordance with policies or standards unique to them. An extensive list of assessment methods in use can be found here; browsing it is recommended.
[Figure retrieved from Mentz & Lubbe (2021)]
