Assessment of Learning Handout

The document discusses assessment of learning, including tests, measurement, assessment, and evaluation. It describes different modes of assessment such as traditional tests, performance assessments, and portfolios. It also outlines principles of quality assessment and the domains of learning: cognitive, affective, and psychomotor.

Uploaded by Jojimar Julian

ASSESSMENT OF LEARNING

Test
■ An instrument designed to measure any quality, ability, skill, or knowledge.
■ Comprised of test items covering the area it is designed to measure.

Measurement
■ A process of quantifying the degree to which someone/something possesses a given trait (i.e. quality, characteristic, or feature).
■ A process by which traits, characteristics, and behaviors are differentiated.

Assessment
■ A process of gathering and organizing data into an interpretable form to provide a basis for decision-making.
■ A prerequisite to evaluation: it provides the information that enables evaluation to take place.

Evaluation
■ A process of systematic analysis of both qualitative and quantitative data in order to make a sound judgment or decision.
■ Involves judgment about the desirability of changes in students.

MODES OF ASSESSMENT

Traditional
■ Description: the objective paper-and-pen test, which usually assesses low-level thinking skills
■ Examples: Standardized Tests, Teacher-made Tests
■ Advantages: scoring is objective; administration is easy
■ Disadvantages: preparation of the instrument is time-consuming; prone to cheating

Performance
■ Description: a mode of assessment that requires actual demonstration of skills or creation of products of learning
■ Examples: Practical Tests, Oral and Aural Tests, Projects
■ Advantages: preparation of the instrument is relatively easy; measures complex skills
■ Disadvantages: scoring tends to be subjective without rubrics; administration is time-consuming

Portfolio
■ Description: a process of gathering multiple indicators of student progress to support course goals
■ Examples: Working Portfolios, Show Portfolios
■ Advantages: measures student's growth and development
■ Disadvantages: development is time-consuming; rating tends to be subjective without rubrics
TYPES OF ASSESSMENT ACCORDING TO PURPOSE

Placement
■ determines the prerequisite skills the students bring to instruction
■ places the students in specific learning groups to facilitate teaching and learning
■ serves as a basis in planning for a relevant instruction

Formative
■ not graded; administered during instruction
■ designed to formulate a plan to modify the teaching and learning process

Diagnostic
■ determines recurring or persistent difficulties
■ determines the students' strengths and weaknesses

Summative
■ graded; certifies mastery
■ determines the extent to which the pupils attained the objectives of the intended instruction
■ serves as a pretest for the next unit

PRINCIPLES OF HIGH-QUALITY ASSESSMENT

1) Clarity of Learning Targets
■ Clear and appropriate learning targets include (1) what students know and can do and (2) the criteria for judging student performance.

2) Appropriateness of Assessment Methods
■ The method of assessment to be used should match the learning targets.

3) Validity
■ The degree to which a score-based inference is appropriate, reasonable, and useful.

4) Reliability
■ The degree of consistency when several items in a test measure the same thing, and of stability when the same measures are given across time.

5) Fairness
■ Fair assessment is unbiased and provides students with opportunities to demonstrate what they have learned.

6) Positive Consequences
■ The overall quality of assessment is enhanced when it has a positive effect on student motivation and study habits. For teachers, high-quality assessments lead to better information and decision-making about students.

7) Practicality and Efficiency
■ Assessments should consider the teacher's familiarity with the method, the time required, the complexity of administration, the ease of scoring and interpretation, and cost.
A. COGNITIVE DOMAIN

Knowledge
■ Involves remembering or recalling previously learned material or a wide range of materials
■ Question cues: list, define, identify, name, recall, state, arrange

Comprehension
■ Ability to grasp the meaning of material by translating it from one form to another or by interpreting it
■ Question cues: describe, interpret, classify, differentiate, explain, translate

Application
■ Ability to use learned material in new and concrete situations
■ Question cues: apply, demonstrate, solve, interpret, use, experiment

Analysis
■ Ability to break down material into its component parts so that the whole structure is understood
■ Question cues: analyze, separate, explain, examine, discriminate, infer

Synthesis
■ Ability to put parts together to form a new whole
■ Question cues: integrate, plan, generalize, construct, design, propose

Evaluation
■ Ability to judge the value of material on the basis of definite criteria
■ Question cues: assess, decide, judge, support, summarize, defend

B. AFFECTIVE DOMAIN

Receiving
■ Willingness to receive or to attend to a particular phenomenon or stimulus
■ Illustrative verbs: acknowledge, ask, choose, follow, listen, reply, watch

Responding
■ Active participation on the part of the student
■ Illustrative verbs: answer, assist, contribute, cooperate, follow up, react

Valuing
■ Ability to see worth or value in a subject, activity, etc.
■ Illustrative verbs: adopt, commit, desire, display, explain, initiate, justify, share

Organization
■ Bringing together a complex of values, resolving conflicts between them, and beginning to build an internally consistent value system
■ Illustrative verbs: adapt, categorize, establish, generalize, integrate, organize

Value Characterization
■ Values have been internalized and have controlled one's behavior for a sufficiently long period of time
■ Illustrative verbs: advocate, behave, defend, encourage, influence, practice

C. PSYCHOMOTOR DOMAIN

Imitation
■ Early stages in learning a complex skill, after an indication of readiness to take a particular type of action
■ Illustrative verbs: carry out, assemble, practice, follow, repeat, sketch, move

Manipulation
■ A particular skill or sequence is practiced continuously until it becomes habitual and is done with some confidence and proficiency
■ Illustrative verbs: (same as imitation) acquire, complete, conduct, improve

Precision
■ A skill has been attained with proficiency and efficiency
■ Illustrative verbs: (same as imitation and manipulation) achieve, accomplish, excel, master, succeed, surpass

Articulation
■ An individual can modify movement patterns to meet a particular situation
■ Illustrative verbs: adapt, change, excel, reorganize, rearrange, revise

Naturalization
■ An individual responds automatically and creates new motor acts or ways of manipulation out of the understandings, abilities, and skills developed
■ Illustrative verbs: arrange, combine, compose, construct, create, design
DIFFERENT TYPES OF TESTS

Purpose
Psychological
■ Aims to measure students' intelligence or mental ability, largely without reference to what the student has learned (e.g. Aptitude Tests)
Educational
■ Aims to measure the results of instruction and learning (e.g. Achievement Tests, Performance Tests)

Scope of Content
Survey
■ Covers a broad range of objectives
■ Measures general achievement in certain subjects
■ Constructed by trained professionals
Mastery
■ Covers a specific objective
■ Measures fundamental skills and abilities
■ Typically constructed by the teacher
Language Mode
Verbal
■ Words are used by students in attaching meaning to or responding to test items
Non-Verbal
■ Students do not use words in attaching meaning to or in responding to test items

Construction
Standardized
■ Constructed by a professional item writer
■ Covers a broad range of content covered in a subject area
■ Uses mainly multiple-choice items
■ Items written are screened, and the best items are chosen for the final instrument
■ Can be scored by a machine
■ Interpretation of results is usually norm-referenced
Informal
■ Constructed by a classroom teacher
■ Covers a narrow range of content
■ Various types of items are used
■ Teacher picks or writes items as needed for the test
■ Scored manually by the teacher
■ Interpretation is usually criterion-referenced

Manner of Administration
Individual
■ Mostly given orally or requires actual demonstration of skill
■ One-on-one situations, thus many opportunities for clinical observation
■ Chance to follow up the examinee's response in order to clarify or comprehend it more clearly
Group
■ A paper-and-pen test
■ Loss of rapport, insight, and knowledge about each examinee
■ Same amount of time needed to gather information as from one student
Effect of Biases
Objective
■ Scorer's personal judgment does not affect the scoring
■ Worded so that only one answer is acceptable
■ Little or no disagreement on what is the correct answer
Subjective
■ Affected by the scorer's personal opinions and biases
■ Several answers are possible
■ Disagreement on what is the correct answer is possible
Time Limit and Level of Difficulty
Power
■ Consists of a series of items arranged in ascending order of difficulty
■ Measures the student's ability to answer more and more difficult items
Speed
■ Consists of items approximately equal in difficulty
■ Measures the student's speed or rate and accuracy in responding

Format
Selective
■ There are choices for the answer
■ Multiple Choice, True or False, Matching Type
■ Can be answered quickly
■ Prone to guessing
■ Time-consuming to construct
Supply
■ There are no choices for the answer
■ Short Answer, Completion, Restricted or Extended Essay
■ May require a longer time to answer
■ Less chance of guessing, but prone to bluffing
■ Time-consuming to answer and score

Nature of Assessment
Maximum Performance
■ Determines what individuals can do when performing at their best
Typical Performance
■ Determines what individuals will do under natural conditions

Interpretation
Norm-Referenced
■ Result is interpreted by comparing one student's performance with other students' performance
■ Some will really pass
■ There is competition for a limited percentage of high scores
■ Typically covers a large domain of learning tasks
■ Emphasizes discrimination among individuals in terms of level of learning
■ Favors items of average difficulty and typically omits very easy and very hard items
■ Interpretation requires a clearly defined group
Criterion-Referenced
■ Result is interpreted by comparing a student's performance against a predefined standard (mastery)
■ All or none may pass
■ There is no competition for a limited percentage of high scores
■ Typically focuses on a delimited domain of learning tasks
■ Emphasizes description of what learning tasks individuals can and cannot perform
■ Matches item difficulty to learning tasks, without altering item difficulty or omitting easy or hard items
■ Interpretation requires a clearly defined and delimited achievement domain
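The two interpretations above can be sketched on the same raw score. This is a minimal illustration, not from the handout: the data, the function names, and the 75% mastery cutoff are all assumptions chosen for the example.

```python
# Sketch (illustrative data; the 75% cutoff is an assumed standard):
# the same raw score interpreted norm- and criterion-referenced ways.

def percentile_rank(score, group):
    """Norm-referenced view: standing relative to the other examinees."""
    below = sum(1 for s in group if s < score)
    return 100.0 * below / len(group)

def has_mastery(score, total_items, cutoff=0.75):
    """Criterion-referenced view: standing relative to a predefined standard."""
    return score / total_items >= cutoff

scores = [12, 15, 18, 20, 22, 25, 27, 28, 29, 30]  # class results, 30-item test
print(percentile_rank(25, scores))      # percent of the class scoring below 25
print(has_mastery(25, total_items=30))  # True: 25/30 meets the assumed cutoff
```

Note that the norm-referenced figure changes whenever the comparison group changes, while the criterion-referenced verdict depends only on the standard.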

Four Commonly Used References for Classroom Interpretation

Ability-referenced
■ Interpretation provided: How are students performing relative to what they are capable of doing?
■ Condition that must be present: good measures of the students' maximum possible performance

Growth-referenced
■ Interpretation provided: How much have students changed or improved relative to what they were doing earlier?
■ Condition that must be present: pre- and post-measures of performance that are highly reliable

Norm-referenced
■ Interpretation provided: How well are students doing with respect to what is typical or reasonable?
■ Condition that must be present: a clear understanding of whom students are being compared to

Criterion-referenced
■ Interpretation provided: What can students do and not do?
■ Condition that must be present: a well-defined content domain that was assessed
TYPES OF TEST ACCORDING TO FORMAT

1. Selective Type - provides choices for the answer

a. Multiple Choice - consists of a stem which describes the problem and 3 or more alternatives which
give the suggested solutions. The incorrect alternatives are the distractors.

b. True-False or Alternative Response - consists of declarative statement that one has to mark true
or false, right or wrong, correct or incorrect, yes or no, fact or opinion, and the like.

c. Matching Type - consists of two parallel columns: Column A, the column of premises from which a match is sought; Column B, the column of responses from which the selection is made.

Multiple Choice
■ Advantages: more adequate sampling of content; tends to structure the problem to be addressed more effectively
■ Limitations: prone to guessing; often indirectly measures targeted behaviors

Alternate Response
■ Advantages: more adequate sampling of content; easy to construct; can be effectively and objectively scored
■ Limitations: prone to guessing; can be used only when dichotomous answers represent sufficient response options

Matching Type
■ Advantages: allows comparison of related ideas, concepts, or theories; effectively assesses association between a variety of items within a topic; encourages integration of information
■ Limitations: difficult to produce a sufficient number of plausible premises; not effective in testing isolated facts; may be limited to lower levels of understanding; useful only when there is a sufficient number of related items
2. Supply Test
a. Short Answer - uses a direct question that can be answered by a word, phrase, a number, or a
symbol
b. Completion Test - consists of an incomplete statement

Advantages
■ Easy to construct
■ Requires the student to supply the answer
Limitations
■ Generally limited to measuring recall of information
■ More likely to be scored erroneously due to the variety of responses

3. Essay Test
a. Restricted Response - limits the content of the response by restricting the scope of the topic
b. Extended Response - allows the students to select any factual information that they think is
pertinent, to organize their answers in accordance with their best judgment

Advantages
■ Measures more directly the behaviors specified by performance objectives
■ Examines students' written communication skills
Limitations
■ Provides a less adequate sampling of content
■ Less reliable scoring
■ Time-consuming to score
GENERAL SUGGESTIONS IN WRITING TESTS

1. Use your test specifications as guide to item writing.


2. Write more test items than needed.
3. Write the test items well in advance of the testing date.
4. Write each test item so that the task to be performed is clearly defined.
5. Write each test item in appropriate reading level.
6. Write each test item so that it does not provide help in answering other items in the test.
7. Write each test item so that the answer is one that would be agreed upon by experts.
8. Write test items so that they are at the proper level of difficulty.
9. Whenever a test is revised, recheck its relevance.

SPECIFIC SUGGESTIONS
A. SUPPLY TYPE

1. Word the item/s so that the required answer is both brief and specific.
2. Do not take statements directly from textbooks to use as a basis for short answer items.
3. A direct question is generally more desirable than an incomplete statement.
4. If the item is to be expressed in numerical units, indicate type of answer wanted.
5. Blanks should be equal in length.
6. Answers should be written before the item number for easy checking.
7. When completion items are to be used, do not have too many blanks. Blanks should be at the
center of the sentence and not at the beginning.

Essay Type

8. Restrict the use of essay questions to those learning outcomes that cannot be satisfactorily
measured by objective items.
9. Formulate questions that will call forth the behavior specified in the learning outcome.
10. Phrase each question so that the pupils’ task is clearly indicated.
11. Indicate an approximate time limit for each question.
12. Avoid the use of optional questions.

B. SELECTIVE TYPE
Alternative-Response
1. Avoid broad statements.
2. Avoid trivial statements.
3. Avoid the use of negative statements especially double negatives.
4. Avoid long and complex sentences.
5. Avoid including two ideas in one sentence unless cause and effect relationship is being
measured.
6. If opinion is used, attribute it to some source unless the ability to identify opinion is being
specifically measured.
7. True statements and false statements should be approximately equal in length.
8. The number of true statements and false statements should be approximately equal.
9. Start with a false statement, since it is a common observation that the first statement in this type of test is usually true.

Matching Type
1. Use only homogenous materials in a single matching exercise.
2. Include an unequal number of responses and premises, and instruct the pupils that response
may be used once, more than once, or not at all.
3. Keep the list of items to be matched brief, and place the shorter responses at the right.
4. Arrange the list of responses in logical order.
5. Indicate in the directions the basis for matching the responses and premises.
6. Place all the items for one matching exercise on the same page.

Multiple Choice
1. The stem of the item should be meaningful by itself and should present a definite problem.
2. The stem should include as much of the item as possible and should be free of irrelevant
information.
3. Use a negatively stated item stem only when significant learning outcome requires it.
4. Highlight negative words in the stem for emphasis.
5. All the alternatives should be grammatically consistent with the stem of the item.
6. An item should only have one correct or clearly best answer.
7. Items used to measure understanding should contain novelty, but beware of too much.
8. All distracters should be plausible.
9. Verbal association between the stem and the correct answer should be avoided.
10. The relative length of the alternatives should not provide a clue to the answer.
11. The alternatives should be arranged logically.
12. The correct answer should appear in each of the alternative positions approximately an equal
number of times, but in random order.
13. Use of special alternatives such as “none of the above” or “all of the above” should be done
sparingly.
14. Do not use multiple choice items when other types are more appropriate.
15. Always have the stem and alternatives on the same page.
16. Break any of these rules when you have a good reason for doing so.

ALTERNATIVE ASSESSMENT

PERFORMANCE AND AUTHENTIC ASSESSMENTS

When to Use
■ Specific behaviors or behavioral outcomes are to be observed
■ Possibility of judging the appropriateness of students' actions
Advantages
■ Allow evaluation of complex skills which are difficult to assess using written tests
■ Positive effect on instruction and learning
Limitations
■ Time-consuming to administer, develop, and score
■ Subjectivity in scoring

PORTFOLIO ASSESSMENT
Characteristics:
1. Adaptable to individualized instructional goals
2. Focus on assessment of products
3. Identify students' strengths rather than weaknesses
4. Actively involve students in the evaluation process
5. Communicate student achievement to others
6. Time-consuming
7. Need a scoring plan to increase reliability

Types:
Showcase
■ A collection of students' best work
Reflective
■ Used for helping teachers, students, and family members think about various dimensions of student learning (e.g. effort, achievement, etc.)
Cumulative
■ A collection of items done over an extended period of time
■ Analyzed to verify changes in the products and processes associated with student learning
Goal-based
■ A collection of works chosen by students and teachers to match pre-established objectives
Process
■ A way of documenting the steps and processes a student has gone through to complete a piece of work

RUBRICS
■ Scoring guides, consisting of specific pre-established performance criteria, used in evaluating student work on performance assessments

Two Types:
1. Holistic Rubric - requires the teacher to score the overall process or product as a whole, without judging the component parts separately
2. Analytic Rubric - requires the teacher to score individual components of the product or performance first, then sum the individual scores to obtain a total score
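The holistic/analytic distinction can be sketched in a few lines. The criteria names and point values below are hypothetical, invented only to illustrate the two scoring procedures.

```python
# Sketch (criteria and ratings are illustrative assumptions):
# a holistic rubric yields one overall rating, while an analytic rubric
# rates each criterion separately and sums the component scores.

def analytic_score(ratings):
    """ratings: dict mapping each rubric criterion to the points awarded."""
    return sum(ratings.values())

# Hypothetical essay scored on a 3-criterion analytic rubric (0-4 each).
essay_ratings = {"content": 4, "organization": 3, "mechanics": 2}

holistic = 3                            # one overall judgment on a 1-4 scale
analytic = analytic_score(essay_ratings)
print(holistic, analytic)
```

The analytic total preserves diagnostic detail (which component was weak), which the single holistic rating discards.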

AFFECTIVE ASSESSMENTS
1. Closed-Item or Forced-Choice Instruments - ask for one specific answer
a. Checklist - measures students' preferences, hobbies, attitudes, feelings, beliefs, interests, etc. by marking a set of possible responses

b. Scales - instruments that indicate the extent or degree of one's response
1) Rating Scale - measures the degree or extent of one's attitudes, feelings, and perceptions about ideas, objects, and people by marking a point along a 3- or 5-point scale
2) Semantic Differential Scale - measures the degree of one's attitudes, feelings, and perceptions about ideas, objects, and people by marking a point along a 5-, 7-, or 11-point scale of semantic adjectives
3) Likert Scale - measures the degree of one's agreement or disagreement with positive or negative statements about objects and people

c. Alternate Response - measures students' preferences, hobbies, attitudes, feelings, beliefs, interests, etc. by choosing between two possible responses
d. Ranking - measures students' preferences or priorities by ranking a set of responses

2. Open-Ended Instruments - open to more than one answer
a. Sentence Completion - measures students' preferences over a variety of attitudes and allows students to answer by completing an unfinished statement, which may vary in length
b. Surveys - measure the values held by an individual by writing one or many responses to a given question
c. Essays - allow the students to reveal and clarify their preferences, hobbies, attitudes, feelings, beliefs, and interests by writing their reactions or opinions to a given question

SUGGESTIONS IN WRITING NON-TEST OF ATTITUDINAL NATURE


1. Avoid statements that refer to the past rather than to the present.
2. Avoid statements that are factual or capable of being interpreted as factual.

3. Avoid statements that may be interpreted in more than one way.


4. Avoid statements that are irrelevant to the psychological object under consideration.
5. Avoid statements that are likely to be endorsed by almost everyone or by almost no one.
6. Select statements that are believed to cover the entire range of the affective scale of interest.
7. Keep the language of the statements simple, clear and direct.
8. Statements should be short, rarely exceeding 20 words.
9. Each statement should contain only one complete thought.
10. Statements containing universals such as all, always, none and never often introduce ambiguity
and should be avoided.
11. Words such as only, just, merely, and others of similar nature should be used with care and
moderation in writing statements.
12. Whenever possible, statements should be in the form of simple statements rather than in the
form of compound or complex sentences.
13. Avoid the use of words that may not be understood by those who are to be given the completed
scale.
14. Avoid the use of double negatives.

CRITERIA TO CONSIDER IN CONSTRUCTING GOOD TESTS

VALIDITY - the degree to which a test measures what it is intended to measure. It is the usefulness of the test for a given purpose. It is the most important criterion of a good examination.

FACTORS influencing the validity of tests in general
■ Appropriateness of Test - it should measure the abilities, skills, and information it is supposed to measure
■ Directions - it should indicate how the learners should answer and record their answers
■ Reading Vocabulary and Sentence Structure - it should be based on the intellectual level of maturity and background experience of the learners
■ Difficulty of Items - it should have items that are neither too difficult nor too easy, to be able to discriminate the bright from the slow pupils
■ Construction of Items - it should not provide clues, so it will not be a test on clues; nor should it be ambiguous, so it will not be a test on interpretation
■ Length of Test - it should be of sufficient length to measure what it is supposed to measure, and not so short that it cannot adequately measure the performance we want to measure
■ Arrangement of Items - it should have items arranged in ascending level of difficulty, starting with the easy ones so that pupils will persist in taking the test
■ Patterns of Answers - it should not allow the creation of patterns in answering the test

WAYS of Establishing Validity
■ Face Validity - done by examining the physical appearance of the test
■ Content Validity - done through a careful and critical examination of the objectives of the test so that it reflects the curricular objectives
■ Criterion-Related Validity - established statistically, such that a set of scores revealed by a test is correlated with scores obtained on another external predictor or measure. It has two purposes:
  • Concurrent Validity - describes the present status of the individual by correlating the sets of scores obtained from two measures given concurrently
  • Predictive Validity - describes the future performance of an individual by correlating the sets of scores obtained from two measures given at a longer time interval
■ Construct Validity - established statistically by comparing psychological traits or factors that influence scores in a test, e.g. verbal, numerical, spatial, etc.
  • Convergent Validity - established if the instrument relates to another similar trait beyond the one it intends to measure (e.g. a Critical Thinking Test may be correlated with a Creative Thinking Test)
  • Divergent Validity - established if an instrument describes only the intended trait and not other traits (e.g. a Critical Thinking Test may not be correlated with a Reading Comprehension Test)
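Criterion-related validity is computed as a correlation between the test scores and the criterion scores. A minimal sketch with the standard Pearson r formula follows; the two score lists are invented illustrative data, not results from any real study.

```python
# Sketch: correlating a test with an external criterion measure
# (criterion-related validity) using Pearson r, computed from its
# standard deviation-products formula. Data are illustrative.
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den

test_scores      = [10, 12, 15, 18, 20]   # e.g. an aptitude test
criterion_scores = [40, 45, 55, 70, 78]   # e.g. later achievement scores
print(round(pearson_r(test_scores, criterion_scores), 2))  # 0.99
```

Given concurrently, the correlation speaks to concurrent validity; given at a longer time interval, to predictive validity.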
RELIABILITY - refers to the consistency of scores obtained by the same person when retested using the same instrument or one that is parallel to it.

FACTORS affecting Reliability


1. Length of the test - as a general rule, the longer the test, the higher the reliability. A longer test
provides a more adequate sample of the behavior being measured and is less distorted by
chance factors like guessing.
2. Difficulty of the test - ideally, achievement tests should be constructed such that the average
score is 50 percent correct and the scores range from zero to near perfect. The bigger the spread
of scores, the more reliable the measured difference is likely to be. A test is reliable if the
coefficient of correlation is not less than 0.85.
3. Objectivity - can be obtained by eliminating the bias, opinions or judgments of the person who
checks the test.
4. Administrability - the test should be administered with ease, clarity and uniformity so that
scores obtained are comparable. Uniformity can be obtained by setting the time limit and oral
instructions.
5. Scorability - the test should be easy to score such that directions for scoring are clear, the
scoring key is simple, provisions for answer sheets are made
6. Economy - the test should be given in the cheapest way, which means that answer sheets must
be provided so the test can be given from time to time
7. Adequacy - the test should contain a wide sampling of items to determine the educational
outcomes or abilities so that the resulting scores are representatives of the total performance in
the areas measured

Methods of Establishing Reliability

Test-Retest
■ Type of reliability measure: measure of stability
■ Procedure: give a test twice to the same group, with any time interval between administrations, from several minutes onward
■ Statistical measure: Pearson r

Equivalent Forms
■ Type of reliability measure: measure of equivalence
■ Procedure: give parallel forms of the test at the same time
■ Statistical measure: Pearson r

Test-Retest with Equivalent Forms
■ Type of reliability measure: measure of stability and equivalence
■ Procedure: give parallel forms of the test with an increased time interval between forms
■ Statistical measure: Pearson r

Split-Half
■ Type of reliability measure: measure of internal consistency
■ Procedure: give a test once; score equivalent halves of the test (e.g. odd- and even-numbered items)
■ Statistical measure: Pearson r and the Spearman-Brown Formula

Kuder-Richardson
■ Type of reliability measure: measure of internal consistency
■ Procedure: give the test once, then correlate the proportion/percentage of the students passing and not passing a given item
■ Statistical measure: Kuder-Richardson Formula 20 and 21

Cronbach Coefficient Alpha
■ Type of reliability measure: measure of internal consistency
■ Procedure: give a test once, then estimate reliability using the standard deviation per item and the standard deviation of the test
■ Statistical measure: Coefficient Alpha
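The split-half procedure above can be sketched end to end: correlate the two half-test score sets with Pearson r, then step the half-length coefficient up to full-test length with the Spearman-Brown formula r_full = 2r / (1 + r). The half-test score lists are invented illustrative data.

```python
# Sketch of split-half reliability (illustrative data): each list holds
# one student's score per position, on the odd-numbered and the
# even-numbered items respectively.
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den

def spearman_brown(r_half):
    """Step a half-test correlation up to the full-test length."""
    return 2 * r_half / (1 + r_half)

odd_half  = [5, 7, 8, 9, 10]   # scores on odd-numbered items
even_half = [4, 6, 9, 8, 10]   # same students' scores on even-numbered items

r_half = pearson_r(odd_half, even_half)
print(round(spearman_brown(r_half), 2))  # 0.97
```

Because Spearman-Brown assumes the two halves are equivalent, the corrected coefficient always exceeds the raw half-test correlation (for positive r).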


ITEM ANALYSIS

STEPS:
1. Score the test. Arrange the scores from highest to lowest.
2. Get the top 27% (upper group) and bottom 27% (lower group) of the examinees.
3. Count the number of examinees in the upper group (PT) and lower group (PB) who got each item correct.
4. Compute the Difficulty Index of each item:
   Df = (PT + PB) / N, where N = the total number of examinees
5. Compute the Discrimination Index:
   Ds = (PT - PB) / n, where n = the number of examinees in each group

INTERPRETATION

Difficulty Index (Df)
0.76 - 1.00  very easy
0.25 - 0.75  average
0.00 - 0.24  very difficult

Discrimination Index (Ds)
0.40 and above  very good
0.30 - 0.39     reasonably good
0.20 - 0.29     marginal item
0.19 and below  poor item
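The two indices above reduce to one line of arithmetic each. A minimal sketch, following the handout's formulas; the item counts used in the example are invented for illustration.

```python
# Sketch of the item-analysis formulas: PT and PB are the counts of
# upper- and lower-group examinees (top and bottom 27%) who answered
# the item correctly.

def difficulty_index(pt, pb, total):
    """Df = (PT + PB) / N, where N is the total number of examinees."""
    return (pt + pb) / total

def discrimination_index(pt, pb, group_size):
    """Ds = (PT - PB) / n, where n is the number of examinees per group."""
    return (pt - pb) / group_size

# Illustrative item: 40 examinees analyzed, 20 in each 27% group;
# 18 of the upper group and 6 of the lower group got the item right.
df = difficulty_index(18, 6, total=40)            # 0.6 -> average difficulty
ds = discrimination_index(18, 6, group_size=20)   # 0.6 -> very good item
print(df, ds)
```

An item with high Df and low Ds (nearly everyone correct, no separation of groups) would be flagged for revision even though it "works" for most examinees.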

SCORING ERRORS AND BIASES

■ Leniency error: faculty tends to judge work as better than it really is.
■ Generosity error: faculty tends to use only the high end of the scale.
■ Severity error: faculty tends to use only the low end of the scale.
■ Central tendency error: faculty avoids both extremes of the scale.
■ Bias: letting other factors influence the score (e.g., handwriting, typos).
■ Halo effect: letting a general impression of the student influence the rating of specific criteria (e.g., the student's prior work).
■ Contamination effect: judgment is influenced by irrelevant knowledge about the student or other factors that have no bearing on performance level (e.g., student appearance).
■ Similar-to-me effect: judging more favorably those students whom faculty see as similar to themselves (e.g., expressing similar interests or points of view).
■ First-impression effect: judgment is based on early opinions rather than on a complete picture (e.g., the opening paragraph).
■ Contrast effect: judging by comparing a student against other students instead of against established criteria and standards.
■ Rater drift: unintentionally redefining criteria and standards over time or across a series of scorings (e.g., getting tired and cranky and therefore more severe, or getting tired and reading more quickly/leniently to get the job done).
FOUR TYPES OF MEASUREMENT SCALES

Nominal
■ Groups and labels data
■ Example: gender (1 = male, 2 = female)

Ordinal
■ Ranks data; distances between points are indefinite
■ Example: income (1 = low, 2 = average, 3 = high)

Interval
■ Distances between points are equal; no absolute zero
■ Examples: test scores, temperature

Ratio
■ Has an absolute zero
■ Examples: height, weight

SHAPES OF FREQUENCY POLYGONS

1. Normal / Bell-Shaped / Symmetrical
2. Positively Skewed - most scores are below the mean and there are extremely high scores
3. Negatively Skewed - most scores are above the mean and there are extremely low scores
4. Leptokurtic - highly peaked; the tails are more elevated above the baseline
5. Mesokurtic - moderately peaked
6. Platykurtic - flattened peak
7. Bimodal Curve - a curve with 2 peaks or modes
8. Polymodal Curve - a curve with 3 or more modes
9. Rectangular Distribution - there is no mode
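The skew definitions above imply a quick diagnostic: extreme high scores pull the mean above the median, extreme low scores pull it below. A small Python sketch on hypothetical scores:

```python
# Sketch: inferring skew direction by comparing mean and median,
# following the descriptions of positively/negatively skewed curves above.
from statistics import mean, median

def skew_direction(scores):
    m, md = mean(scores), median(scores)
    if m > md:
        return "positively skewed"   # extreme high scores pull the mean up
    if m < md:
        return "negatively skewed"   # extreme low scores pull the mean down
    return "symmetrical"

# Most scores are low, one is extremely high -> positively skewed.
print(skew_direction([10, 12, 13, 14, 15, 15, 16, 48]))
```

This mean-versus-median comparison is a heuristic, not a formal skewness statistic.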

MEASURES OF CENTRAL TENDENCY AND VARIABILITY

MEASURES OF CENTRAL TENDENCY describe the representative value of a set of data; MEASURES OF VARIABILITY describe the degree of spread or dispersion of a set of data. Which tool is appropriate depends on the distribution:

■ When the frequency distribution is regular or symmetrical (normal):
  Mean - the arithmetic average
  Standard Deviation - the root-mean-square of the deviations from the mean

■ When the frequency distribution is irregular or skewed:
  Median - the middle score in a group of scores that are ranked
  Quartile Deviation - the average deviation of the 1st and 3rd quartiles from the median

■ When the distribution of scores is normal and a quick answer is needed:
  Mode - the most frequent score
  Range - the difference between the highest and the lowest score in the distribution
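All six measures can be computed with Python's standard library. A sketch on hypothetical test scores; the half-splitting convention inside `quartile_deviation` is an assumption, as textbooks compute quartiles in slightly different ways:

```python
# Sketch: the three measures of central tendency and their paired
# measures of variability, on hypothetical test scores.
from statistics import mean, median, mode, pstdev

scores = [70, 72, 75, 75, 78, 80, 82, 85, 88, 95]

m  = mean(scores)                 # arithmetic average -> 80
md = median(scores)               # middle of the ranked scores -> 79
mo = mode(scores)                 # most frequent score -> 75
sd = pstdev(scores)               # root-mean-square deviation from the mean
rng = max(scores) - min(scores)   # 95 - 70 = 25

def quartile_deviation(data):
    """(Q3 - Q1) / 2, taking Q1/Q3 as medians of the lower/upper halves."""
    s = sorted(data)
    half = len(s) // 2
    q1 = median(s[:half])
    q3 = median(s[-half:])
    return (q3 - q1) / 2

qd = quartile_deviation(scores)   # (85 - 75) / 2 = 5.0
```

Note that `pstdev` treats the scores as the whole population; use `stdev` if they are a sample.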
How to Interpret the Measures of Central Tendency
The value that represents a set of data will be the basis for determining whether the group is performing better or poorer than the other groups.

How to Interpret the Standard Deviation


1. The result will help you determine if the group is homogeneous or not.
2. The result will also help you determine the number of students that fall below and above the average performance.

Main points to remember:

Points above Mean + 1SD = range of above average
Mean + 1SD and Mean - 1SD = the limits of average ability
Points below Mean - 1SD = range of below average
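The Mean ± 1SD rule above translates directly into code. A minimal sketch on hypothetical scores:

```python
# Sketch: classifying a score as below average / average / above average
# using the Mean +/- 1 SD boundaries described above.
from statistics import mean, pstdev

def classify(score, scores):
    m, sd = mean(scores), pstdev(scores)
    if score > m + sd:
        return "above average"
    if score < m - sd:
        return "below average"
    return "average"

scores = [60, 70, 75, 80, 85, 90, 100]   # mean = 80, SD ~ 12.25
print(classify(95, scores))              # above mean + 1SD
```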

How to Interpret the Quartile Deviation


1. The result will help you determine if the group is homogeneous or not.
2. The result will also help you determine the number of students that fall below and above the average performance.

Main points to remember:

Points above Median + 1QD = range of above average
Median + 1QD and Median - 1QD = the limits of average ability
Points below Median - 1QD = range of below average
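The Median ± 1QD boundaries work the same way, and are the preferred choice when the distribution is skewed. A sketch on hypothetical scores; the quartile convention in `quartile_deviation` is an assumption:

```python
# Sketch: below/average/above classification using Median +/- 1 QD,
# on a hypothetical, positively skewed score set.
from statistics import median

def quartile_deviation(data):
    """(Q3 - Q1) / 2, taking Q1/Q3 as medians of the lower/upper halves."""
    s = sorted(data)
    half = len(s) // 2
    return (median(s[-half:]) - median(s[:half])) / 2

def classify(score, scores):
    md, qd = median(scores), quartile_deviation(scores)
    if score > md + qd:
        return "above average"
    if score < md - qd:
        return "below average"
    return "average"

scores = [50, 55, 60, 62, 65, 70, 72, 95]   # one extreme high score
print(classify(95, scores))                 # above median + 1QD
```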

MEASURES OF CORRELATION

Pearson r

r = [ΣXY/N - (ΣX/N)(ΣY/N)] / √{ [ΣX²/N - (ΣX/N)²] [ΣY²/N - (ΣY/N)²] }

Where:
X = scores in a test
Y = scores in a retest
N = number of examinees
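The raw-score formula for Pearson r can be implemented term by term. A sketch on hypothetical test-retest scores:

```python
# Sketch: Pearson r via the raw-score formula
# r = (SumXY/N - meanX*meanY) / sqrt(varX * varY),
# on hypothetical test/retest scores.
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum(a * b for a, b in zip(x, y)) / n - mx * my   # SumXY/N - (SumX/N)(SumY/N)
    vx = sum(a * a for a in x) / n - mx * mx               # SumX^2/N - (SumX/N)^2
    vy = sum(b * b for b in y) / n - my * my               # SumY^2/N - (SumY/N)^2
    return cov / sqrt(vx * vy)

test   = [10, 20, 30, 40, 50]
retest = [12, 22, 29, 41, 52]
r = pearson_r(test, retest)   # close to 1: very consistent test-retest scores
```

A value of r near +1 indicates high test-retest reliability; values near 0 indicate none.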

Spearman Brown Formula

reliability of the whole test = 2r_oe / (1 + r_oe)

Where:
r_oe = reliability coefficient using the split-half or odd-even procedure
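In practice, the two halves are correlated first and the result is then "stepped up" with Spearman-Brown. A sketch on hypothetical half-test scores:

```python
# Sketch: odd-even split-half reliability stepped up with the
# Spearman-Brown formula (hypothetical per-examinee half scores).
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum(a * b for a, b in zip(x, y)) / n - mx * my
    vx = sum(a * a for a in x) / n - mx * mx
    vy = sum(b * b for b in y) / n - my * my
    return cov / sqrt(vx * vy)

def spearman_brown(r_oe):
    """Reliability of the whole test from the half-test correlation."""
    return 2 * r_oe / (1 + r_oe)

# Each position: one examinee's total on the odd- vs even-numbered items.
odd_half  = [8, 12, 15, 18, 20]
even_half = [9, 11, 16, 17, 21]
r_oe = pearson_r(odd_half, even_half)
reliability = spearman_brown(r_oe)   # always > r_oe for 0 < r_oe < 1
```

Stepping up corrects for the fact that each half is only half as long as the real test.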

STANDARD SCORES

- Indicate the pupil's relative position by showing how far his raw score is above or below average
- Express the pupil's performance in terms of standard units from the mean
- Represented by the normal probability curve, or what is commonly called the normal curve
- Used to have a common unit to compare raw scores from different tests

PERCENTILE
- Tells the percentage of examinees that lies below one's score

Z-SCORES

- Tells the number of standard deviations equivalent to a given raw score
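Both conversions follow directly from the definitions. A sketch on hypothetical scores:

```python
# Sketch: converting a raw score to a z-score and a percentile rank
# (percent of examinees scoring below it), per the definitions above.
from statistics import mean, pstdev

def z_score(raw, scores):
    """Number of standard deviations the raw score sits from the mean."""
    return (raw - mean(scores)) / pstdev(scores)

def percentile_rank(raw, scores):
    """Percentage of examinees whose scores lie below the raw score."""
    below = sum(1 for s in scores if s < raw)
    return 100 * below / len(scores)

scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]   # mean = 77.5
print(percentile_rank(95, scores))                   # 8 of 10 scores below
```

A score at the mean has z = 0; positive z means above average, negative z below.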
GRADES:
a. Could represent:
- How a student is performing in relation to other students (norm-referenced grading)
- The extent to which a student has mastered a particular body of knowledge (criterion-referenced grading)
- How a student is performing in relation to a teacher's judgment of his or her potential

b. Could be for:
- Certification that gives assurance that a student has mastered a specific content or achieved a certain level of accomplishment
- Selection that provides a basis for identifying or grouping students for certain educational paths or programs
- Direction that provides information for diagnosis and planning
- Motivation that emphasizes specific material or skills to be learned and helps students to understand and improve their performance

c. Could be assigned by using:


- Criterion-Referenced Grading - grading based on fixed or absolute standards, where a grade is assigned based on how well a student has met the criteria or the well-defined objectives of a course that were spelled out in advance. It is then up to the student to earn the grade he or she wants to receive, regardless of how other students in the class have performed. This is done by transmuting test scores into marks or ratings.

- Norm-Referenced Grading - grading based on relative standards, where a student's grade reflects his or her level of achievement relative to the performance of other students in the class. In this system, the grade is assigned based on the average of test scores.

- Point or Percentage Grading System - the teacher identifies points or percentages for various tests and class activities depending on their importance. The total of these points will be the basis for the grade assigned to the student.

- Contract Grading System - each student agrees to work for a particular grade according to agreed-upon standards.
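A point/percentage system combined with criterion-referenced transmutation can be sketched as follows; the component weights and letter-grade cut-offs are illustrative assumptions, not a prescribed scheme:

```python
# Sketch: a point/percentage grading system with fixed (criterion-referenced)
# transmutation. Weights and cut-offs are hypothetical.

WEIGHTS = {"quizzes": 0.30, "major_exams": 0.40, "projects": 0.30}

def final_percentage(component_scores):
    """component_scores maps each component to a percent score (0-100)."""
    return sum(component_scores[c] * w for c, w in WEIGHTS.items())

def transmute(pct):
    """Fixed cut-offs: every student meeting a standard earns that grade."""
    if pct >= 90: return "A"
    if pct >= 80: return "B"
    if pct >= 75: return "C"
    return "F"

pct = final_percentage({"quizzes": 85, "major_exams": 90, "projects": 80})
grade = transmute(pct)   # 85.5 -> "B"
```

Under norm-referenced grading, by contrast, the cut-offs would be set relative to the class distribution (e.g., mean and standard deviation) rather than fixed in advance.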
GUIDELINES IN GRADING STUDENTS

1. Explain your grading system to the students early in the course and remind them of the grading
policies regularly.
2. Base grades on a predetermined and reasonable set of standards.
3. Base your grades on as much objective evidence as possible.
4. Base grades on the student’s attitude as well as achievement, especially at the elementary and high
school level.
5. Base grades on the student’s relative standing compared to classmates.
6. Base grades on a variety of sources.
7. As a rule, do not change grades, once computed.
8. Become familiar with the grading policy of your school and with your colleagues' standards.
9. When failing a student, closely follow school procedures.
10. Record grades on report cards and cumulative records.
11. Guard against bias in grading.
12. Keep pupils informed of their standing in the class.
