DR Luning Sun Psychometrics Module, Lecture 3 Social Sciences Research Methods Centre - SSRMC

1. Computerized adaptive tests (CATs) individually tailor tests to examinees' abilities by selecting subsequent test items based on responses to previous items. This increases accuracy and efficiency compared to standard tests. 2. A CAT requires an item bank, item parameters, an item selection method, a scoring algorithm, and termination rules. It iteratively selects the optimal next item based on the current ability estimate, scores the response, updates the ability estimate, and repeats until a stopping criterion is met. 3. Concerto is an online platform that can be used to create CATs. It allows users to build questionnaires, CATs, and provide feedback using pre-programmed nodes. CATs created in Concer

Dr Luning Sun

Psychometrics Module, Lecture 3


Social Sciences Research Methods Centre | SSRMC
 Introduction to CAT
 CAT in R
 CAT in Concerto
Some materials and examples come from previous workshops run by:
Michal Kosinski (Stanford University)
David Stillwell (University of Cambridge)
Chris Gibbons (Harvard University)
 A standard (fixed) test is likely to contain questions
that are too easy or too difficult for a given participant
◦ Classical Test Theory
◦ Item Response Theory
 Adaptively adjusting the level of the test to
the individual participant:
◦ Increases accuracy
◦ Saves time / money
◦ Prevents boredom / frustration
 Item bank and calibration (IRT model)
 Starting point
 Item selection algorithm (CAT algorithm)
 Scoring on-the-fly method
 Termination rules
And
 Item bank protection / overexposure
 Content Balancing
Start the test:
1. Ask the first question, e.g. one of medium difficulty
2. Correct!
3. Score it
4. Select the next item with a difficulty around the most likely score (or with the maximum information)
5. And so on, until the stopping rule is reached
[Figure: probability (0.0 to 1.0) plotted against theta (-3.0 to 3.0), showing the curves for an incorrect and a correct response, the normal distribution, the item difficulty, and the most likely score]
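The scoring step above can be sketched in base R. This is only an illustration under stated assumptions: a 2PL response function, a standard normal prior on theta, and a hypothetical first item with discrimination 1 and difficulty 0 that was answered correctly. The function and variable names are not from the lecture.

```r
# 2PL item response function: probability of a correct answer at ability theta.
p_correct <- function(theta, a, b) 1 / (1 + exp(-a * (theta - b)))

# Ability grid matching the x-axis of the slide, with a standard normal prior.
theta_grid <- seq(-3, 3, by = 0.01)
prior <- dnorm(theta_grid)

# Hypothetical first item of medium difficulty (a = 1, b = 0), answered correctly.
likelihood <- p_correct(theta_grid, a = 1, b = 0)

# Posterior is proportional to prior times likelihood; normalise over the grid.
posterior <- prior * likelihood
posterior <- posterior / sum(posterior)

# The most likely score is the posterior mode (a Bayes modal estimate).
theta_hat <- theta_grid[which.max(posterior)]
theta_hat
```

After one correct answer the estimate moves above the prior mean of 0, so the next item should be somewhat harder than the first.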


Standard test to assess Kumamon
[Figure: questions from the test placed along a maths-ability axis labelled from 2+2 to 1134 x 16, with Kumamon's ability marked on the same axis]

Computer adaptive test to assess Kumamon
[Figure sequence: questions from the test are placed on the maths-ability axis one at a time, in this order]
1. 8x4
2. 182 + 427
3. 1134 x 16
4. 1712 + 3218
5. 204 x 16
[Final frame: Kumamon's ability marked on the maths-ability axis]
 Maximum Fisher information (MFI)
◦ Obtain a current ability estimate
◦ Select next item that maximises information around the
current ability estimate
 Urry’s method (bOpt; in 1PL equals MFI)
◦ Obtain a current ability estimate
◦ Select next item with a difficulty closest to the current one
 Other methods:
◦ Minimum expected posterior variance (MEPV)
◦ Maximum likelihood weighted information (MLWI)
◦ Maximum posterior weighted information (MPWI)
◦ Maximum expected information (MEI)
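A minimal sketch of the first two selection rules, assuming a 2PL model, where the Fisher information of an item at theta is a^2 * P * (1 - P). The item bank below is hypothetical, and the helper names `select_mfi` and `select_bopt` are made up for illustration.

```r
# 2PL response probability and item information at a given theta.
p_correct <- function(theta, a, b) 1 / (1 + exp(-a * (theta - b)))
item_info <- function(theta, a, b) {
  p <- p_correct(theta, a, b)
  a^2 * p * (1 - p)
}

# Hypothetical item bank: a = discrimination, b = difficulty.
bank <- data.frame(a = c(1.0, 1.5, 0.8, 2.0, 1.2),
                   b = c(-1.5, -0.5, 0.5, 0.6, 1.5))

# MFI: pick the unused item with the largest information at the current estimate.
select_mfi <- function(bank, theta, administered = integer(0)) {
  info <- item_info(theta, bank$a, bank$b)
  info[administered] <- -Inf
  which.max(info)
}

# bOpt (Urry): pick the unused item whose difficulty is closest to the estimate.
select_bopt <- function(bank, theta, administered = integer(0)) {
  distance <- abs(bank$b - theta)
  distance[administered] <- Inf
  which.min(distance)
}

select_mfi(bank, theta = 0.5)   # item 4: high discrimination near theta
select_bopt(bank, theta = 0.5)  # item 3: difficulty exactly at theta

# With equal discriminations (1PL) the two rules agree.
bank_1pl <- transform(bank, a = 1)
select_mfi(bank_1pl, theta = 0.5) == select_bopt(bank_1pl, theta = 0.5)
```

The two rules differ here because item 4 is more discriminating than item 3, even though its difficulty is slightly further from the current estimate.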
 Randomesque approach (Kingsbury & Zara, 1989)
◦ Select a set of the best next items (more than one)
◦ Randomly choose one item from this set
 Embargo on overexposed items
 Location / Name / IP address rules

 Large item bank


 Regularly updated item bank

Kingsbury, G. G., and Zara, A. R. (1989). Procedures for selecting items for computerized adaptive tests. Applied Measurement in Education, 2, 359-375.
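The randomesque idea can be sketched as follows. The information values are hypothetical, and `select_randomesque` and `n_best` are illustrative names rather than anything from the lecture or from a package.

```r
# Randomesque selection: draw one item at random from the n_best most
# informative items that have not been administered yet.
select_randomesque <- function(info, administered = integer(0), n_best = 3) {
  info[administered] <- -Inf
  available <- sum(is.finite(info))
  if (available == 0) stop("no items left in the bank")
  top <- order(info, decreasing = TRUE)[seq_len(min(n_best, available))]
  # Index with sample.int() so a single candidate is returned unchanged.
  top[sample.int(length(top), 1)]
}

# Hypothetical item information values at the current ability estimate.
info <- c(0.11, 0.34, 0.16, 0.99, 0.26)

set.seed(42)
picks <- replicate(200, select_randomesque(info, n_best = 3))
table(picks)  # only items 2, 4 and 5 are ever chosen
```

Setting `n_best = 1` reduces the rule to plain maximum-information selection, which matches the "Randomesque: 1" setting used later for the Concerto example.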
 Ensure that all subgroups of items are used
equally
 Example:
◦ Arithmetic, Algebra and Geometry in a math test
◦ Different domains in an intelligence test
◦ Emotion recognition test

 Multidimensional CAT
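One simple way to balance content, sketched under assumptions: before each item is selected, pick the content area whose share of administered items falls furthest below its target share, then select the next item within that area. The target proportions and the function name `next_content_area` are hypothetical.

```r
# Choose the content area that is most under-represented so far.
next_content_area <- function(targets, administered_areas = character(0)) {
  n <- length(administered_areas)
  counts <- table(factor(administered_areas, levels = names(targets)))
  observed <- if (n > 0) as.numeric(counts) / n else rep(0, length(targets))
  gap <- targets - observed  # positive gap = area is under-represented
  names(targets)[which.max(gap)]
}

# Hypothetical equal targets for the maths test example.
targets <- c(Arithmetic = 1 / 3, Algebra = 1 / 3, Geometry = 1 / 3)

next_content_area(targets)
next_content_area(targets, c("Arithmetic", "Algebra", "Arithmetic"))
```

With no items administered, ties are broken in favour of the first listed area; after two Arithmetic items and one Algebra item, Geometry is chosen next.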
 Test length (e.g., 20 items, 15 items)

 Test time (5 minutes)

 Reliability of theta estimate (standard error)

 Other, clever stuff


$\text{reliability} = 1 - SE^2$
Alpha = 0.90 corresponds to SE = 0.32
Alpha = 0.80 corresponds to SE = 0.45
Alpha = 0.70 corresponds to SE = 0.55
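The conversion above can be checked numerically, and it gives a natural stopping rule. This sketch assumes theta is on a standardised scale (variance 1), which is what the formula relies on. The function names and the default limits are illustrative.

```r
# Standard error implied by a target reliability, from reliability = 1 - SE^2.
se_for_reliability <- function(reliability) sqrt(1 - reliability)

round(se_for_reliability(c(0.90, 0.80, 0.70)), 2)
# 0.32 0.45 0.55

# Stop when the test is long enough or the estimate is precise enough.
should_stop <- function(n_items, sem, max_items = 20, target_reliability = 0.90) {
  n_items >= max_items || sem <= se_for_reliability(target_reliability)
}

should_stop(n_items = 5, sem = 0.50)   # FALSE: too few items, too imprecise
should_stop(n_items = 5, sem = 0.30)   # TRUE: precision target reached
should_stop(n_items = 20, sem = 0.50)  # TRUE: maximum length reached
```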
1. The pool of available items is searched for
the optimal item, based on the current
estimate of the examinee's ability
2. The chosen item is presented to the
examinee, who then answers it correctly or
incorrectly
3. The ability estimate is updated, based upon
all prior answers
4. Steps 1–3 are repeated until a termination
criterion is met
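The four steps can be put together in a small simulation. This is a sketch under assumptions, not the lecture's implementation: the item bank is randomly generated, responses are simulated from a known true theta, items are selected by maximum information, and scoring uses the posterior mean on a grid (EAP) rather than any particular package routine.

```r
set.seed(1)

p_correct <- function(theta, a, b) 1 / (1 + exp(-a * (theta - b)))
item_info <- function(theta, a, b) {
  p <- p_correct(theta, a, b)
  a^2 * p * (1 - p)
}

# Hypothetical calibrated bank of 50 2PL items.
bank <- data.frame(a = runif(50, 0.8, 2.0), b = rnorm(50))

run_cat <- function(bank, true_theta, max_items = 20, se_target = 0.32) {
  grid <- seq(-4, 4, by = 0.01)
  posterior <- dnorm(grid) / sum(dnorm(grid))  # standard normal prior
  theta_hat <- 0                               # start at the prior mean
  administered <- integer(0)
  responses <- integer(0)
  repeat {
    # Step 1: search the remaining pool for the most informative item.
    info <- item_info(theta_hat, bank$a, bank$b)
    info[administered] <- -Inf
    item <- which.max(info)
    # Step 2: present the item; here the answer is simulated.
    x <- rbinom(1, 1, p_correct(true_theta, bank$a[item], bank$b[item]))
    administered <- c(administered, item)
    responses <- c(responses, x)
    # Step 3: update the ability estimate from all answers so far.
    p <- p_correct(grid, bank$a[item], bank$b[item])
    posterior <- posterior * (if (x == 1) p else 1 - p)
    posterior <- posterior / sum(posterior)
    theta_hat <- sum(grid * posterior)
    sem <- sqrt(sum((grid - theta_hat)^2 * posterior))
    # Step 4: repeat until a termination criterion is met.
    if (length(administered) >= max_items || sem <= se_target) break
  }
  list(theta = theta_hat, sem = sem, items = administered, responses = responses)
}

result <- run_cat(bank, true_theta = 1)
result$theta
result$sem
length(result$items)
```

The test ends either at 20 items or once the standard error drops to 0.32, whichever comes first.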
• Efficiency: how many items do I need to ask before I reach a certain level of precision?
• Precision: how precise can my measurement be?
• What do we need for CAT?
◦ Item information (questions, scoring keys)
◦ Item parameters
◦ Item selection method
◦ Scoring algorithm
◦ Stopping rule
◦ Others ……
catR package
 Women’s Mobility
◦ Item 1: Go to any part of the village/town/city.
◦ Item 2: Go outside the village/town/city.
◦ Item 3: Talk to a man you do not know.
◦ Item 4: Go to a cinema/cultural show.
◦ Item 5: Go shopping.
◦ Item 6: Go to a cooperative/mothers' club/other club.
◦ Item 7: Attend a political meeting.
◦ Item 8: Go to a health centre/hospital.
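library(ltm)   # provides ltm() and the Mobility data set
library(catR)  # provides thetaEst(), semTheta() and nextItem()

# Fit a 2PL model to the Women's Mobility items
my2pl <- ltm(Mobility ~ z1)
plot(my2pl, type = "IIC")  # item information curves

# Build the item bank with columns a (discrimination), b (difficulty), c = 0, d = 1
pars <- coef(my2pl)  # column 1: difficulty, column 2: discrimination
itemBank <- cbind(pars[, 2], pars[, 1], 0, 1)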
Choose the item to start with:
 max info around average?
plot(my2pl, type = "IIC")
plot(my2pl, type = "IIC", items=4)
 Random one?
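items_administered <- c(4)  # start with item 4
responses <- c(1)           # response to item 4, coded 1

# Parameters of the items administered so far
it <- itemBank[items_administered, 1:4, drop = FALSE]
theta <- thetaEst(it, responses)  # ability estimate (catR default method: Bayes modal)
sem <- semTheta(theta, it)        # standard error of the estimate

# Pick the next item, excluding those already administered (catR default criterion: MFI)
q <- nextItem(itemBank, theta = theta, out = items_administered)
q$item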
[Diagram: structure of a CAT in Concerto, with the labels Introduction, Items, Feedback, Bank, Parameters, Responses, Logic, Theta, SEM]
 Concerto hosting website
◦ https://hosting.concertoplatform.com/user/registration
 Sign up and log in
 Create your own server
 Start your Concerto experience
 Name
 URL
 Node:
◦ info
◦ questionnaire
◦ CAT
◦ form (save_data)
◦ feedback
 Basic questionnaire

 CES-D scale (The Center for Epidemiologic Studies Depression Scale; Radloff, 1977)
◦ 20 items
◦ 4 response options
◦ Score above 16 indicates depression
 https://concertotest.com/luning/SSRMC/test/cesd
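Scoring a questionnaire like this amounts to summing the item responses and comparing the total with the cutoff. The sketch below makes assumptions that are not stated on the slide: responses are coded 0 to 3, and items 4, 8, 12 and 16 are treated as positively worded and reverse-scored. The cutoff rule ("above 16") follows the slide. The function name is hypothetical.

```r
# Sum score for a 20-item scale with four response options coded 0..3.
# reverse_items and the 0..3 coding are assumptions, not taken from the slide.
score_cesd <- function(responses, reverse_items = c(4, 8, 12, 16), cutoff = 16) {
  stopifnot(length(responses) == 20, all(responses %in% 0:3))
  responses[reverse_items] <- 3 - responses[reverse_items]
  total <- sum(responses)
  list(total = total, above_cutoff = total > cutoff)
}

# Example: every item answered with option 1.
example <- score_cesd(rep(1, 20))
example$total         # 16 regular items * 1 + 4 reversed items * 2 = 24
example$above_cutoff  # TRUE
```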

Radloff, L. S. (1977). The CES-D scale: A self-report depression scale for research in the general population. Applied Psychological Measurement, 1(3), 385-401.
 CAT – dichotomous

 Women’s Mobility
◦ 8 items in the item bank
◦ Item selection: MFI
◦ Scoring: BM
◦ Stopping: 3 items
◦ Randomesque: 1
◦ Content balancing: no
◦ Feedback:
 score$score<-round(score$theta*15+100,0)
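The feedback line rescales theta to a score with mean 100 and standard deviation 15. A small sketch of the same arithmetic as a reusable function; the function name and its `mean` and `sd` arguments are illustrative.

```r
# Convert a theta estimate (mean 0, SD 1) to a standard score (mean 100, SD 15).
theta_to_score <- function(theta, mean = 100, sd = 15) {
  round(theta * sd + mean, 0)
}

theta_to_score(c(-1, 0, 0.53, 2))
# 85 100 108 130
```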
 faceiq.icar-project.com
◦ Adaptive face detection test
◦ Adaptive emotion recognition test
◦ Adaptive abstract reasoning test
◦ And more ……
 Any questions?
