Chapter 8 | Performance Assessment
Lights … Action … Computers!
Two major educational assessment consortia were established in 2010: the Partnership for the Assessment of Readiness for College and Careers (PARCC) and the Smarter Balanced Assessment Consortium (SBAC). Early on, both of those federally funded collaboratives promised that a meaningful number of computer-based performance tests would be included in the assessments they intended to develop for the 2014–15 school year. The SBAC tests were, in fact, used during the 2014–15 school year; the PARCC tests were administered one year later, during the 2015–16 school year. These performance tests were designed to take full advantage of technological advances in the presentation and scoring of computer-governed test items. Leaders of both SBAC and PARCC were confident that, because they intended to employ computer-controlled items for their new assessments, and could require students to respond to those items on computers, it would be possible to present test-takers with more challenging, yet authentic, kinds of performance tests.

American educators have now had an opportunity to review the performance tests generated by PARCC and SBAC, so teachers and administrators in many states are forming their own judgments about the merits of those assessments. From the perspective of the item-writers who must construct computer-enhanced performance assessments, it is an exciting opportunity to create assessments simply swathed in verisimilitude (the appearance or semblance of truth). Indeed, creating such items may be only slightly more exciting to those item-writers than being able to dress up their conversations by tossing in an occasional “verisimilitude.”

However, glitter alone won’t help a guitar produce good music. Educators will need to apply the same levels of evaluative scrutiny to fancy computer-generated performance tests that they previously applied to performance tests served up in paper-and-ink test booklets. That is, can we base valid interpretations about a student’s mastery of a worthwhile curricular aim on a student’s interaction with one of these glitzy, computer-governed performance tests? If the tests do, as hoped, provide evidence regarding a student’s status related to a worthwhile curricular aim, can teachers determine what sort of instructional support is needed by a student whose performance-test efforts are flawed? If a computer-based performance test is loaded with sparkle and spangles, yet turns out to be merely an intelligence test, was that performance test really worth building?

It is when you try to decide how much confidence to place in inferences based on your students’ performance-test results that good, hard thinking needs to prevail. Do not defer, automatically, to the results of flashy, computer-managed assessments. Instead, think through carefully—for yourself—just what’s going on when students are asked to complete an innovative performance test. Then, with that understanding of “what’s going on” in mind as you arrive at a score-based inference, decide for yourself whether to trust the evidence staring you in the face.
Once you’ve selected your evaluative criteria, you then need to apply them reliably to the judgment of students’ responses. If the nature of the performance test task calls for students to create some sort of product, such as a written report of an experiment carried out in a biology class, then at your leisure you can rate the product’s quality in relation to the criteria you’ve identified as important. For example, if you had decided on three criteria to use in evaluating students’ reports of biology experiments, and could award from 0 to 4 points for each criterion, then you could leisurely assign from 0 to 12 points for each written report. The more clearly you understand what each evaluative criterion is, and what it means to award a different number of points on whatever scale you’ve selected, the more accurate your scores will be. Performance tests that yield student products are easier to rate because you can rate students’ responses when you’re in the mood.

It is often the case with performance tests, however, that the student’s performance takes the form of some kind of behavior. With such performance tests, it will usually be necessary for you to observe the behavior as it takes place. To illustrate, suppose that you are an elementary school teacher whose fifth-grade students have been carrying out fairly elaborate social studies projects culminating in 15-minute oral reports to classmates. Unless you have the equipment to videotape your students’ oral presentations, you’ll have to observe the oral reports and make judgments about the quality of a student’s performance as it occurs.

As was true when scores were given to student products, in making evaluative judgments about students’ behavior, you will apply whatever criteria you’ve chosen and assign what you consider to be the appropriate number of points on the scales you are using. For some observations, you’ll find it sensible to make instant, on-the-spot quality judgments. For instance, if you are judging students’ social studies reports on the basis of (1) content, (2) organization, and (3) presentation, you might make observation-based judgments on each of those three criteria as soon as a report is finished. In other cases, your observations might incorporate a delayed evaluative approach. For instance, let’s say that you are working with students in a speech class on the elimination of “filler words and sounds,” two of the most prominent of which are starting a sentence with “well” and interjecting frequent “uh”s into a presentation. In the nonevaluative phase of the observation, you could simply count the number of “well”s and “uh”s uttered by a student. Then, at a later time, you could decide on a point allocation for the criterion “avoids filler words and sounds.”

Putting it another way, systematic observations may be set up so you make immediate or delayed allocations of points for the evaluative criteria you’ve chosen. If the evaluative criteria involve qualitative factors that must be appraised more judgmentally, then on-the-spot evaluations and point assignments are typically the way to go. If the evaluative criteria involve more quantitative factors, then a “count now and judge later” approach usually works better.
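For readers who find a concrete tally helpful, here is a minimal sketch of the scoring arithmetic described above, written in Python. It assumes the 0-to-4 point scale from the biology-report example and borrows the content, organization, and presentation criteria from the social studies example; the function names, the sample ratings, and the filler-word cut-offs are hypothetical illustrations, not anything prescribed in this chapter.

# Minimal, hypothetical sketch of criterion-based scoring.
# Criterion names and the 0-4 scale come from the chapter's examples;
# everything else (functions, sample values, cut-offs) is illustrative only.

CRITERIA = ("content", "organization", "presentation")
MAX_POINTS_PER_CRITERION = 4  # each criterion is awarded 0 to 4 points

def score_product(ratings):
    """Sum a teacher's per-criterion ratings into a total (0 to 12 here)."""
    for criterion in CRITERIA:
        points = ratings[criterion]
        if not 0 <= points <= MAX_POINTS_PER_CRITERION:
            raise ValueError(f"{criterion}: {points} is outside the 0-4 scale")
    return sum(ratings[criterion] for criterion in CRITERIA)

def delayed_filler_word_points(filler_count):
    """'Count now, judge later': convert a tally of 'well's and 'uh's into
    points for the criterion 'avoids filler words and sounds'.
    The cut-offs below are invented purely for illustration."""
    if filler_count <= 2:
        return 4
    if filler_count <= 5:
        return 3
    if filler_count <= 9:
        return 2
    if filler_count <= 14:
        return 1
    return 0

if __name__ == "__main__":
    # Immediate, on-the-spot judgment of a written report (a student product).
    report_ratings = {"content": 3, "organization": 4, "presentation": 2}
    print(score_product(report_ratings))        # 9 out of a possible 12

    # Delayed judgment of an observed speech: tally first, assign points later.
    print(delayed_filler_word_points(7))        # 2 points on the 0-4 scale

The two functions mirror the two timing options in the text: the first assumes the teacher judges each criterion on the spot and simply totals the points, while the second keeps the observation phase nonevaluative (a raw count) and defers the conversion to points until later.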