Monday, June 13, 2016

Shining a Spotlight on the Dark Corners of the College Board: The Development Process for the Redesigned SAT

We worked on the Item Specifications for the SAT for several months before they were published in April 2014. Cyndie Schmeiser was in charge of developing the document, but David Coleman and other executives reviewed it as well. It is worth noting that the sample items included in this document were more thoroughly reviewed than the items on the operational SAT tests. It is also worth noting that the entire document is 210 pages, yet Appendix A (Item Development Process) is only 9 pages, with some very important task descriptions getting only a short paragraph.

The centerpiece of Appendix A is a graphical representation of the SAT development process (shown below). This graphic was included in Appendix A because it represents the industry’s best practices, but this is not how the SAT was developed. Step 4, as I’ve written before, took place only in the imaginations of the authors of the Item Specifications for the SAT.

We first implemented Step 4 in August of 2014, after thousands of items had already been developed and pretested without this crucial step. About 200 items were sent to the Content Advisory Committee for review. Their feedback was scathing. One committee member wrote an 11-page document letting the College Board know that these were the worst items he had ever seen. In the past, he had not seen the worst items because they were rejected due to poor item statistics. In fact, 15-20 percent of the items that are pretested are rejected due to poor performance. Even after those rejections, the College Board still needs to include extensively revised items on operational SAT forms to meet blueprint.
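To put those rejection rates in perspective, here is a rough back-of-the-envelope sketch in Python. The 15-20 percent rejection rate comes from the paragraph above; the 154-item form size is an illustrative assumption for one full operational form, not a figure taken from the Item Specifications.

```python
# Rough pretest-yield arithmetic. The rejection rates are from the post;
# the 154-item operational form size is an illustrative assumption.
needed_operational_items = 154  # hypothetical items needed to meet blueprint

for rejection_rate in (0.15, 0.20):
    surviving_fraction = 1 - rejection_rate
    must_pretest = needed_operational_items / surviving_fraction
    print(f"At {rejection_rate:.0%} rejection, roughly {must_pretest:.0f} items "
          f"must be pretested to yield {needed_operational_items} usable ones.")
```

Under these assumptions, roughly 180 to 190 items would have to be pretested to fill a single form, before accounting for the extensively revised items mentioned above.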

How does skipping Step 4 affect students?

  • They spend up to one-third of their testing time on experimental sections, answering items that are potentially flawed, instead of answering items that actually count toward their SAT score.
  • They have to answer operational items that were extensively revised after pretesting to fix the problems that would have been fixed had the College Board not taken shortcuts.

Can the College Board talk its way out of this? They will try to do so using Step 9 (Postoperational Statistical Reviews). Here is what a publication sponsored by the Council of Chief State School Officers (CCSSO), developed in cooperation with the Technical Issues in Large-Scale Assessment (TILSA) collaborative under the leadership of Doug Rindone, has to say about that:

Equating as a Repair Shop

This misconception refers to the belief that by equating test forms, problems rooted in test development can be corrected. In this erroneous view, items used operationally that are later found to be problematic, based on substantive technical review, can be “equated away.” People new to assessment sometimes see equating as a sort of mathematical equalizer tool capable of absorbing a multitude of variations between two test forms: significant changes in item positioning, changes to the content standards that the items are intended to measure, and changes to the items themselves. In fact, changes such as these are not factored into the equating but instead pose real challenges—and sometimes outright threats—to validity.
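To make the misconception concrete, here is a minimal sketch of linear (mean-sigma) equating in Python. It is illustrative only and is not the College Board's operational equating procedure; the scores and reference statistics are invented. The point is that the transformation works from group-level summary statistics on whole forms, so it has no way to detect, let alone repair, a flawed item.

```python
import numpy as np

# Minimal mean-sigma linear equating sketch (illustrative only).
# Raw scores on a new form Y are mapped onto the scale of an old form X
# using only group means and standard deviations.
def mean_sigma_equate(y_scores, x_mean, x_sd):
    """Map form-Y raw scores to form X's scale: l(y) = (sd_x/sd_y)*(y - mean_y) + mean_x."""
    y_scores = np.asarray(y_scores, dtype=float)
    y_mean, y_sd = y_scores.mean(), y_scores.std()
    return (x_sd / y_sd) * (y_scores - y_mean) + x_mean

# Hypothetical raw scores on the new form, and invented reference statistics
# for the old form's score distribution.
new_form_scores = [32, 41, 45, 50, 38, 47]
equated = mean_sigma_equate(new_form_scores, x_mean=44.0, x_sd=6.5)
print(equated)

# Nothing in this transformation "sees" individual items, so a defective
# item's effect on what the test measures cannot be equated away.
```

The design point of the sketch: equating adjusts the score scale so that forms of slightly different difficulty report comparable scores; it does not touch item content, which is why postoperational statistical reviews cannot substitute for the skipped committee review in Step 4.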


