Bibliographic Record - Detail View
Authors | Sykes, Robert C.; and others |
---|---|
Title | Scaling Polytomous Items That Have Been Scored by Two Raters. |
Source | (1996), (52 pages)
Full text as PDF |
Language | English |
Document type | print; online; monograph |
Keywords | Constructed Response; High Schools; Item Response Theory; Mathematics Education; Multiple Choice Tests; Reading Instruction; Responses; Scaling; Science Education; Scores; Scoring; Standardized Tests; Test Items; High school; Secondary school; Multiple-choice examinations; Multiple-choice tests; Multiple-choice procedure; Scale construction; Evaluation; Standardised tests; Test content |
Abstract | The presence of multiple readings of a student response to a constructed-response item in a large-scale assessment requires a procedure for combining the ratings to obtain an item score. An alternative to the averaged item ratings that are usually used is the summing of ratings for each item. This study evaluated the effect of summing as opposed to averaging ratings in situations when both polytomous constructed-response and dichotomous selected-response (multiple choice) items were used to measure one construct and then placed on a common scale. The effects of these two aggregation methods on two item response theory models, the Rasch model, and a combination of three-parameter logistic and generalized partial credit models (the "generalized" model), were also studied. Data came from three forms of a state high school proficiency test. The effect of summing, as opposed to averaging ratings, varied across the three content areas of mathematics, reading, and science, when evaluated with the generalized model. For reading, summing reduced test information in the lower portion of the scale and increased it in the upper portions. For mathematics, the effect of summing ratings was to decrease the precision of ability estimates, and in science, summed ratings resulted in test information that was increased or decreased relative to averaged ratings in different parts of the scale. Six appendixes present supplemental information on summed ratings and item parameters for the content areas. (Contains four tables, six figures, and nine references.) (SLD) |
Recorded by | ERIC (Education Resources Information Center), Washington, DC |
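
The two aggregation methods compared in the abstract can be illustrated with a minimal sketch. The ratings, the 0-4 rubric, and the function names below are hypothetical and are not taken from the study; the sketch shows only how two raters' ratings combine into one item score under each method, not the item response theory scaling that follows.

```python
# Hypothetical data: each tuple holds (rater 1, rater 2) ratings for one
# student response to a constructed-response item, on an assumed 0-4 rubric.
ratings = [(2, 3), (4, 4), (0, 1), (3, 3)]


def summed_score(pair):
    """Item score as the sum of the two ratings (range 0-8 here)."""
    return sum(pair)


def averaged_score(pair):
    """Item score as the mean of the two ratings (range 0-4 here)."""
    return sum(pair) / len(pair)


summed = [summed_score(p) for p in ratings]
averaged = [averaged_score(p) for p in ratings]

print("summed:  ", summed)
print("averaged:", averaged)
```

With an assumed 0-4 rubric, summing places the item on a 0-8 score range, while averaging keeps it on 0-4 with possible half-point values. How half-point averages were handled in the study is not stated in this record.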