The Robustness of IRT-Based Vertical Scaling Methods to Violation of Unidimensionality

Author	Yin, Liqun
Title	The Robustness of IRT-Based Vertical Scaling Methods to Violation of Unidimensionality
Source	(2013), (182 pages). Ph.D. Dissertation, University of Pittsburgh
Language	English
Document type	print; online; monograph
ISBN	978-1-3034-2896-8
Keywords	Academic thesis; Dissertation; Item Response Theory; Scaling; Robustness (Statistics); Monte Carlo Methods; Tests; Grade 3; Grade 4; Grade 5; Grade 6; Grade 7; Grade 8; Test Bias; Thesis; Dissertations; Scale construction; Examination; School year 03; School year 04; School year 05; School year 06; School year 07; School year 08; Test criticism
Abstract	In recent years, many states have adopted vertically scaled tests based on Item Response Theory (IRT) because of their compelling features in a growth-based accountability context. However, selecting a practical and effective calibration/scaling method and properly understanding the issues raised by possible multidimensionality in the test data are critical to ensuring their accuracy and reliability. This study uses Monte Carlo simulation to investigate the robustness of various unidimensional scaling methods under different test conditions and different degrees of departure from unidimensionality in a common-item nonequivalent groups design (grades 3 to 8). The main research questions are: 1) Which calibration/scaling method (concurrent, semi-concurrent, separate calibration with Stocking-Lord (SL) scaling, separate calibration with mean/sigma scaling, or pair-wise calibration) yields the least biased ability estimates in the vertical scaling context? and 2) How do different degrees of multidimensionality affect the use of these methods? Results indicate that the calibration and scaling methods perform very differently under different test conditions, especially for the grades furthest from the base grade. Under the unidimensional condition, the five calibration and linking methods produced very similar results for grades close to the base grade, grade 5. However, for grades 7 and 8, semi-concurrent and concurrent calibrations yielded more biased results, while the results for the other three methods were comparable. Under multidimensional conditions, all five methods produced more biased results, and the bias patterns differed across methods. In general, the more severe the multidimensionality, the larger the bias. Among the five methods compared, separate calibration with SL linking is the most robust to variations in multidimensionality. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone at 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.] (As Provided).
Notes	ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site: http://www.proquest.com/en-US/products/dissertations/individuals.shtml
Recorded by	ERIC (Education Resources Information Center), Washington, DC
Update	2020/1/01