Evaluating the Construct Validity of an Automated Writing Evaluation System with a Randomization Algorithm

Authors	Myers, Matthew C.; Wilson, Joshua
Title	Evaluating the Construct Validity of an Automated Writing Evaluation System with a Randomization Algorithm
Source	In: International Journal of Artificial Intelligence in Education, 33 (2023) 3, pp. 609-634 (26 pages)
Additional information	ORCID (Myers, Matthew C.) ORCID (Wilson, Joshua)
Language	English
Document type	print; online; journal article
ISSN	1560-4292
DOI	10.1007/s40593-022-00301-6
Keywords	Construct Validity; Automation; Writing Evaluation; Algorithms; Scoring; Persuasive Discourse; Essays; Middle School Students; Grade 7; Grade 8; Programming Languages; Scores; Sentences; Concept Formation; Text Structure; Formative Evaluation; Feedback (Response); Computer Assisted Testing; Evaluation; Persuasion; Persuasive communication; Composition instruction; Middle school; Student; School year 07; School year 08; Sentence analysis; Concept learning
Abstract	This study evaluated the construct validity of six scoring traits of an automated writing evaluation (AWE) system called "MI Write." Persuasive essays (N = 100) written by students in grades 7 and 8 were randomized at the sentence-level using a script written with Python's NLTK module. Each persuasive essay was randomized 30 times (n = 3000 total randomizations), and the mean trait scores for each set of randomized iterations were compared to those of the control text across all traits. We were specifically interested in evaluating the effects of randomization on the high-level traits of "idea development" and "organization." Given the rubrics and qualitative feedback provided by MI Write, we hypothesized that these high-level traits ought to be sensitive to sentence-level randomization (i.e., scores should decrease). Overall, complete randomizations did not consistently significantly impact trait scoring for these high-level writing traits. In fact, more than a third of the essays saw significant increases in one or both high-level traits despite randomization, indicating a disconnect between MI Write's formative feedback and its underlying constructs. Findings have implications for consumers and developers of AWE. (As Provided).
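The sentence-level randomization described in the abstract can be illustrated with a minimal sketch. The study's own script is not reproduced in this record and used Python's NLTK module; the version below swaps in a naive regular-expression sentence splitter so that it runs with the standard library alone. The function names (`split_sentences`, `randomize_essay`, `mean_score_change`), the fixed seed, and the splitting rule are all assumptions made for illustration, and the trait scores themselves would have to come from the AWE system rather than from this code.

```python
import random
import re
from statistics import mean


def split_sentences(text):
    """Naive stand-in for an NLTK sentence tokenizer (assumption):
    split after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def randomize_essay(text, n_iterations=30, seed=0):
    """Return n_iterations copies of the essay, each with its
    sentences shuffled into a random order (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    sentences = split_sentences(text)
    versions = []
    for _ in range(n_iterations):
        shuffled = sentences[:]  # copy so the original order is kept
        rng.shuffle(shuffled)
        versions.append(" ".join(shuffled))
    return versions


def mean_score_change(control_score, randomized_scores):
    """Mean trait score of the randomized versions minus the score
    of the unshuffled control text (scores supplied externally)."""
    return mean(randomized_scores) - control_score
```

Under these assumptions, calling `randomize_essay(essay)` on each of the 100 essays would yield the 3000 randomized texts mentioned in the abstract, and a negative value from `mean_score_change` for "idea development" or "organization" would correspond to the score decrease the authors hypothesized.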
Notes	Springer. Available from: Springer Nature. One New York Plaza, Suite 4600, New York, NY 10004. Tel: 800-777-4643; Tel: 212-460-1500; Fax: 212-460-1700; e-mail: customerservice@springernature.com; Web site: https://link.springer.com/
Recorded by	ERIC (Education Resources Information Center), Washington, DC
Update	2024/1/01