Bibliographic Record - Detail View

 
Authors: Somers, Marie-Andrée; Zhu, Pei; Jacob, Robin; Bloom, Howard
Institution: MDRC
Title: The Validity and Precision of the Comparative Interrupted Time Series Design and the Difference-in-Difference Design in Educational Evaluation
Source: (2013), (152 pages)
Availability: free full-text PDF
Language: English
Document type: print; online; monograph
Keywords: Quantitative Data; Research Design; Educational Assessment; Time; Intervals; Reading Programs; Comparative Analysis; Inferences; Statistical Bias; Scores; Validity; Accuracy; Reading Tests; Mathematics; Matched Groups; Statistical Analysis; Regression (Statistics); Pretests Posttests; Grade 3
Abstract: In this paper, we examine the validity and precision of two nonexperimental study designs (NXDs) that can be used in educational evaluation: the comparative interrupted time series (CITS) design and the difference-in-difference (DD) design. In a CITS design, program impacts are evaluated by looking at whether the treatment group deviates from its "baseline trend" by a greater amount than the comparison group. The DD design is a simplification of the CITS design--it evaluates the impact of a program by looking at whether the treatment group deviates from its "baseline mean" by a greater amount than the comparison group. The CITS design is a more rigorous design in theory, because it implicitly controls for differences in the baseline mean "and" trends between the treatment and comparison group. However, the CITS design has more stringent data requirements than the DD design: Scores must be available for at least four time points before the intervention begins in order to estimate the baseline trend, which may not always be feasible. This paper examines the properties of these two designs using the example of the federal Reading First program, as implemented in a midwestern state. The true impact of Reading First in this state is known, because program effects can be evaluated using a regression discontinuity (RD) design, which is as rigorous as a randomized experiment under certain conditions. The application of the RD design to evaluate Reading First is a special case of the design, because not only are all conditions for internal validity met, but also impact estimates appear to be generalizable to all schools. Therefore, the RD design can be used to obtain a "causal benchmark" against which to compare the impact findings obtained from the CITS or DD design and to gauge the causal validity of these two designs. We explore several specific questions related to the CITS and DD designs.
First, we examine whether a well-executed CITS design and/or DD design can produce valid inferences about the effectiveness of a school-level intervention such as Reading First, in situations where it is not feasible to choose comparison schools in the same districts as the treatment schools (which is recommended in the matching literature). Second, we explore the trade-off between bias reduction and precision loss across different methods of selecting comparison groups for the CITS/DD designs (for example, one-to-one versus one-to-many matching, and matching with replacement versus without replacement). Third, we examine whether matching the comparison schools on pre-intervention test scores "only" is sufficient for producing causally valid impact estimates, or whether bias can be further reduced by also matching on baseline demographic characteristics (in addition to baseline test scores). And fourth, we examine how the CITS design performs relative to the DD design, with respect to bias and precision. Estimated bias in this paper is defined as the difference between the RD impact estimate and the CITS/DD impact estimates. Overall, we find no evidence that the CITS and DD designs produce biased estimates of Reading First impacts, even though choosing comparison schools from the same districts as the treatment schools was not possible. We conclude that all comparison group selection methods provide causally valid estimates but that estimates from the radius matching method (described in the paper) are substantially more precise due to the larger sample size it can produce. We find that matching on demographic characteristics (in addition to pretest scores) does not further reduce bias. And finally, we find that both the CITS and DD designs appear to produce causally valid inferences about program impacts. 
However, because our analyses are based on an especially strong (and possibly atypical) application of the CITS and DD designs, these findings may not be generalizable to other contexts. The following are appended: (1) Specification Tests for the Regression Discontinuity Design; (2) Minimum Detectable Effect Size for Nonexperimental Designs; (3) Characteristics of Comparison Groups; (4) CITS and DD Impact Estimates; (5) Statistical Tests of Differences Between Impact Estimates; and (6) Propensity-Score Matching Versus Direct Matching. (As Provided).
Notes: MDRC. 16 East 34th Street, 19th Floor, New York, NY 10016-4326. Tel: 212-532-3200; Fax: 212-684-0832; e-mail: publications@mdrc.org; Web site: http://www.mdrc.org
Indexed by: ERIC (Education Resources Information Center), Washington, DC
Updated: 2017/4/10
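The two estimators contrasted in the abstract can be sketched in a few lines. The data below are hypothetical, and the paper's actual analyses fit regression models with covariates on school-level scores, so this is only a minimal illustration of the core logic: DD compares pre-to-post changes in group means, while CITS compares each group's post-period deviation from its own projected baseline trend (which is why CITS needs at least four pre-intervention time points).

```python
def mean(xs):
    return sum(xs) / len(xs)

def linear_fit(ts, ys):
    """Least-squares intercept and slope for a group's baseline trend."""
    tbar, ybar = mean(ts), mean(ys)
    slope = (sum((t - tbar) * (y - ybar) for t, y in zip(ts, ys))
             / sum((t - tbar) ** 2 for t in ts))
    return ybar - slope * tbar, slope

def dd_estimate(pre_t, post_t, pre_c, post_c):
    """DD: difference in pre-to-post mean changes, treatment vs. comparison."""
    return (mean(post_t) - mean(pre_t)) - (mean(post_c) - mean(pre_c))

def cits_estimate(pre_years, pre_t, pre_c, post_year, post_t, post_c):
    """CITS: difference in deviations from each group's projected baseline trend."""
    a_t, b_t = linear_fit(pre_years, pre_t)
    a_c, b_c = linear_fit(pre_years, pre_c)
    dev_t = mean(post_t) - (a_t + b_t * post_year)
    dev_c = mean(post_c) - (a_c + b_c * post_year)
    return dev_t - dev_c

# Hypothetical scores: four pre-intervention years (the minimum the paper
# notes for estimating a baseline trend), then one post-intervention year.
years = [1, 2, 3, 4]
treat_pre = [50.0, 51.0, 52.0, 53.0]   # baseline trend: +1.0 point/year
comp_pre  = [48.0, 48.5, 49.0, 49.5]   # baseline trend: +0.5 point/year
treat_post, comp_post = [57.0], [50.0]  # year-5 outcomes

dd = dd_estimate(treat_pre, treat_post, comp_pre, comp_post)      # 4.25
cits = cits_estimate(years, treat_pre, comp_pre, 5,
                     treat_post, comp_post)                        # 3.0
print(dd, cits)
```

Because the treatment group's baseline trend is steeper here, DD (which ignores trends) attributes part of that pre-existing growth to the program, while CITS nets it out; this is the sense in which the abstract calls CITS "more rigorous in theory."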