Harvard University - Department of Statistics
The supposed decline of the U.S. educational system, including its causes and solutions, has been a popular topic of debate in recent years. Part of the difficulty in resolving this debate is the lack of solid empirical evidence regarding the true impact of educational initiatives; for example, educational researchers are rarely able to conduct controlled, randomized experiments. The efficacy of so-called "school choice" programs has been a particularly contentious issue. A current multi-million-dollar evaluation of the New York School Choice Scholarship Program (NYSCSP) endeavors to shed some light on this issue. This study compares favorably with other school choice evaluations in the thought that went into its randomized experimental design (a completely new design, the Propensity Matched Pairs Design, is being implemented) and in its rigorous data collection and compliance-encouraging efforts. At first glance, it would appear that the evaluation of the NYSCSP could proceed without undue statistical complexity. Unfortunately, this program evaluation, as is common in studies with human subjects, suffers from some complications:
a) Non-compliance. Approximately 25% of children who were awarded scholarships decided not to use them, and approximately 10% of those not awarded scholarships attended private school anyway.
b) Missing data. Some parents failed to fully complete survey information, some children did not take pre-tests, and some children failed to show up for post-tests. Levels of missing data range from approximately 3% to 50% across variables.
Work by Frangakis and Rubin (1997) has revealed the severe threats to valid estimates of experimental effects that can exist in the presence of non-compliance and missing data, even for estimation of simple intention-to-treat effects.
The evaluation of the NY School Choice Scholarship Program is a prime example of the bridge between observational studies and randomized designs. Non-compliance and missing data force researchers to make assumptions similar to those made in good observational studies in order to return the study to the ideal of a perfectly controlled randomized experiment. The Bayesian methodology used is an extension of the standard instrumental variables (IV) methodology that also allows for proper treatment of missing data, under more reasonable assumptions than those needed for naive treatment of missing data.
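The standard IV machinery that this methodology extends can be illustrated with a small simulation. This is a hedged sketch, not the study's actual analysis: the compliance mix mirrors the figures above (~25% never-takers among scholarship winners, ~10% always-takers among non-winners), but the outcome model and the 5-point attendance effect are invented for illustration, and the simple Wald estimator shown handles neither missing data nor the Bayesian extensions described.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Compliance types loosely mirroring the abstract: ~10% always-takers
# (attend private school regardless of the offer), ~25% never-takers
# (decline even if offered), remaining ~65% compliers.
u = rng.random(n)
always_taker = u < 0.10
never_taker = u >= 0.75

z = rng.integers(0, 2, n)                      # randomized scholarship offer
d = np.where(always_taker, 1,                  # actual private-school attendance
             np.where(never_taker, 0, z))

# Invented outcome model: baseline score of 50 plus a 5-point gain from
# private-school attendance, with noise.
y = 50 + 5 * d + rng.normal(0, 10, n)

# Intention-to-treat effect on scores: compare by offer, ignoring attendance.
itt_y = y[z == 1].mean() - y[z == 0].mean()
# Effect of the offer on attendance (roughly the complier share, ~0.65 here).
itt_d = d[z == 1].mean() - d[z == 0].mean()
# Standard IV (Wald) estimate of the complier average causal effect:
cace = itt_y / itt_d

print(f"ITT on scores:     {itt_y:.2f}")   # diluted: roughly 5 * 0.65
print(f"ITT on attendance: {itt_d:.2f}")
print(f"IV/CACE estimate:  {cace:.2f}")    # roughly recovers the 5-point effect
```

The ITT estimate is diluted by non-compliance, while the ratio of the two ITT effects recovers the effect for compliers; it is this kind of estimand that the missing-data complications described above can bias, motivating the Bayesian treatment.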