Pennsylvania State University - Department of Statistics
We assume that there is a real experiment with a real sample size "N". We ask whether it could be useful to imagine that the experiment had been conducted at a different sample size "n". Provided that n < N, it is easy to construct a large pool of such experiments by "subsampling": picking n of the N observations to use in the hypothetical experiment.
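As a minimal sketch of the subsampling idea (not the speaker's own implementation), the following Python code draws many hypothetical experiments of size n from one observed data set of size N and records a statistic for each. The function name `subsample_statistics`, the choice of the mean as the statistic, the simulated Gaussian data, and the number of subsamples are all assumptions made for illustration.

```python
import random
import statistics


def subsample_statistics(data, n, stat=statistics.mean, num_subsamples=1000, seed=0):
    """Compute `stat` on `num_subsamples` random subsamples of size n.

    Each subsample picks n of the N observations without replacement,
    so it plays the role of one hypothetical experiment of size n.
    """
    if not 0 < n <= len(data):
        raise ValueError("n must satisfy 0 < n <= N")
    rng = random.Random(seed)
    return [stat(rng.sample(data, n)) for _ in range(num_subsamples)]


# Illustrative data only: N = 200 simulated observations (an assumption).
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(200)]

# Pools of hypothetical experiments at two fictional sample sizes.
means_n20 = subsample_statistics(data, n=20)
means_n100 = subsample_statistics(data, n=100)

# The spread of the statistic across the pool shrinks as n grows.
spread_n20 = statistics.stdev(means_n20)
spread_n100 = statistics.stdev(means_n100)
```

Because each subsample is drawn without replacement, the pool describes experiments of size n taken from the observed N points; no distributional model is assumed.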
There is a simple heuristic. We have only one experiment of size N, but we can easily recreate many possible experiments at a size n < N. Thus we should be able to say much more about future experiments at the smaller sample size n, at least in a completely nonparametric sense.
Our initial interest in this topic comes from the following problem: we have an experiment with N data points and a model that is simple but clearly false. Is it still useful? We begin to answer by asking: at what fictional sample size n* would we find the model completely acceptable?
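One hypothetical way to make n* concrete is sketched below. The abstract does not state the acceptability criterion, so everything here is an assumption: the "simple but false" model is taken to be a Gaussian with mean `mu0` and standard deviation `sigma0`, acceptability is judged by a two-sided z-test at the 5% level, and n* is taken to be the largest size on a grid at which the test rejects in at most a fraction `tolerance` of subsamples.

```python
import math
import random


def rejection_rate(data, n, mu0, sigma0, num_subsamples=500, seed=0):
    """Fraction of size-n subsamples in which a z-test rejects mean == mu0."""
    rng = random.Random(seed)
    threshold = 1.96 * sigma0 / math.sqrt(n)  # two-sided 5% critical distance
    rejections = 0
    for _ in range(num_subsamples):
        sub = rng.sample(data, n)  # n of the N observations, without replacement
        if abs(sum(sub) / n - mu0) > threshold:
            rejections += 1
    return rejections / num_subsamples


def largest_acceptable_n(data, grid, mu0, sigma0, tolerance=0.5):
    """Largest n on the grid whose rejection rate is at most `tolerance`.

    The tolerance and the grid are illustrative choices, not the talk's.
    Returns (n_star, rates); n_star is None if no grid value passes.
    """
    rates = {n: rejection_rate(data, n, mu0, sigma0) for n in grid}
    passing = [n for n in grid if rates[n] <= tolerance]
    return (max(passing) if passing else None), rates


# Illustrative data: the true mean is 0.3, so the model "mean 0" is false.
data_rng = random.Random(2)
data = [data_rng.gauss(0.3, 1.0) for _ in range(500)]

n_star, rates = largest_acceptable_n(
    data, grid=[10, 25, 50, 100, 200, 400], mu0=0.0, sigma0=1.0
)
```

In this sketch the false model is rarely rejected at small n and almost always rejected near N, so the rejection rate as a function of n gives one rough reading of the sample size at which the model would still look acceptable.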
We then turn to a different set of applications. In a variety of settings, both asymptotic methods and bootstrapping methods are unreliable. The justification of both methods relies on certain terms in the asymptotic expansion of the statistic being of "lower order". However, subsampling methods (n out of N) perform quite well in these settings. Can we have our cake and eat it too? We consider methods that combine subsampling, at fictional sizes n, with asymptotic theory to produce results useful at sample size N.