Nov 16

3:30 pm

## The Emperor's New Tests: A Defense of the Likelihood Ratio Criterion

### Michael Perlman

Seminar

University of Washington - Department of Statistics

In recent years, two classes of examples have appeared in which the likelihood ratio test (LRT) for certain multiparameter hypothesis-testing problems appears to be flawed. In the first class, involving multivariate one-sided and order-restricted alternatives, the behavior of the LRT is compared at multivariate sample points x and x' such that x' lies "deeper" in the alternative than does x, yet the LRT accepts the null hypothesis for x' but rejects it for x. Like Silvapulle (1997), we argue that this conclusion is not anomalous but correct for the usual formulation of the null hypothesis accompanying one-sided and order-restricted alternatives. The alleged anomaly of the LRT occurs only for an alternative hypothesis given by an obtuse convex cone C. In this case, the reformulated null hypothesis "not C" both avoids the apparent anomaly and is often more appropriate scientifically.

The second class of examples, which includes the reformulated null hypothesis "not C" noted above as well as the so-called bioequivalence testing problem, consists of multiparameter hypothesis-testing problems with the following common features: the null hypothesis is composite, the size-α LRT is not similar and hence is biased, and competing size-α tests can be constructed that are less biased (even unbiased) and dominate the LRT in the sense of being everywhere more powerful! Nonetheless, we argue that the LRT is not inferior: in each case the supposedly superior test can yield unwarranted conclusions and, at best, is appropriate only for some restrictive prior distribution. This leads to the realization that in such multiparameter hypothesis-testing problems, the concepts of biasedness and of a more (or most) powerful size-α test may lead to undesirable statistical procedures. When these concepts conflict with statistical common sense, they, not the LR criterion, should be abandoned. We believe that the LRT remains the preferred first option for non-Bayesian parametric hypothesis-testing problems.

This is joint work with Lang Wu.