Apr 12

3:30 pm

## Statistical Test Procedures for Unreplicated Bland-Altman Method Comparison Plots

### Kevin Hayes

Seminar

University of Limerick

Altman and Bland (1983) criticise the use of correlation, regression and differences between means when analysing data arising from the experimental comparison of two techniques or methods of measurement. They propose a simple graphical technique based on a plot of case-wise differences between methods against case-wise means of the methods, hereafter referred to as the Bland-Altman method comparison plot. When two methods show good agreement, all case-wise differences are expected to fall within the limits of agreement, set at the average difference plus or minus 2 standard deviations of the differences. Ryan and Woodall (2005) report that the subsequent Lancet paper by Bland and Altman (1986) is the sixth most highly cited statistical paper ever. The Bland-Altman method has become the expected (often obligatory) approach for presenting determinations of method reliability in many scientific journals (see, for example, Hollis, 1996). The impact of this paper is perhaps due, in part, to the fact that it requires only an informal inspection of the plot, supplemented by the correlation coefficient of the plotted quantities. Surprisingly, the Bland-Altman methodology does not recommend any protocol based on statistical testing for distinguishing between the issues of bias and lack of precision.
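The quantities behind the plot are simple to compute. The following is a minimal sketch, not code from the talk: the function name `bland_altman_summary`, the multiplier argument `k` and the example readings are all illustrative assumptions, and the limits are taken as the mean difference plus or minus `k` sample standard deviations of the differences.

```python
from statistics import mean, stdev


def bland_altman_summary(method_a, method_b, k=2.0):
    """Return the plotted quantities and limits of agreement for paired readings.

    method_a, method_b: equal-length sequences, one reading per case.
    k: multiplier for the limits of agreement (2 in the abstract above).
    """
    if len(method_a) != len(method_b):
        raise ValueError("both methods must measure the same cases")
    if len(method_a) < 2:
        raise ValueError("at least two cases are needed to estimate a standard deviation")

    # Case-wise differences (vertical axis) and case-wise means (horizontal axis).
    diffs = [a - b for a, b in zip(method_a, method_b)]
    means = [(a + b) / 2 for a, b in zip(method_a, method_b)]

    bias = mean(diffs)   # average difference between the methods
    sd = stdev(diffs)    # sample standard deviation (n - 1 denominator)

    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "lower": bias - k * sd,
        "upper": bias + k * sd,
    }


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    a = [10.2, 11.5, 9.8, 12.1, 10.9]
    b = [10.0, 11.9, 9.5, 12.4, 10.6]
    summary = bland_altman_summary(a, b)
    print(summary["bias"], summary["lower"], summary["upper"])
```

Plotting `diffs` against `means` with horizontal lines at `bias`, `lower` and `upper` gives the method comparison plot; the sketch deliberately stops short of any formal test.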

This talk considers the problem of testing $\mu_1 = \mu_2$ and $\sigma_1^2 = \sigma_2^2$ using a random sample from a bivariate normal distribution with parameters $(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$. The new contribution is a decomposition of the Bradley-Blackwood test statistic (Bradley and Blackwood, 1989) for the simultaneous test of $\{\mu_1 = \mu_2,\ \sigma_1^2 = \sigma_2^2\}$ as a sum of two statistics. One is equivalent to the Pitman-Morgan (Pitman, 1939; Morgan, 1939) test statistic for $\sigma_1^2 = \sigma_2^2$, and the other is a new alternative to the standard paired t-test of $\mu_D = \mu_1 - \mu_2 = 0$. Surprisingly, the classic Student paired t-test makes no assumptions about the equality (or otherwise) of the variance parameters. The power functions for these tests are quite easy to derive, and show that when $\sigma_1^2 = \sigma_2^2$, the paired t-test has a slight advantage over the new alternative in terms of power, but when $\sigma_1^2 \neq \sigma_2^2$, the new test has substantially higher power than the paired t-test.
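For orientation, the three established statistics named above can be computed from the case-wise differences D and sums S of the paired readings. The sketch below uses their standard textbook forms and is an assumption-laden illustration, not the decomposition or the new test presented in the talk: the function name `paired_agreement_statistics` and its return keys are hypothetical. It relies on the fact that the covariance of D and S equals the difference of the two variances, so a zero correlation between D and S corresponds to equal variances (Pitman-Morgan), and on the Bradley-Blackwood F statistic being the joint test of zero intercept and zero slope in the regression of D on S.

```python
import math


def paired_agreement_statistics(x1, x2):
    """Standard paired-t, Pitman-Morgan and Bradley-Blackwood statistics.

    Assumes non-degenerate data: the sums must vary and D must not be an
    exact linear function of S, otherwise a division by zero occurs.
    """
    n = len(x1)
    if n != len(x2) or n < 3:
        raise ValueError("need at least three paired observations")

    d = [a - b for a, b in zip(x1, x2)]  # case-wise differences
    s = [a + b for a, b in zip(x1, x2)]  # case-wise sums
    d_bar = sum(d) / n
    s_bar = sum(s) / n

    sdd = sum((u - d_bar) ** 2 for u in d)
    sss = sum((v - s_bar) ** 2 for v in s)
    sds = sum((u - d_bar) * (v - s_bar) for u, v in zip(d, s))

    # Paired t statistic for a zero mean difference, n - 1 degrees of freedom.
    t_paired = d_bar / math.sqrt(sdd / (n - 1) / n)

    # Pitman-Morgan: t statistic for zero correlation between D and S,
    # n - 2 degrees of freedom.
    r = sds / math.sqrt(sdd * sss)
    t_pitman_morgan = r * math.sqrt((n - 2) / (1 - r * r))

    # Bradley-Blackwood: F statistic on (2, n - 2) degrees of freedom, where
    # SSE is the residual sum of squares from regressing D on S.
    sse = sdd - sds ** 2 / sss
    f_bb = ((sum(u * u for u in d) - sse) / 2) / (sse / (n - 2))

    # With 2 numerator degrees of freedom the F upper tail has a closed form.
    p_bb = (1 + 2 * f_bb / (n - 2)) ** (-(n - 2) / 2)

    return {
        "t_paired": t_paired,
        "t_pitman_morgan": t_pitman_morgan,
        "f_bradley_blackwood": f_bb,
        "p_bradley_blackwood": p_bb,
    }
```

Only the Bradley-Blackwood p-value is returned because its tail probability has a closed form; p-values for the two t statistics would need a t-distribution routine from a statistics library.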

While Bradley and Blackwood provide a test of the joint hypothesis of equal means and equal variances, their regression-based approach does not separate these two issues. Rejection of the joint hypothesis may be due to unequal means and unequal variances, unequal means and equal variances, or equal means and unequal variances. We propose an approach for resolving this (model selection) problem in a manner that controls the magnitudes of the relevant type I error probabilities.