Seminar Details


Friday

Nov 7

8:00 am

R-Squared Inference Under Non-Normal Error

Lei Xu

Final Exam

Advisor: Professor Ross L. Prentice

Assessment of the relationship between diet and health status, especially the association between diet and chronic disease risk, has attracted a lot of research interest in statistical and epidemiologic studies. However, because of measurement errors in commonly used self-reported assessment approaches, the expected strong relationship has not been identified in most studies. Developments in biomarker measures provide objective consumption assessments for specific dietary components, which are used to develop calibrated dietary consumption functions that remove the bias embedded in self-reported dietary measures. Researchers are interested in the explanatory strength of calibration equations and in comparing that strength among various self-report measures. Because R² is a common metric in these studies, reliable estimation of R² and of its confidence interval is important. Inference on R², including confidence intervals for R², has not attracted much attention in the statistical literature. In this dissertation we propose two methods to estimate a confidence interval for R² under errors from normal and non-normal distributions. The first method is based on asymptotic theory and entails the development of the asymptotic distribution of R², and its relevant functions, as the sample size becomes large. The second approach is based on the general F-test applied to linear regression, but adjusts the degrees-of-freedom parameters in the F-test statistic using the empirical skewness and kurtosis of the regression errors. In addition, when there are measurement errors in the independent variables, R² estimated directly from the regression can be biased and may, for example, underestimate the relationship between dependent and independent variables even with normally distributed errors. This dissertation also proposes a correction methodology to reduce the bias in R² estimation in the presence of classical additive measurement errors.
The proposed methodologies have been evaluated in simulation and applied to nutritional biomarker studies in the Women’s Health Initiative.
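
The bias from classical additive measurement error mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the dissertation's correction method. It assumes a single predictor, normally distributed errors, and hypothetical parameter values chosen only for illustration. In that simple setting, regressing the outcome on the error-prone measurement W = X + U instead of the true exposure X shrinks R-squared by roughly the reliability ratio var(X) / (var(X) + var(U)).

```python
import numpy as np


def r_squared(x, y):
    """R-squared of a simple linear regression of y on x.

    With one predictor this equals the squared Pearson correlation.
    """
    return float(np.corrcoef(x, y)[0, 1] ** 2)


def simulate_attenuation(n=200_000, beta=1.0, sd_x=1.0, sd_eps=1.0,
                         sd_u=1.0, seed=0):
    """Simulate classical additive measurement error in one predictor.

    All parameter values are illustrative assumptions, not taken from
    the dissertation or from any real study.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sd_x, n)                # true exposure
    y = beta * x + rng.normal(0.0, sd_eps, n)   # outcome
    w = x + rng.normal(0.0, sd_u, n)            # error-prone measurement
    reliability = sd_x**2 / (sd_x**2 + sd_u**2)
    return r_squared(x, y), r_squared(w, y), reliability


r2_true, r2_observed, reliability = simulate_attenuation()
print(f"R-squared using true X:        {r2_true:.3f}")
print(f"R-squared using error-prone W: {r2_observed:.3f}")
print(f"reliability * true R-squared:  {reliability * r2_true:.3f}")
```

With the default values the measurement error variance equals the exposure variance, so the reliability ratio is 0.5 and the observed R-squared should be close to half of the value obtained with the true exposure.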