University of Washington - Statistics
We consider the problem of testing and estimation of separable covariances for relational data sets in the context of the matrix-variate normal distribution. Relational data are often represented as a square matrix, the entries of which record the relationships between pairs of objects. Many statistical methods for the analysis of such data assume some degree of similarity or dependence between objects in terms of the way they relate to each other. However, formal tests for such dependence have not been developed. We develop a likelihood ratio test (LRT) for row and column dependence based on the observation of a single relational data matrix. We obtain a reference distribution for the LRT statistic, thereby providing an exact test for the presence of row or column correlations in a square relational data matrix. Additionally, we provide extensions of the test to accommodate common features of such data, such as undefined diagonal entries, a non-zero mean, multiple observations, and deviations from normality.
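As a concrete illustration of the separable covariance structure referred to above, the following minimal sketch draws one square matrix from a mean-zero matrix-variate normal distribution and checks numerically that the covariance of its vectorization is a Kronecker product of a column and a row covariance. The names `sample_matrix_normal`, `sigma_r`, and `sigma_c`, the dimension `m = 4`, and the exchangeable row covariance are all assumptions chosen for illustration; they are not taken from the talk, and the sketch does not implement the likelihood ratio test itself.

```python
import numpy as np


def sample_matrix_normal(rng, sigma_r, sigma_c):
    """Draw one mean-zero matrix Y = A Z B^T, where sigma_r = A A^T and sigma_c = B B^T."""
    a = np.linalg.cholesky(sigma_r)  # lower-triangular factor of the row covariance
    b = np.linalg.cholesky(sigma_c)  # lower-triangular factor of the column covariance
    z = rng.standard_normal((sigma_r.shape[0], sigma_c.shape[0]))  # i.i.d. N(0, 1) entries
    return a @ z @ b.T, z, a, b


def vec(x):
    """Stack the columns of x into a single vector (column-major order)."""
    return x.reshape(-1, order="F")


rng = np.random.default_rng(0)
m = 4  # hypothetical number of objects (rows = columns)

# Illustrative choice: correlated rows, independent columns.
sigma_r = 0.5 * np.eye(m) + 0.5 * np.ones((m, m))
sigma_c = np.eye(m)

y, z, a, b = sample_matrix_normal(rng, sigma_r, sigma_c)

# Identity: vec(A Z B^T) = (B kron A) vec(Z).
identity_holds = np.allclose(vec(y), np.kron(b, a) @ vec(z))

# Hence Cov(vec(Y)) = (B kron A)(B kron A)^T = sigma_c kron sigma_r.
factor = np.kron(b, a)
separable = np.allclose(factor @ factor.T, np.kron(sigma_c, sigma_r))
```

Under this construction, a diagonal `sigma_r` corresponds to uncorrelated rows, which is the kind of structure a test for row dependence would take as its baseline.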
Rejecting the null hypothesis with this test raises an inference problem: how does one account for the row and column correlation that is evident in the data? The second part of this talk provides a framework for estimating the separable covariance structure from a single observation of a matrix-variate normal distribution. We first describe covariance estimators in the known-mean case, concentrating on two classes: maximum likelihood estimators and maximum penalized likelihood estimators. We then extend these results to the case of an unknown mean. For the unpenalized covariance estimators, we present a one-step feasible generalized least squares (GLS) approach, because the likelihood is unbounded when the mean parameters are estimated jointly with full row and column covariance matrices. For the penalized methods, we propose an iterative estimation procedure. We discuss theoretical guarantees for the convergence of the optimization procedure and for the unbiasedness of the estimates of the mean parameters.
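For reference, the known-mean case can be written out explicitly. Assuming a mean of zero and a single m × m observation Y with row covariance Σ_r and column covariance Σ_c (notation introduced here for illustration, not taken from the talk), the standard matrix-variate normal log-likelihood is:

```latex
\ell(\Sigma_r, \Sigma_c \mid Y)
  = -\frac{m^2}{2}\log(2\pi)
    - \frac{m}{2}\log\lvert\Sigma_r\rvert
    - \frac{m}{2}\log\lvert\Sigma_c\rvert
    - \frac{1}{2}\operatorname{tr}\!\left(\Sigma_r^{-1}\, Y\, \Sigma_c^{-1}\, Y^{\top}\right)
```

A penalized estimator would maximize this expression minus a penalty on the covariance matrices; the abstract does not specify the penalty, so none is shown.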