Jul 3

9:30 am

## Modeling Heterogeneity Within and Between Arrays

### Bailey Fosdick

General Exam

University of Washington - Statistics

Array-valued data arise in many of the social and biological sciences. In this talk we address two statistical problems concerning these data. The first problem is modeling the heterogeneity along the dimensions of an array. Previously developed models are either non-stochastic and difficult to interpret, or require so many parameters that likelihood-based inference is infeasible for some arrays. We propose a model called Kronecker structured factor analysis, in which the covariance matrix for each array dimension is specified to be unstructured, to have factor analytic structure, or to equal the identity. A likelihood ratio testing procedure is provided to guide the choice of covariance structure for each dimension. We demonstrate the use of this model with data from the Human Mortality Database.
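As a rough illustration of what a Kronecker structured covariance looks like, the sketch below builds the covariance of a vectorized three-way array from one covariance matrix per dimension. It is not the speaker's implementation: the helper names (`factor_analytic_cov`, `kronecker_cov`), the array dimensions, the single-factor choice, and the random values are all assumptions made for illustration, and the form "loadings times loadings transposed plus a diagonal matrix" is the standard factor analytic covariance rather than a detail taken from the talk.

```python
import numpy as np


def factor_analytic_cov(loadings, uniquenesses):
    """Factor analytic covariance: Lambda Lambda^T + diag(psi) (hypothetical helper)."""
    return loadings @ loadings.T + np.diag(uniquenesses)


def kronecker_cov(mode_covs):
    """Covariance of vec(Y), first index varying fastest: Sigma_K kron ... kron Sigma_1."""
    total = np.ones((1, 1))
    for cov in mode_covs:
        # Each new dimension's covariance goes on the left of the product.
        total = np.kron(cov, total)
    return total


rng = np.random.default_rng(0)

# Illustrative 4 x 3 x 2 array, one covariance matrix per dimension.
sigma1 = factor_analytic_cov(rng.normal(size=(4, 1)), np.ones(4))  # one-factor structure
a = rng.normal(size=(3, 3))
sigma2 = a @ a.T + np.eye(3)  # unstructured (any positive definite matrix)
sigma3 = np.eye(2)            # identity: no dependence along this dimension

full_cov = kronecker_cov([sigma1, sigma2, sigma3])
print(full_cov.shape)
```

The three per-dimension matrices here are 4 x 4, 3 x 3, and 2 x 2, while the covariance they induce on the vectorized array is 24 x 24, which is where the reduction in parameters comes from.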

The second statistical problem we consider is how to relate matrices that share repeated dimensions, or index sets. We discuss this problem in the context of relating a social network to nodal attributes. We formulate a test of association between the network and the nodal attributes and propose a joint model, which extends the model in Hoff (2009) to incorporate joint dependence between the network and node variables. Invariances present in the model are addressed via a simplified parameterization of the network effects. We apply this model to high school friendship networks from the National Longitudinal Study of Adolescent Health and investigate the relationship between the network and behavioral characteristics such as drinking, smoking, and grade point average.