University of Washington - Department of Statistics
Modern datasets are often in the form of matrices or arrays, potentially having correlations along each set of data indices. For example, researchers often gather relational data measured on pairs of units, where the population of units may consist of people, genes, websites or some other set of objects. Multivariate relational data include multiple relational measurements on the same set of units, possibly gathered under different conditions or at different time points. Such data can be represented as a multiway array, or tensor.
The main features of such datasets are often identified via reduced-rank array decompositions. In this talk, I briefly review two popular decompositions for arrays (the PARAFAC and Tucker decompositions), and show how their forms can be incorporated into statistical models. The model-based versions extend the scope of reduced-rank methods to accommodate a variety of data types, and can also be related to a type of array normal distribution, which generalizes the class of so-called "matrix normal" distributions.
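As a rough illustration of the two decompositions named above, the sketch below builds a three-way array from PARAFAC factors and from Tucker factors using NumPy. The dimensions, the factor names `A`, `B`, `C`, the core array `G`, and the helper function names are all assumptions chosen for illustration; they are not the speaker's notation, and no model fitting is attempted here.

```python
import numpy as np

def parafac_reconstruct(A, B, C):
    # Rank-R PARAFAC form: X[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def tucker_reconstruct(G, A, B, C):
    # Tucker form: X[i, j, k] = sum_{p, q, r} G[p, q, r] * A[i, p] * B[j, q] * C[k, r]
    return np.einsum("pqr,ip,jq,kr->ijk", G, A, B, C)

# Hypothetical sizes: a 4 x 5 x 3 array with R = 2 components.
rng = np.random.default_rng(0)
n1, n2, n3, R = 4, 5, 3, 2
A = rng.normal(size=(n1, R))
B = rng.normal(size=(n2, R))
C = rng.normal(size=(n3, R))

X_parafac = parafac_reconstruct(A, B, C)

# PARAFAC is the special case of Tucker whose core array is superdiagonal.
G = np.zeros((R, R, R))
idx = np.arange(R)
G[idx, idx, idx] = 1.0
X_tucker = tucker_reconstruct(G, A, B, C)

# Unfolding along the first index gives an n1 x (n2 * n3) matrix of rank at most R.
mode1_rank = np.linalg.matrix_rank(X_parafac.reshape(n1, n2 * n3))
```

Here the Tucker reconstruction with a superdiagonal core reproduces the PARAFAC array exactly, which is one way to see that the Tucker form is the more general of the two. A general Tucker core may be dense and may have a different size along each index.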