Seminar Details


Jun 27

1:30 pm

Bayes and Empirical Bayes Methods for Social Network Analysis

Yanjun He

General Exam

Advisor: Peter Hoff


Modern studies of social networks often involve longitudinal measurements. Several methods for analyzing such data have been developed, including stochastic actor-oriented models and temporal exponential random graph models (TERGMs). A longitudinal network dataset is often accompanied by longitudinal node-level attributes, and it is often of interest to infer how the network and the nodal attributes influence each other over time. We develop a class of coevolution models for network and nodal attribute data that are based on simple and scalable linear regression and latent factor models. This framework is flexible and extensible, and can be modified to accommodate continuous and ordinal measurements for both the individual-level and network-level data. The parameters in such a model can describe three important features of such datasets: autocorrelation, homophily, and contagion.
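As a rough illustration of what autocorrelation, homophily, and contagion can mean in a linear coevolution setting, the following toy sketch simulates a continuous network and a continuous nodal attribute and recovers the coefficients by ordinary least squares. Everything in it is an assumption made for illustration: the specific regression equations, the names `simulate` and `fit`, and the parameter values are hypothetical, the latent factors are omitted, and least squares stands in for the estimation procedure of the talk.

```python
import numpy as np

def simulate(n=30, T=20, a=0.5, h=0.4, b=0.5, c=0.3, sd=0.5, seed=0):
    """Toy coevolution of a network Y[t] (n x n) and an attribute X[t] (n).

    a: network autocorrelation, h: homophily (ties grow between similar nodes),
    b: attribute autocorrelation, c: contagion (attribute follows tied peers).
    All of these are illustrative assumptions, not the talk's model.
    """
    rng = np.random.default_rng(seed)
    off = ~np.eye(n, dtype=bool)              # mask that excludes self-ties
    X = np.zeros((T, n))
    Y = np.zeros((T, n, n))
    X[0] = rng.normal(size=n)
    Y[0] = rng.normal(size=(n, n)) * off
    for t in range(1, T):
        closeness = -np.abs(X[t - 1][:, None] - X[t - 1][None, :])
        noise = rng.normal(scale=sd, size=(n, n))
        Y[t] = (a * Y[t - 1] + h * closeness + noise) * off
        peer = Y[t - 1] @ X[t - 1] / (n - 1)  # tie-weighted average of peers
        X[t] = b * X[t - 1] + c * peer + rng.normal(scale=sd, size=n)
    return X, Y

def fit(X, Y):
    """Least-squares estimates of (intercept, a, h) and (intercept, b, c)."""
    T, n = X.shape
    off = ~np.eye(n, dtype=bool)
    net_rows, net_y, att_rows, att_y = [], [], [], []
    for t in range(1, T):
        closeness = -np.abs(X[t - 1][:, None] - X[t - 1][None, :])
        net_rows.append(np.column_stack(
            [np.ones(off.sum()), Y[t - 1][off], closeness[off]]))
        net_y.append(Y[t][off])
        peer = Y[t - 1] @ X[t - 1] / (n - 1)
        att_rows.append(np.column_stack([np.ones(n), X[t - 1], peer]))
        att_y.append(X[t])
    net_coef = np.linalg.lstsq(
        np.vstack(net_rows), np.concatenate(net_y), rcond=None)[0]
    att_coef = np.linalg.lstsq(
        np.vstack(att_rows), np.concatenate(att_y), rcond=None)[0]
    return net_coef, att_coef

if __name__ == "__main__":
    X, Y = simulate()
    net_coef, att_coef = fit(X, Y)
    print("network   (intercept, autocorrelation, homophily):", net_coef)
    print("attribute (intercept, autocorrelation, contagion):", att_coef)
```

In this toy setup the network equation has many more observations (one per ordered pair per time point) than the attribute equation (one per node per time point), so the contagion coefficient is typically estimated far less precisely than the homophily coefficient.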

Gibbs sampling is used to estimate the parameters and latent factors in the coevolution model. Rather than fixing the values of the parameters in the prior, one can also use hierarchical Bayesian models or empirical Bayes methods. The latter require estimating those parameters from the data. In probit models, this estimation is challenging because the marginal likelihood contains an intractable integral. Computing the maximum likelihood estimators involves calculating the expectation of a truncated multivariate normal distribution, which does not have a closed form in general and is computationally impractical when the dimension is large. The Laplace approximation can be a useful tool for approximating the integral, except in cases such as the social relations model. As an alternative, we propose a composite marginal likelihood estimation method that works with a pseudo-likelihood based on blocks of the data. This approach provides parameter estimates and standard errors at a computational cost that does not grow with the network size.
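The idea of replacing an intractable likelihood with a pseudo-likelihood built from small blocks can be sketched on a toy problem. The example below is not the method of the talk: it assumes a zero-mean equicorrelated probit model, uses pairs as the blocks, and relies on the standard bivariate normal orthant identity P(Z1 > 0, Z2 > 0) = 1/4 + arcsin(rho)/(2*pi). The function names are hypothetical. The full likelihood would require a d-dimensional orthant probability, while the pairwise version needs only this bivariate one.

```python
import numpy as np

def simulate_probit(n_samples, d, rho, seed=0):
    """Binary vectors y = 1(z > 0), with z zero-mean normal and equicorrelation rho."""
    rng = np.random.default_rng(seed)
    shared = rng.normal(size=(n_samples, 1))   # common factor induces correlation
    noise = rng.normal(size=(n_samples, d))
    z = np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * noise
    return (z > 0).astype(int)

def pair_counts(y):
    """Count concordant (equal) and discordant (unequal) pairs over all samples."""
    d = y.shape[1]
    s = y.sum(axis=1)                          # number of ones in each sample
    n_conc = int(np.sum(s * (s - 1) // 2 + (d - s) * (d - s - 1) // 2))
    n_disc = int(np.sum(s * (d - s)))
    return n_conc, n_disc

def pairwise_loglik(rho, n_conc, n_disc):
    """Pairwise composite log-likelihood under the zero-mean bivariate probit."""
    base = np.arcsin(rho) / (2.0 * np.pi)
    # Each concordant outcome has probability 1/4 + base,
    # each discordant outcome has probability 1/4 - base.
    return n_conc * np.log(0.25 + base) + n_disc * np.log(0.25 - base)

def fit_rho(y, grid=None):
    """Maximize the pairwise composite log-likelihood over a grid of rho values."""
    if grid is None:
        grid = np.linspace(0.0, 0.99, 100)
    n_conc, n_disc = pair_counts(y)
    values = pairwise_loglik(grid, n_conc, n_disc)
    return float(grid[np.argmax(values)])

if __name__ == "__main__":
    y = simulate_probit(n_samples=2000, d=10, rho=0.5)
    print("estimated rho:", fit_rho(y))
```

In this special case the maximizer is also available in closed form as sin(pi * (p - 1/2)), where p is the observed fraction of concordant pairs, which gives a quick check on the grid search. Standard errors for composite likelihood estimators generally need a sandwich-type correction because the blocks are not independent; that step is omitted here.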