Markov networks, especially Gaussian graphical models and Ising models, have become a popular tool for studying relationships in high-dimensional data. Variables in many data sets, however, consist of count data that may not be well modeled by Gaussian or multinomial distributions. Examples include high-throughput genomic sequencing data, user-ratings data, spatial incidence data, climate studies, and site visits. Existing methods for Poisson graphical models include the Poisson Markov Random Field (MRF) of Besag (1974), which places severe restrictions on the types of dependencies, permitting only negative correlations between variables.
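For reference, the pairwise Poisson MRF is commonly written in the form below. The notation is an assumption made here for illustration and does not come from this abstract: node parameters \theta_s, edge parameters \theta_{st} over an edge set E, and log-normalizer A(\theta).

```latex
\[
P(x) \;=\; \exp\Big\{ \sum_{s=1}^{p} \theta_s x_s
      \;+\; \sum_{(s,t)\in E} \theta_{st}\, x_s x_t
      \;-\; \sum_{s=1}^{p} \log(x_s!) \;-\; A(\theta) \Big\},
\qquad x \in \{0, 1, 2, \dots\}^p .
\]
% Each node-conditional is Poisson:
\[
x_s \mid x_{\setminus s} \;\sim\; \mathrm{Poisson}\Big( \exp\Big\{ \theta_s + \sum_{t:(s,t)\in E} \theta_{st}\, x_t \Big\} \Big).
\]
```

Because x_s x_t grows quadratically on the unbounded domain while \log(x_s!) grows only like x_s \log x_s, the sum defining A(\theta) is finite only when every \theta_{st} \le 0. This is the source of the restriction to negative dependencies.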
By restricting the domain of the variables in this joint density, we introduce a Winsorized Poisson MRF that permits a rich dependence structure and whose pairwise conditional densities closely approximate the Poisson distribution. An important consequence of our model is that it gives an analytical form for a multivariate Poisson density with rich dependencies; previous multivariate densities permitted only positive or only negative dependencies. We develop neighborhood selection algorithms to estimate network structure from high-dimensional count data by fitting graphical models based on Besag's MRF, our Winsorized Poisson MRF, and a local approximation to the Winsorized Poisson MRF. We also provide theoretical results illustrating the conditions under which these algorithms recover the network structure with high probability. Through simulations and an application to breast cancer microRNAs measured by next-generation sequencing, we demonstrate the advantages of our methods for network recovery from count data. This is joint work with Zhandong Liu, Pradeep Ravikumar, and Eunho Yang.
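As a rough illustration of neighborhood selection for count data, the sketch below regresses each variable on all others with an L1-penalized Poisson regression and joins the estimated neighborhoods into a graph. It is a generic sketch under stated assumptions, not the Winsorized estimator described in this abstract: the function names, the proximal-gradient solver, the standardization of predictors, the penalty `lam`, and the OR rule for symmetrizing are all choices made here for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def poisson_nll(Z, y, w, b):
    """Average Poisson negative log-likelihood (log link), up to a constant."""
    eta = Z @ w + b
    with np.errstate(over="ignore"):
        return np.mean(np.exp(eta) - y * eta)

def l1_poisson_regression(Z, y, lam, n_iter=300):
    """L1-penalized Poisson regression via proximal gradient with backtracking.

    Hypothetical helper: the intercept b is unpenalized, w is penalized by lam.
    """
    n, p = Z.shape
    w = np.zeros(p)
    b = np.log(y.mean() + 1e-8)
    step = 1.0
    for _ in range(n_iter):
        resid = np.exp(Z @ w + b) - y
        grad_w = Z.T @ resid / n
        grad_b = resid.mean()
        f_old = poisson_nll(Z, y, w, b)
        while True:
            w_new = soft_threshold(w - step * grad_w, step * lam)
            b_new = b - step * grad_b
            dw, db = w_new - w, b_new - b
            # Quadratic upper bound used by the backtracking test.
            bound = f_old + grad_w @ dw + grad_b * db + (dw @ dw + db * db) / (2 * step)
            if poisson_nll(Z, y, w_new, b_new) <= bound + 1e-12:
                break
            step *= 0.5  # shrink the step until the bound holds
        w, b = w_new, b_new
    return w

def neighborhood_selection(X, lam, tol=1e-6):
    """Estimate an undirected graph from an (n, p) count matrix X.

    Returns a boolean (p, p) adjacency matrix; an edge is kept if either
    node-wise regression selects it (OR rule, an assumption of this sketch).
    """
    n, p = X.shape
    coef = np.zeros((p, p))
    for s in range(p):
        others = [t for t in range(p) if t != s]
        Z = X[:, others].astype(float)
        Z = (Z - Z.mean(axis=0)) / (Z.std(axis=0) + 1e-12)  # standardize predictors
        coef[s, others] = l1_poisson_regression(Z, X[:, s].astype(float), lam)
    support = np.abs(coef) > tol
    return support | support.T
```

Larger values of `lam` give sparser graphs. In practice the penalty would be chosen by a data-driven criterion such as stability selection or cross-validation, which this sketch does not cover.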