Three Principles of Data Science: Predictability, Computability, and Stability

Speaker: Bin Yu

In this talk, I'd like to discuss the importance of, and connections among, the three principles of data science named in the title, and the PCS workflow that is built on them. The principles will be demonstrated in the context of two collaborative projects in neuroscience and genomics aimed at interpretable data results and testable hypothesis generation. If time allows, I will present the proposed PCS inference, which includes perturbation intervals and PCS hypothesis testing. PCS inference uses prediction screening and takes into account both data and model perturbations.

Room: 340

Instrumental Variable Learning of Marginal Structural Models

Speaker: Eric J. Tchetgen Tchetgen

In a seminal paper, Robins (1998) introduced marginal structural models (MSMs), a general class of counterfactual models for the joint effects of time-varying treatment regimes in complex longitudinal studies subject to time-varying confounding. He established identification of MSM parameters under a sequential randomization assumption (SRA), which rules out unmeasured confounding of treatment assignment over time.

Room: 340

[CANCELLED] Statistical Methods for Two Problems in Biology

Speaker: Daniela Witten

Note 2/7/2018: We are canceling this seminar as a precaution in anticipation of the expected winter storm.

As the pace and scale of data collection continue to increase across all areas of biology, there is a growing need for effective and principled statistical methods for the analysis of the resulting data. In this talk, I'll describe two ongoing projects to help fill this gap.

Room: 337

A New Standard for the Analysis and Design of Replication Studies

Speaker: Leonhard Held

A new standard is proposed for the evidential assessment of replication studies. The approach combines a specific reverse-Bayes technique with prior-predictive tail probabilities to define replication success. The method gives rise to a quantitative measure of replication success, called the sceptical p-value. The sceptical p-value integrates the traditional significance of both the original and replication studies with a comparison of the respective effect sizes.

Room: 340

How Statistics Took Me to the Aleutian Islands

Speaker: Joel Howard Reynolds

Did you know that your skills in statistics can be applied to ensure natural resources, such as fish, wildlife and even ecosystems, remain resilient into the future? That your love of algebra can take you to wild, remote, and amazing places? That there are careers where you get to collaborate with a wide variety of dedicated scientists working to better understand the world, how it is changing, and what it will be like in the future?

Room: 340

Bayesian Approaches to Dynamic Model Selection

Speaker: Michele Guindani

In many applications, investigators monitor processes that vary in space and time, with the goal of identifying temporally persistent and spatially localized departures from a baseline or "normal" behavior. In this talk, I will first discuss a principled Bayesian approach for estimating time-varying functional connectivity networks from brain fMRI data. Dynamic functional connectivity, i.e., the study of how interactions among brain regions change dynamically over the course of an fMRI experiment, has recently received wide interest in the neuroimaging literature.

Room: 332

Spectral Gap in Random Bipartite Biregular Graphs and Applications

Speaker: Ioana Dumitriu

The asymptotics of the second-largest eigenvalue in random regular graphs (also referred to as the "Alon conjecture") were computed by Joel Friedman in his celebrated 2004 paper. Recently, a new proof of this result has been given by Charles Bordenave, using the non-backtracking operator and the Ihara-Bass formula. In the same spirit, we have been able to translate Bordenave's ideas to bipartite biregular graphs in order to calculate the asymptotic value of the second-largest pair of eigenvalues, and have obtained a similar spectral gap result.

Room: 332

Fast Inference for Spatial Generalized Linear Mixed Models

Speaker: Murali Haran

Non-Gaussian spatial data arise in a number of disciplines. Examples include spatial data on disease incidence (counts) and satellite images of ice sheets (presence-absence). Spatial generalized linear mixed models (SGLMMs), which build on latent Gaussian processes or Markov random fields, are convenient and flexible models for such data and are used widely in mainstream statistics and other disciplines. For high-dimensional data, SGLMMs present significant computational challenges due to the large number of dependent spatial random effects.

Room: 332