Grant-Supported Research Projects in the Department


A full list of open funding opportunities (TA positions, fellowships, etc.) can be found on the department's funding page.


Available Research Assistantships in the Department (alphabetical by first PI)

Current Research Assistantships in the Department (alphabetical by first PI)




Descriptions of Research Assistantships

Spectral analysis of link data and foundations of clustering
This project concerns embedding and semisupervised learning in directed graphs, that is, graphs with asymmetric relationships, and aims to understand the fundamental large-scale topological properties of large directed graphs. We will model the large-scale properties of a directed graph by the (spectral) properties of a random walk on the graph. This work will produce algorithms for mapping the nodes of a directed graph into a low-dimensional space (embedding the graph) in a way that preserves and displays its large-scale structure. We will also work on algorithms for semisupervised regression and classification for mixed data, i.e., data points for which we have both measured individual attributes and graph connectivity observations, which in turn can be directed or undirected. These algorithms will be applicable to the analysis of social networks, computer and communication networks, and biological interaction networks. The RA must be proficient with Matlab and have a strong background in linear algebra and mathematics in general. Knowledge of Markov random fields, stochastic processes, and diffusion processes is highly desirable but not required. Funding is available immediately.

PI: Marina Meila.
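
As a rough illustration of the random-walk view in the description above, the sketch below builds the transition matrix of a random walk on a small directed graph and finds its stationary distribution by power iteration. It is a minimal sketch, not the project's method: the example graph, the function names, and the teleportation term (added here only so that the walk has a unique stationary distribution) are all assumptions made for illustration.

```python
def transition_matrix(adj, teleport=0.15):
    """Row-stochastic transition matrix of a random walk on a directed graph.

    adj[i][j] is the weight of the edge i -> j. With probability `teleport`
    the walker jumps to a uniformly chosen node (an assumption of this
    sketch, used to guarantee a unique stationary distribution).
    """
    n = len(adj)
    P = []
    for row in adj:
        out_weight = sum(row)
        if out_weight == 0:
            walk = [1.0 / n] * n  # node with no out-edges: jump uniformly
        else:
            walk = [w / out_weight for w in row]
        P.append([(1.0 - teleport) * w + teleport / n for w in walk])
    return P


def stationary_distribution(P, max_iter=1000, tol=1e-13):
    """Power iteration for the row vector pi satisfying pi P = pi."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


# Hypothetical 4-node directed graph: 0->1, 1->2, 2->0, 2->3, 3->0
adj = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
]
P = transition_matrix(adj)
pi = stationary_distribution(P)
print([round(x, 4) for x in pi])
```

The stationary distribution is only the leading spectral quantity of the walk; an embedding would use further eigenvectors of an operator derived from P, which this sketch does not attempt.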

Gravimetric inversion using statistical learning techniques
This project pioneers a new approach to the problem of detecting underground features (like caves) from gravity measurements. The RA will have the opportunity to learn about the technological advances in this field while contributing to the development of a new method for detecting such features. The new methodology combines signal processing, optimization and machine learning techniques. It is based on the recent breakthroughs known as compressed sensing. The RA will develop the optimization and wavelet transform core of the algorithm. This will be done partly by using existing software packages and partly by writing new code. Therefore, the RA must have a good working knowledge of wavelet techniques and be familiar with optimization in general. In addition, the candidate must be an experienced programmer in Matlab and in C/C++ or Java. Funding is available immediately, estimated for 2-3 quarters, but extensions are possible. The project has the potential to generate a PhD thesis.

PI: Marina Meila.

Proper Scoring Rules, Calibration and Sharpness: Assessing Predictions for an Uncertain World
One of the major purposes of statistical analysis is to make forecasts for the future, and to provide suitable measures of the uncertainty associated with them. Consequently, forecasts ought to be probabilistic in nature, taking the form of probability distributions over future quantities or events. With the advent of probabilistic forecasting in a wealth of meteorological, climatological, economic and financial applications, the need for principled techniques for the comparison and evaluation of distributional forecasters is becoming critical. The project addresses this challenge by developing tools for the assessment of calibration and sharpness, and furthering insight into the construction and characterization, the properties, and the computation of proper scoring rules with desirable properties, such as kernel scores, local scores and skill scores.

PI: Tilmann Gneiting. Current RA: Roopesh Ranjan.
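
To make the notion of a proper scoring rule in the description above concrete, the sketch below uses two standard rules for a binary event, the Brier score and the logarithmic score, and checks numerically that the expected score is minimized when the forecast probability equals the true probability. This is a minimal sketch for illustration only: the function names, the true probability of 0.3, and the search grid are assumptions, and the kernel, local and skill scores named in the description are not implemented here.

```python
import math


def brier_score(p, outcome):
    """Brier score for a binary event: squared error of the forecast probability."""
    return (p - outcome) ** 2


def log_score(p, outcome):
    """Logarithmic score: negative log of the probability given to what happened."""
    return -math.log(p if outcome == 1 else 1.0 - p)


def expected_score(score, forecast, true_prob):
    """Expected score of `forecast` when the event occurs with probability `true_prob`."""
    return (true_prob * score(forecast, 1)
            + (1.0 - true_prob) * score(forecast, 0))


true_prob = 0.3                             # assumed true event probability
grid = [i / 100 for i in range(1, 100)]     # candidate forecasts 0.01 .. 0.99

# For a proper scoring rule (negatively oriented, lower is better), the
# honest forecast minimizes the expected score.
best_brier = min(grid, key=lambda q: expected_score(brier_score, q, true_prob))
best_log = min(grid, key=lambda q: expected_score(log_score, q, true_prob))
print(best_brier, best_log)
```

Both rules here are negatively oriented, so smaller values indicate better forecasts; a forecaster who hedges away from the true probability pays for it in expectation under either rule.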

Stochastic Modeling of Hematopoiesis (NIH)
We apply recent Markov chain Monte Carlo techniques to the study of hidden population processes of birth-and-death and compartment model types. The goal is to estimate parameters in a model of blood production in animals and humans.

PIs: Jan Abkowitz, Hematology and Peter Guttorp, Statistics. Current RA: Youyi Fong, Biostatistics.

Bayesian Methods for Multivariate Analysis
The applicant for this position should have a familiarity with Bayesian methods and be interested in working in one of the following areas:

  1. model and variable selection techniques for discrete graphical models;
  2. graphical models on copulas;
  3. the development of inference techniques for combining large, sparse heterogeneous datasets (e.g., genotype and expression data).

There is a theoretical as well as an applied component (hands-on data analysis and some software development) associated with these topics. The RA support is initially offered for a quarter, but it can be extended throughout the duration of your PhD depending on your interest and performance. All three topics could evolve into PhD theses.

PI: Adrian Dobra. Current RAs:

Improving cognitive outcome precision and responsiveness with modern psychometrics
The ultimate goal of this research project is to develop and implement a statistically and scientifically sound strategy for measurement of cognitive functioning in the elderly using advanced psychometrics. Although modern psychometric methods have been used for a number of years in other fields, their applications to analyses of cognitive data are scarce. Our experience suggests that improvements in cognitive measurement gained by using advanced psychometric methods compared with standard measurement techniques can be clinically important in subjects with early dementia and Alzheimer's disease.

This five-year project brings together Alzheimer's disease researchers, statisticians, biostatisticians and psychometricians. The team will work collaboratively to develop and evaluate novel methods for statistical analysis of test data in the areas of general cognition, memory, and executive functioning, focusing on optimizing point estimates (outcome precision) and estimates of change (responsiveness). Researchers involved in this project will have access to rich data sets on cognitive functioning from two longitudinal studies, as well as to neuroimaging (MRI) and biomarkers data on the same subjects.

The RA position is available starting January 2008. The initial contract is for one year and is potentially renewable. Course background in Bayesian methods, MCMC, and discrete multivariate and longitudinal data analysis is desirable. To apply, please send your curriculum vitae, current transcript, and names and contact information for two references to elena@stat.washington.edu no later than November 10, 2007.

Contact: Elena Erosheva, Statistics. PI: Paul Crane, Medicine. Current RA: Jonathan Gruhl.

CSSS Consulting (UW-UIF)
The RA in this position provides statistical consulting to researchers in social sciences such as economics, sociology, nursing, and psychology. Statistical consulting typically includes assistance with the choice and application of statistical methods, assistance with study planning and design, advice on data visualization and presentation, and development of specialized statistical methods. The successful applicant will benefit from having completed the 570s sequence and the Consulting courses in the Statistics Department.

PI: Elena Erosheva. Current RA: Hilary Lyons.

Survey measurement of chronic disability
The project will employ statistical modeling approaches to examine the extent to which aspects of the National Long Term Care Survey (NLTCS) design affect the measurement of chronic disability. The NLTCS, one of the best designed surveys for analyzing U.S. national disability trends, has been widely used to determine trends in disability. Yet, the impact of survey design on the chronic disability measures produced by the survey is unknown.

The RA work will involve discrete data analysis and longitudinal and latent variable modeling. The position is for one year and is potentially renewable.

PI: Elena Erosheva. Current RA: Toby White.

Combining individual-level survey and population-level data
The family and socio-economic circumstances of children's parents at birth and during the childrearing years are fundamental determinants of children's health and well-being. The project is developing statistical methods for combining surveys and population data collections (especially of births and marital and non-marital unions) for the improved estimation of these birth and childhood circumstances. Specific aims are to (1) develop and test statistical methods to combine multiple sources of survey data and population data; (2) improve estimates of the parameters of fertility and marital and non-marital union regression equations, and of simulated life-course fertility and union duration measures; and (3) expand and disseminate the statistical capabilities to the demographic community. It has been shown that combining population and survey data in the estimation allows for more modeling detail than when using population data alone, and more precise estimates than when using survey data alone. Further statistical development will allow for survey data to be combined from more than one data set, thereby obtaining some of the same benefits as from combining survey and population data. Funded by the NIH.

PI: Mark S. Handcock. Current RAs: Ryan Admiraal.

Modeling the Complex Dynamics of Urban Landscape Patterns
Urban development in the United States is profoundly changing landscape pattern and biodiversity, and in turn it is being affected by these changes. Yet we are just beginning to understand the interactions between patterns and processes in human-dominated landscapes. One of the least understood aspects of urban landscape dynamics is the way in which local interactions of human and biophysical processes affect the landscape patterns of metropolitan regions. In this project we work with urban planners, landscape ecologists, and wildlife biologists to build models based on two longitudinal land cover and land use data sets developed for the Seattle and Phoenix Metropolitan Areas. Funded by the NSF Biocomplexity in the Environment Program.

PI: Mark S. Handcock. Current RAs: (one from Urban Planning)

Longitudinal Relational Data
The position is for a year and is potentially renewable. The focus of the RA position is to develop generative models for dynamic relational (network) data, and to test and apply these models to international relations data. However, the RA position is flexible enough to fund a wide variety of research, as long as it is related to longitudinal multivariate analysis.

A suitable candidate for this position would have a research background or (potentially concurrent) course experience that includes some combination of multivariate analysis, MCMC methods, time series and/or machine learning.

PI: Peter Hoff. Current RAs: Xiaoyue Niu.

Spatial modeling of Asian conflict
We are looking for a graduate student to work on a statistical project to predict, over a six-month horizon, a subset of the following kinds of events in a set of 29 Asian countries: riots and rebellions, regime changes, major economic collapse, violent anti-state insurgencies, major acts of government repression, civil wars, and international crises. Most of the data are provided by the sponsor (DARPA).

Our task is to apply a spatial approach to modeling these data, with an eye toward a broader set of dependencies among them. This is a 15-month project, to begin immediately. I am looking for someone with a background in spatial statistics, and basic statistical programming, to help me in this task over the next months. This could be in the form of a regular RA position, or as an hourly position.

PI: Michael D. Ward (mdw at u dot washington dot edu), Department of Political Science. Additional contact: Peter Hoff (pdhoff at u dot washington dot edu), Department of Statistics. One open RA position.

Spatial Statistics for the Verification of Weather Forecasts.
Beginning September 1, 2007, one RA position will be available for 12 months. Although the position is at the Applied Physics Laboratory, the work will be done with Caren Marzban who is also at the Statistics Dept.

These days, numerical models based on basic physics principles form a foundation for making predictions in a wide range of fields. For example, weather models produce predictions which are often presented as images of some quantity across some spatial domain. Many of these quantities can also be observed. The main question in this project is "How good are these predictions?" From an image analysis point of view, this is equivalent to asking how similar the prediction image is to the observed one. Our team has already devised methods for addressing that question employing cluster analysis. The next step in that line of research is to develop similar methods that involve other techniques from spatial statistics, specifically the analysis of variograms. Programming knowledge and experience in R are required.

PI: Caren Marzban, Statistics. Current RAs: Dustin Lennon.

Intransitive classification and choice (NSF)
The goal of this project is to develop a framework for intransitive pattern classification and models of intransitive choice. Intransitivity can arise from various forms of classifiers. This includes the augmenting of log-likelihood ratios with correction terms and collections of binary classifiers (SVMs/kernel machines, binary neural networks, etc.) when used for multi-class classifiers. Intransitivity can also arise in individual and group choice (e.g., elections, tournament-style competitions). The project is developing methods to better explain intransitivity in these classifiers and to model preference relationships in social choice. Questions being investigated by this project include: (1) why/when intransitivity occurs; (2) why/how it helps classification; (3) how to introduce intransitivity in classifier systems; (4) whether intransitivity should itself be a goal, or rather whether to treat it as an artifact of imperfection and an indication of incertitude; (5) methods to detect intransitivity; (6) how intransitivity can be used to reduce errors; (7) how to model intransitivity; (8) the relationship between intransitivity in machine learning and in sociology, psychology, voting theory, political science, operations research, mathematics, economics, and philosophy; and (9) if transitive explanations can better explain natural organisms. In addition to establishing new cross-field scientific connections, this project has a broader impact through integration of its results into new seminars, tutorial articles on intransitive decision making, and new freely-available software.

Current research focuses on creating new statistical models for preference data, together with efficient algorithms for estimating them and performing inference. We are interested in machine learning with preference data, in graphical model-like structures over the space of preferences, and in Bayesian analysis of these models. Students who are strong computationally or mathematically will be especially suited for this research.

PIs: Marina Meila, Statistics and Jeff Bilmes, Electrical Engineering. Current RAs: Bhushan Mandhani, Computer Science. One open RA position for a Statistics student.

Statistical Models for Networks with Application to Disease Transmission (Funded by the NIH)
Over the past two decades, the epidemic of HIV has challenged the epidemiological community to rethink its paradigms for understanding the risk of disease transmission, both at the individual level and at the level of population transmission dynamics. One of the hallmarks of this research effort has been the rapid convergence of opinion that the concept of a transmission network must play an important role in the development of any new paradigm. In its simplest form, the network perspective recognizes that people acquire infections from their partners. Thus, it is not only a person's own behavior that puts them at risk, but the behavior of their partners and, more generally, of the persons to whom they are indirectly connected by virtue of being connected to their partner. In this project we are developing statistical random graph models that represent sexual and drug use networks. This is a large collaborative project with many participants within UW and worldwide. Funded by the NIH.

PIs: Martina Morris and Mark S. Handcock. Current RAs: Pavel Krivitsky.

Inference for Networks with Sampled or Missing Data (Funded by the NIH)
This project is developing statistical theory to guide the sampling of data from networks. Network sampling involves two units: nodes and links. While this can be thought of as a multi-level sampling design, the two levels are not nested in the traditional manner. We make systematic use of current network data to examine the information loss under alternative sampling strategies, and to develop the statistical theory for network sampling. In particular, we are developing the statistical theory and methods for network estimation based on partial network sampling designs. The focus is on the structure of the models, and the design and analysis of surveys to collect network data. The latter is based on issues of inference in the presence of missing data. We use a hybrid design and model-based approach. Funded by the NIH.

PIs: Martina Morris and Mark S. Handcock, Statistics. Current RAs: Krista Gile.

Wavelet methods in atmospheric sciences (NSF, NRCSE/EPA, Air Force)
Using recent tools from wavelet theory, methods for estimating temporal and spatial trends are being developed. Applications include atmospheric and environmental sciences.

PIs: Donald B. Percival, Applied Physics Laboratory, Peter Guttorp, Statistics, and Chris Bretherton, Atmospheric Sciences. Current RA:

Model-Based Clustering Methods for Medical Image Segmentation and Gene Expression Data (NIH)
Many problems in the health and medical sciences have at their core the task of finding cohesive groups of observations in data. Examples include a group of voxels in an MRI image that correspond to a tumor, genes whose expression levels track one another in a series of experiments, and tissues whose gene expression patterns are similar. The statistical method for solving this problem is cluster analysis. Most cluster analysis methods used in practice are ad hoc, but more recently the development of more formal model-based clustering methods has provided a principled framework for answering central questions such as: How many clusters are there? Which clustering method should be used? How should one deal with outliers?

The main goal of the proposed research is to develop new model-based methods for clustering problems arising in medical image segmentation and gene expression data. The major thrusts of the research will be:

  1. Model-based clustering with large numbers of variables using several alternative approaches:
    1. variable selection;
    2. basis selection;
    3. model-based clustering with dissimilarities or distances rather than observations on individual voxels, genes or samples; and
    4. Bayesian shrinkage.
  2. Automated medical image segmentation for dynamic MRI breast images.
  3. Model-based clustering for gene expression data aimed at finding
    1. groups of genes that function together; and
    2. groups of tissues, samples or experiments that have similar gene expression patterns.

PI: Adrian E. Raftery. Current RAs:

Uncertainty Assessment and Visualization in Weather Prediction and Other Deterministic Models (MURI)
An interdisciplinary team consisting of Tilmann Gneiting, faculty from Atmospheric Sciences (meteorology), Psychology, and the Applied Physics Lab, and Adrian Raftery as PI has just received a $5 million, 5-year grant to develop ways of assessing uncertainty in numerical weather prediction models and other applications of deterministic simulation models. This will build in part on work done by current students and PhD graduates in the department: most recently Sam Bates and, before that, over the past 6 years, David Poole, Chris Volinsky, Jennifer Hoeting and Geof Givens. It involves inference for deterministic simulation models, Bayesian model averaging, spatial statistics, and visualization, as well as intense interaction and collaboration with the substantive scientists.

PIs: Adrian E. Raftery and Tilmann Gneiting. Current RAs: Veronica Berrocal and Larissa Stanberry.

Genetic Epidemiology of Complex Traits (NIH)
Since 1990, we have been developing Markov chain Monte Carlo methods (MCMC) for likelihood analysis of genetic traits observed on pedigrees. We have developed methods to localize genes and to fit models for genetically complex traits. Now, as data at multiple genetic markers across the genome become increasingly available, we develop more computationally and statistically efficient methods for joint analysis of data from dense genome screens on individuals among whom there may be multiple complex relationships. We analyze allelic associations in multilocus haplotypes at the pedigree and population levels, and their impact on linkage detection and fine-scale mapping. Current methods are being extended to handle additional trait measures such as multivariate and ordered categorical phenotypes, to incorporate more complex patterns of censoring, and to allow for missing covariate information. Methods will also be developed to analyze map accuracy, recombination heterogeneity, and genetic interference, and the impact of these factors on the localization of genes contributing to complex traits.

PI: Elizabeth Thompson. Current RAs: Audrey Qiuyan Fu, Yanming Di, Liping Tong (postdoc).

Algorithms for likelihood computations on general pedigrees.
The long-term objective of the research is to develop an efficient, extensible, modular and accessible software toolbox that would facilitate statistical methods for analyzing complex pedigrees. The toolbox will consist of novel algorithms that extend state-of-the-art algorithms from graph theory, statistics, artificial intelligence, and genetics. The specific aim of the grant is to develop an extensible software system for efficiently computing pedigree likelihoods for complex diseases in the presence of multiple DNA markers in fully general pedigrees, taking into account qualitative and quantitative traits and a variety of disease models. This type of computation is crucial in pedigree analysis both in humans (e.g. for discovering disease genes) and in animals (e.g. for controlled breeding of disease-resistant strains). Experience shows that by building on experience gained within the last decade from the study of computational probability, in particular from the theory of probabilistic networks, we can construct a software system whose functionality, speed and extensibility would be unmatched by current linkage software.

This research grant is a subcontract of a 3-year NIH grant to a colleague in the Computer Science Department of the University of California, Irvine. The above description is modified from the introduction to the parent UCI grant. Funding will start shortly, but is backdated to July 1, 2007, so an RA position is immediately available. The RA should have interests and strengths in algorithms for statistical and probabilistic computations in complex stochastic systems, including genetic systems, and sufficient computing expertise to interact with the core computer science components of the research.

PI: Elizabeth Thompson; one RA position immediately available.

VIGRE (NSF)
The Department of Applied Mathematics, Department of Mathematics, and the Department of Statistics were awarded a second Vertical Integration in Research and Education (VIGRE) grant for the period 2004-2009. This grant is for nearly $5 million over the next five years. The main thrust of our grant is to enhance education and research by bringing together undergraduates, graduate students, postdoctoral fellows, and faculty in cross-disciplinary research settings in order to produce better mathematical scientists for the future.

Current RAs: see the VIGRE grant page for further information.

Statistical Inverse Problems, Semiparametric Models, and Empirical Processes (Funded by the NSF)
This project involves research on empirical process methods and computational strategies for a variety of semiparametric and nonparametric models for inverse and missing data problems. Currently work is underway on competing risk models with current status observation schemes, a type of model currently of interest in connection with HIV-AIDS studies. In another direction, research is being carried out on a new family of goodness-of-fit statistics and the associated confidence bands. Recent past work on estimation under shape constraints has been carried out jointly with Piet Groeneboom in the Netherlands and recent past graduate students Fadoua Balabdaoui, Moulinath Banerjee, Leah Jager, Marloes Maathuis, and Ying Zhang.

The research involves development of basic empirical process tools and methods, and applications of these new tools and methods to statistical problems concerning semiparametric models and inverse problems. Applications include panel count data, regression models for panel count data, bivariate interval censored data of several kinds, regression models for multivariate survival data, studies of non- and semi-parametric maximum likelihood estimators used in AIDS research, and two-phase data-dependent designs.

Funded by the NSF, July 2005 through June 2008.

PI: Jon A. Wellner, Statistics. Current RAs: Marios Pavlides and Arseni Seregin.