A full list of open funding opportunities (TA positions, fellowships, etc.) can be found on the department's funding page.
Available Research Assistantships in the Department (alphabetical by first PI)
- Spectral analysis of link data and foundations of clustering
Marina Meila. One open RA position.
- Gravimetric inversion using statistical learning techniques
Marina Meila. One open RA position.
- Proper Scoring Rules, Calibration and Sharpness: Assessing Predictions for an Uncertain World
Tilmann Gneiting. RA: Roopesh Ranjan. One open RA position for summer quarter 2008.
- Intransitive Classification and Choice (NSF)
PIs: Marina Meila, Statistics; Jeff Bilmes, Electrical Engineering. One open RA position.
- Algorithms for likelihood computations on general pedigrees
PI: Elizabeth Thompson. One open RA position.
- NIH Genome Training Grant
Stat/biostat coordinator: Elizabeth Thompson. Interested students must submit the application materials listed on the grant web site (link above). Currently one (possibly two) RA positions are available. Only U.S. citizens and those with green-cards are eligible for these positions. Additional information for Stat applicants can be found on this site.
- NIH Statistical Genetics Training Grant
Stat coordinator: Elizabeth Thompson. Currently two RA positions are available. Only U.S. citizens and those with green-cards are eligible for these positions.
Current Research Assistantships in the Department (alphabetical by first PI)
- Stochastic Modeling of Hematopoiesis (Biostatistics)
PIs: Jan Abkowitz and Peter Guttorp. Current RAs: Youyi Fong (Biostat).
- Improving Cognitive Outcome Precision and Responsiveness with Modern Psychometrics
Elena Erosheva. Current RAs: Jonathan Gruhl.
- CSSS consulting
PI: Elena Erosheva. Current RA: Hil Lyons.
- Modelling longitudinal disability survey data
PI: Elena Erosheva. Current RA: Toby White.
- Proper Scoring Rules, Calibration and Sharpness: Assessing Predictions for an Uncertain World
PI: Tilmann Gneiting. Current RA: Roopesh Ranjan.
- Combining individual-level survey and population-level data
PI: Mark S. Handcock. Current RAs: Ryan Admiraal.
- Modeling the Complex Dynamics of Urban Landscape Patterns
PI: Mark S. Handcock. Current RAs: (one from Urban Planning)
- Longitudinal relational data
PI: Peter Hoff. Current RAs: Xiaoyue Niu.
- Spatial Statistics for the Verification of Weather Forecasts
PI: Caren Marzban. Current RAs: Dustin Lennon.
- Inference for Networks with Sampled or Missing Data
PIs: Martina Morris and Mark S. Handcock. Current RAs: Krista Gile.
- Statistical Models for Networks with Application to Disease Transmission (NIH)
PIs: Martina Morris and Mark S. Handcock. Current RAs: Pavel Krivitsky.
- Wavelet methods in atmospheric sciences
PIs: Donald Percival, Peter Guttorp and Chris Bretherton (Atmospheric Sciences). Current RAs:
- Model-based clustering methods for medical image segmentation and gene expression data
PI: Adrian Raftery. Current RAs:
- Uncertainty assessment and visualization in weather prediction and other deterministic models
PIs: Adrian Raftery and Tilmann Gneiting. Current RAs: Larissa Stanberry.
- Genetic epidemiology of complex traits
PI: Elizabeth Thompson. Current RAs: Audrey Fu, Yanming Di and Liping Tong (postdoc).
- VIGRE
Current RAs: TBA.
- Statistical Inverse Problems, Semiparametric Models, and Empirical Processes
PI: Jon Wellner. Current RAs: Marios Pavlides and Arseni Seregin.
Descriptions of Research Assistantships
Spectral analysis of link data and foundations of clustering
This project studies embedding and semisupervised learning in directed
graphs (graphs with asymmetric relationships), with the aim of
understanding the fundamental, large-scale topological properties of
large directed graphs. We will model the large-scale properties of a
directed graph by the (spectral) properties of a random walk on the graph.
This work will produce algorithms for
mapping the nodes of a directed graph into a low-dimensional space
(embedding the graph) in a way that preserves and displays its large
scale structure. We will also work on algorithms for semisupervised
regression and classification for mixed data, i.e., data points
for which we have both measured individual attributes and graph
connectivity observations, which in turn can be directed or
undirected. These algorithms will be applicable to the analysis of
social networks, computer and communication networks, and biological
interaction networks.
The RA must be proficient with Matlab and have a strong background in
linear algebra and mathematics in general. Knowledge of Markov random
fields, stochastic processes, and diffusion processes is highly desirable
but not required. Funding available immediately.
PI: Marina Meila.
Gravimetric inversion using statistical learning techniques
This project pioneers a new approach to the problem of detecting
underground features (like caves) from gravity measurements. The RA
will have the opportunity to learn about the technological advances in
this field while contributing to the development of a new method for
detecting such features. The new methodology combines signal processing,
optimization and machine learning techniques. It is based on the
recent breakthroughs known as compressed sensing.
The RA will develop the optimization and wavelet transform core of the
algorithm. This will be done partly by using existing software
packages and partly by writing new code. Therefore, the RA must
have a good working knowledge of wavelet techniques and be familiar
with optimization in general. In addition, the candidate must be an
experienced programmer in Matlab and in C/C++ or Java.
Funding available immediately: estimated for 2-3 quarters, with
extensions possible. The project has the potential to generate a PhD
thesis.
PI: Marina Meila.
Proper Scoring Rules, Calibration and Sharpness: Assessing Predictions for an Uncertain World
One of the major purposes of statistical analysis is to make forecasts for the future, and to provide
suitable measures of the uncertainty associated with them. Consequently, forecasts ought to be probabilistic
in nature, taking the form of probability distributions over future quantities or events. With
the advent of probabilistic forecasting in a wealth of meteorological, climatological, economic
and financial applications, the need for principled techniques for the comparison and evaluation
of distributional forecasters is becoming critical. The project addresses this challenge by developing
tools for the assessment of calibration and sharpness, and furthering insight into the construction
and characterization, the properties, and the computation of proper scoring rules with desirable
properties, such as kernel scores, local scores and skill scores.
PI: Tilmann Gneiting. Current RA: Roopesh Ranjan.
Stochastic Modeling of Hematopoiesis (NIH)
We apply recent Markov chain Monte Carlo techniques to studying hidden
population processes of birth-and-death and compartment model types.
The goal is to be able to estimate parameters in a model for the blood
production in animals and humans.
PIs: Jan Abkowitz, Hematology, and Peter Guttorp, Statistics. Current RA: Youyi Fong, Biostatistics.
Bayesian Methods for Multivariate Analysis
The applicant for this position should have a familiarity
with Bayesian methods and be interested in working in one of
the following areas:
- model and variable selection techniques for discrete graphical models;
- graphical models on copulas;
- the development of inference techniques for combining large, sparse heterogeneous datasets (e.g., genotype and expression data).
There is a theoretical as well as an applied component (hands-on data analysis and some software development) associated with these topics. The RA support is initially offered for a quarter, but it can be extended throughout the duration of your PhD depending on your interest and performance. All three topics could evolve into PhD theses.
PI: Adrian Dobra. Current RAs:
Improving cognitive outcome precision and responsiveness with modern
psychometrics
The ultimate goal of this research project is to develop and implement a
statistically and scientifically sound strategy for measurement of cognitive
functioning in the elderly using advanced psychometrics. Although modern
psychometric methods have been used for a number of years in other fields,
their applications to analyses of cognitive data are scarce. Our experience
suggests that improvements in cognitive measurement gained by using advanced
psychometric methods compared with standard measurement techniques can be
clinically important in subjects with early dementia and Alzheimer's
disease.
This five-year project brings together Alzheimer's disease researchers, statisticians, biostatisticians and psychometricians. The team will work collaboratively to develop and evaluate novel methods for statistical analysis of test data in the areas of general cognition, memory, and executive functioning, focusing on optimizing point estimates (outcome precision) and estimates of change (responsiveness). Researchers involved in this project will have access to rich data sets on cognitive functioning from two longitudinal studies, as well as to neuroimaging (MRI) and biomarkers data on the same subjects.
The RA position is available starting January 2008. The initial contract is for one year and is potentially renewable. Course background in Bayesian methods, MCMC, and discrete multivariate and longitudinal data analysis is desirable. To apply, please send your curriculum vitae, current transcript, and the names and contact information of two references to elena@stat.washington.edu no later than November 10, 2007.
Contact: Elena Erosheva, Statistics. PI: Paul Crane, Medicine. Current RA: Jonathan Gruhl.
CSSS Consulting (UW-UIF)
The RA in this position provides statistical consulting to researchers
in social sciences such as economics, sociology, nursing, and psychology,
among others. Statistical consulting typically includes assistance with
the choice and application of statistical methods, assistance with study
planning and design, advice on data visualization and presentation, and
development of specialized statistical methods. The successful applicant
will benefit from having completed the 570s sequence and the Consulting
courses in the Statistics Department.
PI: Elena Erosheva. Current RA: Hilary Lyons.
Survey measurement of chronic disability
The project will employ statistical modeling approaches to examine the extent to
which aspects of the National Long Term Care Survey (NLTCS) design affect the
measurement of chronic disability. The NLTCS, one of the best designed surveys for
analyzing U.S. national disability trends, has been widely used to determine trends
in disability. Yet, the impact of survey design on the chronic disability measures
produced by the survey is unknown.
The RA work will involve discrete data analysis, longitudinal and latent variable modeling. The position is for one year and potentially renewable.
PI: Elena Erosheva. Current RA: Toby White.
Combining individual-level survey and population-level data
The project is developing statistical methods for combining surveys
and population data collections (especially of births and marital and
non-marital unions) for the improved estimation of birth and
childhood circumstances. The family and socio-economic circumstances
of children's parents at birth and during the childrearing years are
fundamental determinants of children's health and well-being. Specific
aims are to (1) Develop and test statistical methods to combine multiple
sources of survey data and population data; (2) Improve estimates of the
parameters of fertility and marital and non-marital union regression
equations, and of simulated life-course fertility and union duration
measures; and (3) Expand and disseminate the statistical capabilities
to the demographic community. It is shown that combining population
and survey data in the estimation allows for more modeling detail than
when using population data alone, and more precise estimates than when
using survey data alone. Further statistical development will allow for
survey data to be combined from more than one data set, thereby obtaining
some of the same benefits as from combining survey and population data.
Funded by the NIH.
PI: Mark S. Handcock. Current RAs: Ryan Admiraal.
Modeling the Complex Dynamics of Urban Landscape Patterns
Urban development in the United States is profoundly changing landscape
pattern and biodiversity and in turn it is being affected by these
changes. Yet we are just beginning to understand the interactions
between patterns and processes in human dominated landscapes. One of
the least understood aspects of urban landscape dynamics is the way in
which local interactions of human and biophysical processes affect the
landscape patterns of metropolitan regions. In this project we work
with urban planners, landscape ecologists, and wildlife biologists to
build models based on two longitudinal land cover and land use data sets
developed for the Seattle and Phoenix Metropolitan Areas. Funded by the
NSF Biocomplexity in the Environment Program.
PI: Mark S. Handcock. Current RAs: (one from Urban Planning)
Longitudinal Relational Data
The position is for a year and is potentially renewable. The focus of
the RA position is to develop generative models for dynamic relational
(network) data, and to test and apply these models to international
relations data. However, the RA position is flexible enough to fund
a wide variety of research, as long as it is related to longitudinal
multivariate analysis.
A suitable candidate for this position would have a research background or (potentially concurrent) course experience that includes some combination of multivariate analysis, MCMC methods, time series and/or machine learning.
PI: Peter Hoff. Current RAs: Xiaoyue Niu.
Spatial modeling of Asian conflict
We are looking for a graduate student to work on a statistical project to
predict, over a six-month horizon, a subset of the following kinds of events
in a set of 29 Asian countries: riots and rebellions, regime changes, major
economic collapse, violent anti-state insurgencies, major acts of government
repression, civil wars, and international crises. Most of the data are
provided by the sponsor (DARPA).
Our task is to apply a spatial approach to modeling these data, with an eye toward a broader set of dependencies among them. This is a 15-month project, to begin immediately. I am looking for someone with a background in spatial statistics, and basic statistical programming, to help me in this task over the next months. This could be in the form of a regular RA position, or as an hourly position.
PI: Michael D. Ward (mdw at u dot washington dot edu), Department of Political Science. Additional contact: Peter Hoff (pdhoff at u dot washington dot edu), Department of Statistics. One open RA position.
Spatial Statistics for the Verification of Weather Forecasts.
Beginning September 1, 2007, one RA position will be available for 12
months.
Although the position is at the Applied Physics Laboratory, the work will
be done with Caren Marzban who is also at the Statistics Dept.
These days, numerical models based on basic physics principles form a foundation for making predictions in a wide range of fields. For example, weather models produce predictions which are often presented as images of some quantity across some spatial domain. Many of these quantities can also be observed. The main question in this project is "How good are these predictions?" From an image analysis point of view, this is equivalent to asking how similar the prediction image is to the observed one. Our team has already devised methods for addressing that question employing cluster analysis. The next step in that line of research is to develop similar methods that involve other techniques from spatial statistics, specifically the analysis of variograms. Programming knowledge and experience in R is required.
PI: Caren Marzban, Statistics. Current RAs: Dustin Lennon.
Intransitive classification and choice (NSF)
The goal of this project is to develop a framework for intransitive
pattern classification and models of intransitive choice. Intransitivity
can arise from various forms of classifiers. This includes the
augmenting of log-likelihood ratios with correction terms and
collections of binary classifiers (SVMs/kernel machines, binary neural
networks, etc.) when used for multi-class classifiers. Intransitivity
can also arise in individual and group choice (e.g., elections,
tournament-style competitions). The project is developing methods to
better explain intransitivity in these classifiers and to model
preference relationships in social choice. Questions being investigated
by this project include: (1) why/when intransitivity occurs; (2) why/how
it helps classification; (3) how to introduce intransitivity in
classifier systems; (4) whether intransitivity should itself be a goal,
or rather whether to treat it as an artifact of imperfection and an
indication of incertitude; (5) methods to detect intransitivity; (6) how
intransitivity can be used to reduce errors; (7) how to model
intransitivity; (8) the relationship between intransitivity in machine
learning and in sociology, psychology, voting theory, political science,
operations research, mathematics, economics, and philosophy; and (9) if
transitive explanations can better explain natural organisms. In
addition to establishing new cross-field scientific connections, this
project has a broader impact through integration of its results into new
seminars, tutorial articles on intransitive decision making, and new
freely-available software.
Current research focuses on creating new statistical models for preference data, together with efficient algorithms for estimating them and performing inference. We are interested in machine learning with preference data, in graphical model-like structures over the space of preferences, and in Bayesian analysis of these models. Students who are strong computationally or mathematically will be especially suited for this research.
PIs: Marina Meila, Statistics, and Jeff Bilmes, Electrical Engineering. Current RAs: Bhushan Mandhani, Computer Science. One open RA position for a Statistics student.
Statistical Models for Networks with Application
to Disease Transmission (Funded by the NIH)
Over the past two decades, the epidemic of HIV has challenged the
epidemiological community to rethink its paradigms for understanding the
risk of disease transmission, both at the individual level, and at the
level of population transmission dynamics. One of the hallmarks of this
research effort has been the rapid convergence of opinion that the concept
of a transmission network must play an important role in the development
of any new paradigm. In its simplest form, the network perspective
recognizes that people acquire infections from their partners. Thus,
it is not only a person's own behavior that puts them at risk, but the
behavior of their partners, and more generally, the persons to whom they
are indirectly connected by virtue of being connected to their partner. In
this project we are developing statistical random graph models that
represent sexual and drug use networks. This is a large collaborative
project with many participants within UW and worldwide. Funded by
the NIH.
PIs: Martina Morris and Mark S. Handcock. Current RAs: Pavel Krivitsky.
Inference for Networks with Sampled or Missing Data (Funded
by the NIH)
This project is developing statistical theory to guide the sampling
of data from networks. Network sampling involves two units: nodes and
links. While this can be thought of as a multi-level sampling design, the
two levels are not nested in the traditional manner. We make systematic
use of current network data to examine the information loss under
alternative sampling strategies, and to develop the statistical theory for
network sampling. In particular, we are developing the statistical theory
and methods for network estimation based on partial network sampling
designs. The focus is on the structure of the models, and the design
and analysis of surveys to collect network data. The latter is based
on issues of inference in the presence of missing data. We use a hybrid
design and model-based approach. Funded by the NIH.
PIs: Martina Morris and Mark S. Handcock, Statistics. Current RAs: Krista Gile.
Wavelet methods in atmospheric sciences
(NSF, NRCSE/EPA, Air Force)
Using recent tools from wavelet theory, methods for estimating temporal and
spatial trends are being developed. Applications include atmospheric and
environmental sciences.
PIs: Donald B. Percival, Applied Physics Laboratory, Peter Guttorp, Statistics, and Chris Bretherton, Atmospheric Sciences. Current RA:
Model-Based Clustering Methods for Medical Image
Segmentation and Gene Expression Data (NIH)
Many problems in the health and medical sciences have at their core
the task of finding cohesive groups of observations in data.
Examples include a group of voxels in an MRI image that correspond to a tumor,
genes whose expression levels track one another in a series of
experiments, and tissues whose gene expression patterns are similar.
The statistical method for solving this problem is
cluster analysis. Most cluster analysis methods used in practice are
ad hoc, but more recently the development of more formal
model-based clustering methods has provided a principled framework
for answering central questions such as: How many clusters are there?
Which clustering method should be used? How should one deal with outliers?
The main goal of the proposed research is to develop new model-based methods for clustering problems arising in medical image segmentation and gene expression data. The major thrusts of the research will be:
- Model-based clustering with large numbers of variables using several alternative approaches:
  - variable selection;
  - basis selection;
  - model-based clustering with dissimilarities or distances rather than observations on individual voxels, genes or samples; and
  - Bayesian shrinkage.
- Automated medical image segmentation for dynamic MRI breast images.
- Model-based clustering for gene expression data aimed at finding
  - groups of genes that function together; and
  - groups of tissues, samples or experiments that have similar gene expression patterns.
PI: Adrian E. Raftery. Current RAs:
Uncertainty Assessment and Visualization in Weather Prediction
and Other Deterministic Models (MURI)
An interdisciplinary team consisting of Tilmann Gneiting and faculty from
Atmospheric Sciences (meteorology), Psychology, and the Applied Physics Lab,
with Adrian Raftery as PI, has just received a $5 million, 5-year grant to
develop ways of assessing uncertainty in numerical weather prediction
models and other applications of deterministic simulation models.
This will build in part on work done by current students and PhD graduates
in the department: most recently Sam Bates and, before that, over the past
6 years, David Poole, Chris Volinsky, Jennifer Hoeting and Geof Givens.
It involves inference for deterministic simulation models, Bayesian
model averaging, spatial statistics, and visualization, as well as
intense interaction and collaboration with the substantive scientists.
PIs: Adrian E. Raftery and Tilmann Gneiting. Current RAs: Veronica Berrocal and Larissa Stanberry.
Genetic Epidemiology of Complex Traits (NIH)
Since 1990, we have been developing Markov chain Monte Carlo methods (MCMC)
for likelihood analysis of genetic traits observed on pedigrees. We have
developed methods to localize genes and to fit models for genetically
complex traits. Now, as data at multiple genetic markers across the genome
become increasingly available, we develop more computationally and
statistically efficient methods for joint analysis of data from dense genome
screens on individuals among whom there may be multiple complex
relationships. We analyze allelic associations in multilocus haplotypes at
the pedigree and population levels, and their impact on linkage detection
and fine-scale mapping. Current methods are being extended to handle
additional trait measures such as multivariate and ordered categorical
phenotypes, to incorporate more complex patterns of censoring, and to allow
for missing covariate information. Methods will also be developed to analyze
map accuracy, recombination heterogeneity, and genetic interference, and the
impact of these factors on the localization of genes contributing to complex
traits.
PI: Elizabeth Thompson. Current RAs: Audrey Qiuyan Fu, Yanming Di, Liping Tong (postdoc).
Algorithms for likelihood computations on general pedigrees.
The long-term objective of the research is to develop an efficient,
extensible, modular and accessible software toolbox that would facilitate
statistical methods for analyzing complex pedigrees. The toolbox will
consist of novel algorithms that extend state-of-the-art algorithms from graph theory,
statistics, artificial intelligence, and genetics. The specific aim of
the grant is to develop an extensible software system for efficiently
computing pedigree likelihoods for complex diseases in the presence of
multiple DNA markers in fully general pedigrees, taking into account
qualitative and quantitative traits and a variety of disease models. This
type of computation is crucial in pedigree analysis both in humans
(e.g. for discovering disease genes) and in animals (e.g. for controlled
breeding of disease-resistant strains). Experience shows that by building
on experience gained within the last decade from the study of computational
probability, in particular from the theory of probabilistic networks, we can
construct a software system whose functionality, speed and extensibility
would be unmatched by current linkage software.
This research grant is a subcontract of a 3-year NIH grant to a colleague in the Computer Science Department of University of California Irvine. The above description is modified from the introduction to the parent UCI grant. Funding will start shortly, but is backdated to July 1, 2007, so an RA position is immediately available. The RA should have interests and strengths in algorithms for statistical and probabilistic computations in complex stochastic systems, including genetic systems, and sufficient computing expertise to interact with the core computer science components of the research.
PI: Elizabeth Thompson; one RA position immediately available.
VIGRE (NSF)
The Department of Applied Mathematics, Department of Mathematics, and
the Department of Statistics were awarded a second Vertical Integration
in Research and Education (VIGRE) grant for the period 2004-2009. This grant is for
nearly $5 million over the next five years. The main thrust of our grant
is to enhance education and research by bringing together undergraduates,
graduate students, postdoctoral fellows, and faculty in cross-disciplinary
research settings in order to produce better mathematical scientists for
the future.
Current RAs: see the VIGRE grant page for further information.
Statistical Inverse Problems, Semiparametric
Models, and Empirical Processes (Funded by the NSF)
This project involves research on empirical process methods and
computational strategies for a variety of semiparametric and nonparametric models
for inverse and missing data problems. Currently work is underway on
competing risk models with current status observation schemes,
a type of model currently of interest in connection with HIV-AIDS studies.
In another direction, research is being carried out
on a new family of goodness-of-fit statistics and the associated confidence bands.
Recent past work on estimation under shape constraints has been
carried out jointly with Piet Groeneboom in the Netherlands and recent past
graduate students Fadoua Balabdaoui, Moulinath Banerjee, Leah Jager, Marloes
Maathuis, and Ying Zhang.
The research involves development of basic empirical process tools and methods, and applications of these new tools and methods to statistical problems concerning semiparametric models and inverse problems. Applications include panel count data, regression models for panel count data, bivariate interval censored data of several kinds, regression models for multivariate survival data, and studies of non- and semi-parametric maximum likelihood estimators used in AIDS research, and two-phase data dependent designs.
Funded by the NSF, July 2005 through June 2008.
PI: Jon A. Wellner, Statistics. Current RAs: Marios Pavlides and Arseni Seregin.