MGH


Optimal and Fast Detection of Spatial Clusters with Scan Statistics

Start Time
Speaker
Guenther Walther

Scan statistics are a common tool for detecting, e.g., spatial disease clusters or for describing local differences between two distributions. Multivariate scan statistics pose both a statistical problem, due to multiple testing over many scan windows, and a computational problem, because the statistics have to be evaluated on many windows. I will describe methodology that leads to both statistically optimal inference and computationally efficient algorithms.

Building
Room
389

Tail Risk Budgeting

Start Time
Speaker
R. Douglas Martin

Risk budgeting is a methodology that has become increasingly popular over the last decade as a relatively transparent alternative to rebalancing portfolios via a black-box portfolio optimization method. We begin by briefly reviewing “classical” risk budgeting methodology based on volatility (standard deviation) of returns as the risk measure.
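As a rough illustration of volatility-based risk budgeting, the sketch below uses the standard Euler decomposition of portfolio volatility into per-asset risk contributions. The function name, weights, and covariance matrix are made up for the example and are not taken from the talk.

```python
import math

def volatility_risk_contributions(weights, cov):
    """Return portfolio volatility and each asset's contribution to it.

    Uses the Euler decomposition: contribution_i = w_i * (cov @ w)_i / vol,
    so the contributions add up to the portfolio volatility.
    """
    n = len(weights)
    # Marginal term (cov @ w)_i for each asset.
    marginal = [sum(cov[i][j] * weights[j] for j in range(n)) for i in range(n)]
    variance = sum(weights[i] * marginal[i] for i in range(n))
    vol = math.sqrt(variance)
    contributions = [weights[i] * marginal[i] / vol for i in range(n)]
    return vol, contributions

# Hypothetical three-asset portfolio (illustrative numbers only).
weights = [0.5, 0.3, 0.2]
cov = [
    [0.04, 0.01, 0.00],
    [0.01, 0.09, 0.02],
    [0.00, 0.02, 0.16],
]
vol, contributions = volatility_risk_contributions(weights, cov)
# Share of total volatility attributed to each asset (the "risk budget").
budget_shares = [c / vol for c in contributions]
```

A risk budget in this sense is the vector of shares in `budget_shares`; rebalancing toward a target budget means adjusting `weights` until those shares match the target.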

Building
Room
389

Approximate Bayesian Computation Under Model Uncertainty, with Application to Protein Network Evolution

Start Time
Speaker
Sylvia Richardson

Data-generating stochastic processes arise naturally in many disciplines, for example biology, ecology or epidemiology. In many cases, because interesting models are highly complex, the likelihood f(xo | θ, M) of such implicit scientific models M is intractable. This hampers scientific progress in terms of iterative data acquisition, parameter inference, model checking and model refinement within a Bayesian framework. Nevertheless, given a value of θ, it is usually possible to simulate data from f(.|θ, M).
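The idea of working with a simulator in place of an intractable likelihood can be sketched with plain rejection ABC. This is a minimal toy example, not the method of the talk: the Normal simulator, the uniform prior, the sample-mean summary statistic, and the tolerance are all assumptions chosen for illustration, and model uncertainty is not handled.

```python
import random
import statistics

def simulate(theta, n, rng):
    # Toy stand-in for drawing data from f(. | theta, M): n draws from Normal(theta, 1).
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def abc_rejection(observed, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary is close to the observed one."""
    obs_summary = statistics.fmean(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)             # draw from the (assumed) prior
        sim = simulate(theta, len(observed), rng)  # simulate data given theta
        if abs(statistics.fmean(sim) - obs_summary) <= tolerance:
            accepted.append(theta)                 # accept: simulation resembles the data
    return accepted

rng = random.Random(0)
observed = simulate(2.0, 50, rng)  # pretend these are the observed data
posterior_sample = abc_rejection(observed, n_draws=20000, tolerance=0.1, rng=rng)
```

The accepted values in `posterior_sample` approximate draws from the posterior of θ; a smaller tolerance gives a better approximation at the cost of fewer accepted draws.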

Building
Room
389

Pragmatic Bayesian Designs For Clinical Trials

Start Time
Speaker
Lurdes Y.T. Inoue

In this talk we discuss the application of Bayesian methods in the design of clinical trials. In the first part of the talk we discuss sample size determination. A broad range of frequentist and Bayesian methods for sample size determination can be described as choosing the smallest sample that is sufficient to achieve some set of goals. An example for the frequentist is seeking the smallest sample size that is sufficient to achieve a desired power at a specified significance level.
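The frequentist example above can be sketched as a direct search for the smallest sample size that reaches a target power. The setting is an assumption made for illustration (a two-sided one-sample z-test with known standard deviation), and the function names and numbers are hypothetical.

```python
from statistics import NormalDist

def power_two_sided_z(n, effect, sigma, alpha):
    """Power of a two-sided one-sample z-test with known sigma at sample size n."""
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)       # critical value at significance level alpha
    shift = effect * n ** 0.5 / sigma     # mean of the test statistic under the alternative
    return (1 - z.cdf(crit - shift)) + z.cdf(-crit - shift)

def smallest_sample_size(effect, sigma, alpha, target_power, n_max=1_000_000):
    """Smallest n whose power meets the target; power is increasing in n here."""
    for n in range(1, n_max + 1):
        if power_two_sided_z(n, effect, sigma, alpha) >= target_power:
            return n
    raise ValueError("target power not reached within n_max")

# Illustrative inputs: detect a shift of 0.5 standard deviations
# with 80% power at the 5% significance level.
n_required = smallest_sample_size(effect=0.5, sigma=1.0, alpha=0.05, target_power=0.80)
```

The same "smallest sample that achieves the goal" pattern carries over to other criteria by swapping `power_two_sided_z` for a different goal function.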

Building
Room
389

Stochastic Models That Separate Fractal Dimension and Hurst Effect

Start Time
Speaker
Tilmann Gneiting

Fractal behavior and long-range dependence have been described in an astonishing number of physical, biological, geological, and socio-economic systems. Time series, profiles, and surfaces have been characterized by their fractal dimension, a measure of roughness, and by the Hurst coefficient, a measure of long-memory dependence. Either phenomenon has been modeled and explained by self-similar random functions, such as fractional Gaussian noise and fractional Brownian motion.

Building
Room
389

Bayesian Analysis of Multi-Model Ensembles for Assessing Uncertainty in Climate Change Projections

Start Time
Speaker
Claudia Tebaldi

Different General Circulation Models (GCMs) produce different climate change projections, especially when evaluated at subcontinental (regional) scales. When it is time to combine their responses into a summary measure with associated uncertainty bounds, it makes sense to give more weight to the output of those GCMs that show better performance in reproducing present-day climate (i.e. have smaller bias) and that agree with the majority (i.e. do not seem like outliers).

Building
Room
389

Considerations and Approaches Regarding the Deconvolution of an Unknown Function of One Variable From a Finite Set of Measurements

Start Time
Speaker
Brad Bell

Deconvolution of an unknown function of one variable from a finite set of measurements is an ill-posed problem. Placing a Bayesian prior on a function space is one way to extend the scientific model and obtain a well-posed problem. This problem can be well-posed even if the relationship between the unknown function and the measurements, as well as the function space prior, has unknown parameters. We present a method for estimating the unknown parameters by maximizing an approximation of the marginal likelihood where the unknown function has been integrated out.

Building
Room
389

What is the 'True Price'? - State Space Models for High Frequency Financial Data

Start Time
Speaker
John Moody

Tick-by-tick interbank foreign exchange (FX) price series exhibit statistically significant structures on various time scales. These include negative autocorrelations in tick-by-tick returns and positive autocorrelations (trends) on longer time scales. To account for the observed structures, we propose state space models for financial time series in which the observed price is a noisy version of an unobserved, less-noisy "True Price" process.
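To see why a noisy observation of a latent price produces negative autocorrelation in tick-by-tick returns, consider the simplest such state space model. This is a hedged toy sketch, not the speaker's model: the latent price is assumed to be a Gaussian random walk, the observation noise is assumed independent Gaussian, and all names and parameter values are hypothetical.

```python
import random

def simulate_observed_prices(n, state_sd, noise_sd, rng):
    """Observed price = latent random-walk price + independent observation noise."""
    true_price = 0.0
    observed = []
    for _ in range(n):
        true_price += rng.gauss(0.0, state_sd)                 # latent price evolves
        observed.append(true_price + rng.gauss(0.0, noise_sd))  # noisy quote
    return observed

def lag1_autocorrelation(xs):
    mean = sum(xs) / len(xs)
    num = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(len(xs) - 1))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den

rng = random.Random(1)
prices = simulate_observed_prices(20000, state_sd=1.0, noise_sd=1.0, rng=rng)
returns = [b - a for a, b in zip(prices, prices[1:])]
rho1 = lag1_autocorrelation(returns)
```

Under these assumptions the theoretical lag-1 autocorrelation of returns is -noise_sd² / (state_sd² + 2·noise_sd²), which is -1/3 for the values above, so `rho1` should come out clearly negative even though the latent price has uncorrelated increments.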

Building
Room
389

Recidivism and Social Interactions

Start Time
Speaker
Sibel Sirakaya

Faced with overcrowded prisons, the courts have been increasingly passing probation sentences for adults convicted of felony crimes. Using a national sample, this paper identifies the risk factors for recidivism among female, male, Black, White and Hispanic felony probationers. The individual hazard function is assumed to depend on individual and neighborhood characteristics as well as social interactions among probationers. Bayesian model averaging is used to account for model uncertainty, both in selecting the covariates from a set of potential candidates and in the subsequent inference.

Building
Room
389

Statistical Estimation From an Optimization Viewpoint

Start Time
Speaker
Lisa Korf

This lecture focuses on problems of density estimation (both parametric and nonparametric) and, if there is time, time series estimation (no pun intended). When formulated as optimization problems, consistency of the estimators becomes a question of whether a sequence of optimization problems converges in an appropriate sense to the true problem. The tools of variational analysis are used to examine the question of consistency for these problems. In particular, an epigraphical ergodic theorem can be used to show consistency for a broad class of estimation problems.

Building
Room
389

(Bayesian) Statistics with Rankings

Start Time
Speaker
Marina Meila

People often express their preferences for web pages, products, or candidates in an election as a ranked list. Ranked lists are also the standard output of search engines like Google or Sequest. The aim of this talk is to show how one can do "statistics as usual" with this kind of discrete, structured, high-dimensional data.

I will define statistical models over spaces of permutations and partial orderings, and present methods for estimating these models from data.

Building
Room
389

A General Overview of Microsoft Treasury - From Investing $65 Billion in Assets to Managing the Associated Risks

Start Time
Speaker
George Zinn

Computational Finance Seminar. As Corporate Vice President and Treasurer, George Zinn is responsible for overseeing Microsoft's corporate assets. He leads a group which manages the company's worldwide financial and corporate risk, investment portfolio, strategic portfolio, foreign exchange, corporate and structured project finance, dilution management, cash and liquidity, customer financing, and credit activities. In addition, Treasury has an important role in a range of initiatives across the spectrum from compensation through acquisitions and IP licensing.

Building
Room
295

Genetic Architecture and Evolution of Gene Expression Variation: Insights From Yeast and Humans

Start Time
Speaker
Joshua Akey

Gene expression is an important molecular phenotype, providing the initial step in bridging the divide between static genomic information and dynamic organismal phenotypes. Thus, variation in gene expression levels is thought to constitute a significant source of phenotypic diversity among individuals within populations and to contribute to the evolutionary divergence between species. I will discuss our work on identifying regulatory polymorphisms that contribute to heritable transcriptional variation in both yeast and humans.

Building
Room
389

A Primer on Mass Spectrometry Based Proteomics

Start Time
Speaker
Dave Goodlett

I'll review the basics of peptide/protein chemistry pertinent to sequencing by MS and discuss how the MS instruments produce spectra which are "converted" to sequence by software, as well as a bit about the software itself. So, 1/3 each of 1) protein chemistry (and the why of how we do proteomics with MS), 2) a description of fragmentation mechanisms (this is what people casually refer to as sequencing) and 3) the vagaries of final sequence assignment to the raw data.

Building
Room
389

Model-Based Clustering for Online Crisis Identification in Distributed Computing

Start Time
Speaker
Dawn Woodard

Large-scale distributed computing systems can suffer from occasional severe violation of performance goals; due to the complexity of these systems, manual diagnosis of the cause of the crisis is too slow to inform interventions taken during the crisis. Rapid automatic recognition of the recurrence of a problem can lead to cause diagnosis and informed intervention. We frame this as an online clustering problem, where the labels (causes) of some of the previous crises may be known.

Building
Room
389

Recommender Systems for Fun and Profit

Start Time
Speaker
Christopher T. Volinsky

In October 2006, Netflix kicked off a $1M competition by releasing 100 million movie ratings as a training set to be used to build a better recommendation system for their on-line movie rental business. This landmark data set generated intense interest from the statistics and machine learning communities, and attracted entries from over 3000 teams from academia and industry. In this talk, I will review our team's experience analyzing this data using a collection of data mining techniques and document our journey towards winning a share of the million-dollar prize.

Building
Room
389

Seeking a Predictive Theory for Adaptive Evolution

Start Time
Speaker
Paul Joyce

The primary impediment to formulating a general theory for adaptive evolution has been the unknown distribution of fitness effects for new beneficial mutations. By applying extreme value theory, Gillespie (1984) circumvented this issue in his mutational landscape model for the adaptation of DNA sequences, and Orr (2002) extended Gillespie's model, generating testable predictions regarding the course of adaptive evolution. Rokyta et al. (2005) provided the first empirical examination of this model, using an ssDNA bacteriophage.

Building
Room
389