Oct 17

3:30 pm

## Bayesian model specification: toward a Theory of Applied Statistics

### David Draper

Seminar

University of California Santa Cruz - Department of Applied Mathematics and Statistics

In this seminar I'll examine some issues in the foundations of probability and applied statistics, a topic that has surprising relevance to day-to-day work in those subjects. I'll begin with an axiomatization of the field of statistics based on three ingredients: theta, something unknown to you (this can be almost anything, but for concreteness think of a vector in R^k); a data set D relevant to decreasing your uncertainty about theta (this could also be almost anything, but for concreteness think of a vector in R^n); and a set {script B} of true/false propositions detailing your background assumptions and judgments about how the world works as far as theta, D and future data D^* are concerned.

Uncertainty is quantified in this axiomatization via a Bayesian version of probability, in which the primitive is P( A | B ) for true/false propositions A and B; in this approach, densities of the form p( theta | {script B} ) and p( D | theta, {script B} ) arise naturally. A theorem, developed independently by the statistician/actuary de Finetti (1933) and the physicist R. T. Cox (1946), shows that there is one and only one way to quantify uncertainty in a logically internally consistent manner; this method involves two ingredients for inference and prediction (the two densities mentioned above), and two more ingredients for decision-making (a set {script A} of possible actions, and a utility function U( a, theta ) quantifying the value you would place on what would happen if you chose action a and the unknown were in fact theta).

An interesting thing about the Cox/de Finetti theorem is that, after telling you that you have to pay attention to these four ingredients, the theorem is (almost entirely) silent about how to specify them. At present we have no progression, from principles through axioms to theorems, that characterizes optimal Bayesian model specification; instead we have an ad hoc collection of methods, at least some of which seem more or less sensible. Thus, while the foundations of probability (as it applies to statistics) seem quite solid, the foundations of applied statistics do not at present seem secure; fixing this would yield a Theory of Applied Statistics, which we need and do not yet have. In this talk I'll explore the extent to which four principles (Calibration, Modeling-As-Decision, Prediction, and Decision-Versus-Inference) constitute progress toward this goal.
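
As a rough illustration of how the four ingredients named in the abstract fit together, here is a minimal Python sketch on a discrete grid. Everything specific in it is an assumption made for the example and not part of the talk: the grid of theta values, the uniform prior standing in for p( theta | {script B} ), the i.i.d. Bernoulli model standing in for p( D | theta, {script B} ), the made-up data, the two-element action set, and the utility function. The names `thetas`, `likelihood`, `utility` and `expected_utility` are hypothetical.

```python
from math import prod

# Hypothetical grid of candidate values for the unknown theta
# (here a success probability; the abstract allows theta to be almost anything).
thetas = [i / 100 for i in range(1, 100)]

# p(theta | B): a uniform prior over the grid (an illustrative assumption).
prior = [1 / len(thetas)] * len(thetas)

# D: a made-up data set of binary outcomes (6 successes, 2 failures).
data = [1, 0, 1, 1, 0, 1, 1, 1]


def likelihood(theta, observations):
    """p(D | theta, B) under an assumed i.i.d. Bernoulli sampling model."""
    return prod(theta if y == 1 else 1 - theta for y in observations)


# Inference: Bayes' theorem on the grid,
# p(theta | D, B) is proportional to p(theta | B) * p(D | theta, B).
unnormalized = [p * likelihood(t, data) for t, p in zip(thetas, prior)]
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]

# Decision-making: an action set A and a utility function U(a, theta),
# both invented for this sketch.
actions = ["adopt", "reject"]


def utility(action, theta):
    """Adopting pays off when theta exceeds 0.5; rejecting is worth 0."""
    return theta - 0.5 if action == "adopt" else 0.0


def expected_utility(action):
    """Average U(action, theta) over the posterior for theta."""
    return sum(utility(action, t) * w for t, w in zip(thetas, posterior))


# Choose the action that maximizes posterior expected utility.
best_action = max(actions, key=expected_utility)
posterior_mean = sum(t * w for t, w in zip(thetas, posterior))
```

The sketch only shows the mechanics once the four ingredients have been chosen. How to choose the prior, the sampling model, the action set and the utility function in the first place is exactly the specification question the abstract says the Cox/de Finetti theorem leaves open.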