Oct 26

3:30 pm

## The Shannon Estimator: Consistent Estimation of Markov Order

### David Donoho

Seminar

Stanford University

In this talk, I will discuss relationships between data compression and statistical estimation. There is a considerable literature on the connection between these concepts, which I will review. I will then show that if we take Shannon seriously and do what he prescribed as ideal data compression, we obtain a very interesting kind of estimator. A simple example is as follows. Suppose we have a vector of n data with y_i = θ_i + z_i, with the z_i iid N(0,1). Suppose the θ_i are themselves iid according to a common distribution F, and apply Shannon's rate-distortion theory to find the best possible compression of the vector (y_i)_{i=1}^n to within a mean-square discrepancy of 1. Ideal Shannon compression of (y_i) gives us, after decompression, a new vector (t_i). This vector can be interpreted as an estimator of (θ_i). It is not as good as the Bayes estimator in this setting, but close: it achieves exactly twice the Bayes mean-squared error.
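The "exactly twice" factor can be checked by simulation in one concrete special case (my choice of example, not part of the talk): take the prior F to be Gaussian, θ_i ~ N(0, τ²). Then the rate-distortion-optimal compression of the Gaussian vector (y_i) at distortion D = 1 is described by Shannon's test channel, whose output is a shrunken copy of y plus independent Gaussian noise, and the Bayes estimator is the usual posterior-mean shrinkage. A minimal sketch, assuming that setup:

```python
import numpy as np

# Gaussian special case (an assumption for illustration): theta_i ~ N(0, tau^2),
# y_i = theta_i + z_i with z_i ~ N(0, 1). At distortion D = 1, the optimal
# Shannon test channel reconstructs
#   t_i = (1 - D/Var(y)) * y_i + w_i,   w_i ~ N(0, D * (1 - D/Var(y))),
# which we simulate and compare against the Bayes (posterior-mean) estimator.

rng = np.random.default_rng(0)
n, tau2 = 200_000, 4.0

theta = rng.normal(0.0, np.sqrt(tau2), n)      # parameters, iid N(0, tau^2)
y = theta + rng.normal(0.0, 1.0, n)            # noisy data, noise variance 1

var_y = tau2 + 1.0                             # marginal variance of y
a = 1.0 - 1.0 / var_y                          # shrinkage factor tau^2 / (tau^2 + 1)

bayes = a * y                                  # Bayes estimator of theta
t = a * y + rng.normal(0.0, np.sqrt(a), n)     # ideal-compression reconstruction

mse_bayes = np.mean((bayes - theta) ** 2)      # theory: tau^2/(tau^2+1) = 0.8
mse_shannon = np.mean((t - theta) ** 2)        # theory: twice that, 1.6
print(mse_bayes, mse_shannon, mse_shannon / mse_bayes)
```

The decompressed vector equals the Bayes estimate plus independent test-channel noise of matching variance, so its mean-squared error is the Bayes risk twice over, and the printed ratio comes out near 2.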

Using this approach, we obtain a variety of new results of an unexpected form. For example, consider optimal Shannon-style compression of noisy realizations of certain Gaussian stochastic processes, with reconstruction fidelity equal to the noise level. Viewed as an estimator, the decompressed output achieves asymptotically twice the Bayes risk of Wiener filtering. Or consider adaptive Shannon compression (in a sense to be defined) of noisy data arising from certain function classes, again with reconstruction fidelity equal to the noise level. Viewed as an estimator, it achieves asymptotically twice the minimax risk.