 Maximum entropy course notes
 Adam Berger's MaxEnt resources: gentle tutorials and pointers to papers on MaxEnt in language.
 S. Della Pietra, V. Della Pietra, and J. Lafferty. Inducing features of random fields. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(4):380-393, April 1997.
 T. Jaakkola, M. Meila, T. Jebara "Maximum Entropy Discrimination"
 Zhang Le's Maxent page: another list of tutorials and papers. (Note that the links to papers are all rotten since they go through CiteSeer.)
 Maximum Entropy Online Resources (mostly past MaxEnt conferences)
 Skilling, J. 1989. Classic maximum entropy. In: Maximum Entropy and Bayesian Methods, J. Skilling, editor. Kluwer Academic, Norwell, MA, pp. 45-52.
 "The relation of Bayesian and Maximum Entropy methods", E.T. Jaynes, 1988.
 J. Darroch and D. Ratcliff. Generalized iterative scaling for log-linear models. Ann. Math. Statistics, 43:1470-1480, 1972.
 A. Berger, S. Della Pietra, and V. Della Pietra. A maximum entropy approach to natural language processing. Computational Linguistics, 22(1):39-71, 1996.
 S. Guiasu and A. Shenitzer. The principle of maximum entropy. The Mathematical Intelligencer, 7(1), 1985. (An overview paper)
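The Darroch & Ratcliff entry above is the source of Generalized Iterative Scaling (GIS), the classic fitting procedure for maximum entropy models. As a study aid, here is a minimal sketch of GIS on a toy discrete distribution; the feature matrix and target expectations are invented for illustration, not taken from any of the papers listed:

```python
import numpy as np

def gis(F, targets, n_iter=500):
    """Generalized Iterative Scaling (Darroch & Ratcliff, 1972), sketched.
    F: (n_states, n_features) nonnegative matrix whose rows all sum to the
    same constant C (add a slack feature if they do not).
    targets: desired feature expectations under the fitted distribution."""
    C = F.sum(axis=1)[0]
    lam = np.zeros(F.shape[1])                 # log-linear feature weights
    for _ in range(n_iter):
        p = np.exp(F @ lam)
        p /= p.sum()                           # current model distribution
        lam += np.log(targets / (p @ F)) / C   # multiplicative GIS update
    p = np.exp(F @ lam)
    return lam, p / p.sum()

# Toy problem: 3 states, 2 features, every row of F sums to C = 1.
F = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
lam, p = gis(F, np.array([0.3, 0.7]))
```

After fitting, `p @ F` matches the target expectations, and `p` is the maximum entropy distribution subject to those moment constraints.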
 Boosting

On Boosting
 Y. Freund, "Boosting a weak learning algorithm by majority",
Information and Computation, 121(2):256-285, 1995.
 Y. Freund and R. Schapire, "Experiments with a new boosting
algorithm", in Machine Learning: Proceedings of the Thirteenth International
Conference, pp. 148-156, 1996.
 R. Schapire, Y. Freund, P. Bartlett, and W. S. Lee, "Boosting the margin:
a new explanation for the effectiveness of voting methods", in Machine Learning:
Proceedings of the Fourteenth International Conference, 1997.
 J. Friedman, T. Hastie, and R. Tibshirani,
"Additive logistic regression: a statistical view of boosting", Annals
of Statistics, 2000.
 H. Drucker and C. Cortes, "Boosting decision trees",
in Advances in Neural Information Processing Systems 8, pp. 479-485, 1996.
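Most of the boosting papers above build on AdaBoost (Freund & Schapire). As a study aid, here is a minimal sketch of AdaBoost with exhaustively searched decision stumps; the toy training loop and function names are my own, not from any of the cited papers:

```python
import numpy as np

def adaboost_train(X, y, n_rounds=20):
    """AdaBoost sketch with axis-aligned decision stumps; labels in {-1, +1}."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)              # example weights, uniform at first
    ensemble = []                        # list of (alpha, feature, thresh, sign)
    for _ in range(n_rounds):
        best, best_err = None, np.inf
        # pick the stump with the lowest weighted training error
        for j in range(d):
            for thr in np.unique(X[:, j]):
                for s in (1, -1):
                    pred = s * np.where(X[:, j] <= thr, 1, -1)
                    err = w[pred != y].sum()
                    if err < best_err:
                        best_err, best = err, (j, thr, s)
        err = max(best_err, 1e-12)           # avoid log(0) on perfect stumps
        alpha = 0.5 * np.log((1 - err) / err)   # weight of this weak learner
        j, thr, s = best
        pred = s * np.where(X[:, j] <= thr, 1, -1)
        w *= np.exp(-alpha * y * pred)       # upweight misclassified examples
        w /= w.sum()
        ensemble.append((alpha, j, thr, s))
    return ensemble

def adaboost_predict(ensemble, X):
    """Sign of the weighted vote of all stumps."""
    score = np.zeros(X.shape[0])
    for alpha, j, thr, s in ensemble:
        score += alpha * s * np.where(X[:, j] <= thr, 1, -1)
    return np.sign(score)
```

The exhaustive stump search is quadratic in the data and is only meant to keep the sketch short; the reweighting and alpha formulas are the standard ones.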

On Error Bounds and Combining Classifiers

 V. Koltchinskii and D. Panchenko, "Some new bounds on the generalization error of
combined classifiers", NIPS 2000, pp. 245-251.
 A. Murua, "Upper bounds for error rates associated to linear combination
of classifiers", IEEE PAMI, May 2002.
 E. Bauer and R. Kohavi, "An empirical comparison of voting classification
algorithms: bagging, boosting, and variants", Machine Learning, 36:105-142, 1999.

On Bagging, Arcing, and Random Forests
 Y. Amit and A. Murua, "Speech recognition using randomized relational decision trees",
IEEE Trans. Speech and Audio Processing, 9, May 2001.
 Y. Amit and D. Geman, "Shape quantization and recognition with randomized trees",
Neural Computation, 9, pp. 1545-1588, 1997.
 L. Breiman, "Bagging predictors", Machine Learning, 24(2):123-140, 1996.
 L. Breiman, "RF/TOOLS: A Class of Two-eyed Algorithms",
SIAM Workshop, May 2003, Statistics Department, UCB.
 L. Breiman, "Random Forests", 2001, Statistics Department, UCB.
 L. Breiman, "Prediction games and arcing algorithms",
Statistics Department, UCB.
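For contrast with boosting's sequential reweighting, Breiman's bagging (cited above) trains each base learner on an independent bootstrap resample and aggregates by majority vote. A minimal sketch, where the nearest-centroid base learner is just a stand-in for the decision trees used in the paper:

```python
import numpy as np

def nearest_centroid(Xtr, ytr, Xte):
    """Toy base learner (stand-in for a tree): label each test point by the
    nearer class centroid; labels are in {-1, +1}. Falls back to the only
    class present if a bootstrap sample happens to miss one class."""
    if not (ytr == 1).any():
        return -np.ones(len(Xte), dtype=int)
    if not (ytr == -1).any():
        return np.ones(len(Xte), dtype=int)
    mu_pos = Xtr[ytr == 1].mean(axis=0)
    mu_neg = Xtr[ytr == -1].mean(axis=0)
    d_pos = ((Xte - mu_pos) ** 2).sum(axis=1)
    d_neg = ((Xte - mu_neg) ** 2).sum(axis=1)
    return np.where(d_pos < d_neg, 1, -1)

def bagging_predict(Xtr, ytr, Xte, base_learner, n_bags=25, seed=0):
    """Bagging (Breiman, 1996): fit the base learner on n_bags bootstrap
    resamples of the training set and aggregate by majority vote."""
    rng = np.random.default_rng(seed)
    n = len(ytr)
    votes = np.zeros(len(Xte))
    for _ in range(n_bags):
        idx = rng.integers(0, n, size=n)   # sample n indices with replacement
        votes += base_learner(Xtr[idx], ytr[idx], Xte)
    return np.sign(votes)                  # majority vote (keep n_bags odd)
```

Bagging mainly reduces variance, which is why Breiman pairs it with high-variance learners such as unpruned trees rather than the stable toy classifier used here.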
 Dimension reduction
 EM Algorithms for PCA and SPCA. Sam Roweis.
 Experiments with random projection, Sanjoy Dasgupta,
Uncertainty in Artificial Intelligence (UAI), 2000.
 Bayesian Multidimensional Scaling and Choice of Dimension. Oh, M.S. and Raftery, A.E., Journal of the American Statistical Association, 96:1031-1044, 2001.
 Bayesian PCA. Bishop, C. M. In M. S. Kearns, S. A. Solla, and D. A. Cohn (Eds.), Advances in Neural Information Processing Systems, Volume 11, pp. 382-388. MIT Press.
 Locally Linear Embedding (LLE) homepage. Read "An Introduction to Locally Linear Embedding" on the Publications page.
 Hessian Eigenmaps: New Locally-Linear Embedding Techniques for High-Dimensional Data. Carrie Grimes and David Donoho.
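The Dasgupta entry above studies random projection, perhaps the simplest dimension-reduction method on this list: multiply the data by a random Gaussian matrix, which approximately preserves pairwise distances (the Johnson-Lindenstrauss lemma). A minimal sketch, with the scaling convention chosen for illustration:

```python
import numpy as np

def random_projection(X, k, seed=0):
    """Project the d-dimensional rows of X down to k dimensions with a random
    Gaussian matrix scaled by 1/sqrt(k). By Johnson-Lindenstrauss, norms and
    pairwise distances are preserved up to small distortion with high
    probability once k is on the order of log(n)/eps^2."""
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((X.shape[1], k)) / np.sqrt(k)
    return X @ R
```

Unlike PCA, the projection is data-independent, which is what makes it cheap and what the cited experiments probe.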
