Papers on machine learning

Estimating the cluster tree of a density by analyzing the minimal spanning tree of a sample.
Journal of Classification, Vol. 20, No. 5, 2003, pp. 25--47.
Abstract   PDF file
R code and a DLL implementing Runt Pruning (the clustering method described in the paper above) can be found here.

Hierarchical model-based clustering of large datasets through Fractionation and Refractionation.
Joint work with Jeremy Tantrum and Alejandro Murua. Proceedings of the 8th International Conference on Knowledge Discovery and Data Mining (KDD02), 2002, pp. 183--190.
Abstract   PDF file

Assessment and pruning of hierarchical model-based clustering.
Joint work with Jeremy Tantrum and Alejandro Murua. Proceedings of the 9th International Conference on Knowledge Discovery and Data Mining (KDD03), 2003, pp. 197--205.
Abstract   PDF file

Observations on Bagging.
Joint work with Andreas Buja. Statistica Sinica, Vol. 16, No. 2, 2006, pp. 323--352.  Abstract   PDF file

Loss Functions for Binary Class Probability Estimation and Classification: Structure and Applications
Joint work with Andreas Buja and Yi Shen. Abstract   PDF file

On Potts Model Clustering, Kernel K-means, and Density Estimation
Joint work with Alejandro Murua and Larissa Stanberry.  Abstract  PDF file

A Generalized Single Linkage Method for Estimating the Cluster Tree of a Density
Joint work with Rebecca Nugent.  Abstract   PDF file

Talks on machine learning

Unsupervised learning: Estimating the cluster tree of a density from the minimal spanning tree of a sample. PowerPoint presentation

Unsupervised learning: Statistical and computational perspectives. PowerPoint presentation

What are the effects of "Bagging"? Some experimental and theoretical results. PowerPoint presentation

Generalized single linkage clustering. PowerPoint presentation