University of New Mexico - Department of Mathematics & Statistics
We will consider a new class of probabilistic upper bounds on the generalization error of complex classifiers that are "combinations" of simpler classifiers from a base class of functions. Such combinations can be implemented by neural networks or by voting methods of combining the classifiers, such as boosting and bagging. The resulting combined classifiers often have a large classification margin. In such cases, the bounds on the generalization error are expressed in terms of the empirical distribution of the margins of the combined classifier. These bounds originated in the computer science literature, but they are essentially based on the methods of the theory of Gaussian and empirical processes (comparison inequalities, entropy bounds, symmetrization, and concentration inequalities). They provide a partial explanation of the effectiveness of AdaBoost and other large margin classification techniques.
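To make the quantities in the abstract concrete, the following minimal sketch computes the empirical margin distribution of a small voting classifier. Everything here is hypothetical (the data, the decision stumps, and the weights are invented for illustration, not taken from the talk): the combined classifier is a convex combination f(x) = Σ_j α_j h_j(x) with Σ_j α_j = 1, the margin of an example (x, y) with y ∈ {-1, +1} is y·f(x), and margin-type bounds control the generalization error by the empirical fraction of examples whose margin is at most some δ > 0, plus a complexity term that decreases as δ and the sample size grow.

```python
import numpy as np

# Hypothetical setup: synthetic data and three decision stumps,
# combined by a convex combination (weights sum to 1), as in voting
# methods such as boosting and bagging.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=200))

def stump(X, feature, threshold):
    """Base classifier: +1 if the feature exceeds the threshold, else -1."""
    return np.where(X[:, feature] > threshold, 1.0, -1.0)

preds = np.stack([stump(X, 0, 0.0), stump(X, 1, 0.0), stump(X, 2, 0.0)])
alphas = np.array([0.5, 0.3, 0.2])   # convex weights: nonnegative, sum to 1
f = alphas @ preds                   # combined classifier, values in [-1, 1]

margins = y * f                      # classification margins y * f(x)

# Empirical margin distribution at level delta: the fraction of training
# examples with margin <= delta. Margin bounds state, roughly, that with
# high probability the generalization error is at most this fraction plus
# a complexity term shrinking in delta and the sample size.
delta = 0.2
frac_small_margin = np.mean(margins <= delta)
print(f"fraction with margin <= {delta}: {frac_small_margin:.3f}")
```

A well-trained large margin classifier (e.g. the output of AdaBoost) tends to push this fraction down even for moderately large δ, which is what makes the margin-based bounds informative.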