Georgia Institute of Technology - School of Mathematics
We consider the problem of learning a target function from its noisy observations at random points via penalized empirical risk minimization with a convex loss function and a convex complexity penalty. Depending on the loss function, this framework covers various regression problems as well as large margin classification (such as boosting and SVM). It is assumed that the target function belongs to the linear span or to the convex hull of a very large dictionary and that it has a "sparse representation" in the dictionary. The goal is to recover the function with an error that depends on the "sparsity" of the problem (rather than on the cardinality of the dictionary).
We study an approach based on ℓ1 complexity penalization in the case of the linear span and on entropy penalization in the case of the convex hull. We establish several probabilistic inequalities relating the sparsity of the empirical solution to the sparsity of the target function, and we provide bounds on the excess risk and the L2-error that depend on the sparsity of the problem.
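As an illustration of the ℓ1-penalized approach over a linear span of a dictionary, the following sketch solves the l1-penalized least-squares problem (the Lasso) with a simple iterative soft-thresholding (ISTA) scheme. This is a minimal numerical example under assumed settings (Gaussian dictionary, squared loss, a hand-picked penalty level), not the estimator or constants analyzed in the paper; the function names and parameters are illustrative.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrinks each coordinate toward zero."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=1000):
    """Minimize (1/2n)||Xw - y||^2 + lam * ||w||_1 by iterative soft thresholding."""
    n, d = X.shape
    w = np.zeros(d)
    # Step size 1/L, where L bounds the Lipschitz constant of the gradient.
    L = np.linalg.norm(X, 2) ** 2 / n
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = soft_threshold(w - grad / L, lam / L)
    return w

# Sparse target: only 3 of the 50 dictionary elements are active.
rng = np.random.default_rng(0)
n, d = 200, 50
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:3] = [2.0, -1.5, 1.0]
y = X @ w_true + 0.1 * rng.standard_normal(n)

w_hat = lasso_ista(X, y, lam=0.1)
print("nonzero coefficients:", np.count_nonzero(np.abs(w_hat) > 1e-6))
```

Because the soft-thresholding step sets small coordinates exactly to zero, the empirical solution is itself sparse, mirroring the abstract's point that the sparsity of the empirical solution tracks the sparsity of the target.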