Nov 21

1:30 pm

## Manifold Learning Using Kernel Density Estimation and Local PCA

### Kitty Mohammed

General Exam

High-dimensional datasets often have lower-dimensional structure, which frequently takes the form of a manifold. There are many algorithms (e.g., Isomap) that are used in practice to fit manifolds and thus reduce the dimensionality of a given dataset. In our work, we consider the problem of recovering a d-dimensional submanifold M of R^n when provided with noiseless samples from M. Ideally, the estimate M_hat of M should be an actual manifold. Generally speaking, existing manifold learning algorithms do not meet these criteria. Fefferman, Mitter, and Narayanan (2013) have developed an algorithm whose output is provably a manifold. Additionally, when we have noiseless data from M, M_hat is close in empirical risk to M. The key idea is to define an approximate squared-distance function (asdf) to M that can be calculated from the data. As long as this function meets certain regularity conditions, M_hat is given by the set of points where the gradient of the asdf is orthogonal to the subspace spanned by the largest n - d eigenvectors of the Hessian of the asdf. In this talk, we present work showing that kernel density estimation can be used to construct an asdf that meets the required regularity conditions. We will also briefly discuss work in which we use a collection of tangent spaces estimated with local PCA to define another asdf.