We explore and exploit the use of differential operators on manifolds - the Laplace-Beltrami operator in particular - in learning tasks. Specifically, we are interested in uncovering the geometric structure of data (unsupervised learning) and in exploiting the information contained in unlabeled data for regression and classification tasks (semi-supervised learning).
First, building on the Laplacian Eigenmaps and Diffusion Maps frameworks, we propose a new paradigm that guarantees, under reasonable assumptions, that any manifold learning algorithm will preserve the geometry of a data set. Our approach augments the output of embedding algorithms with the geometric information embodied in the Riemannian metric of the manifold. We provide an algorithm for estimating the Riemannian metric from data, consider its consistency, and demonstrate the advantages of our approach in a variety of examples.
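To make the embedding side of this concrete, the following is a minimal, illustrative sketch of a Laplacian Eigenmaps-style embedding (not the metric-estimation algorithm proposed here): a Gaussian kernel on pairwise distances yields a graph Laplacian whose low-frequency eigenvectors serve as embedding coordinates. The bandwidth value and data are hypothetical.

```python
import numpy as np

def laplacian_eigenmap(X, eps, n_components=2):
    """Illustrative Laplacian Eigenmaps embedding.

    X: (n, d) array of points; eps: Gaussian kernel bandwidth.
    Returns the first n_components non-trivial eigenvectors of the
    symmetric normalized graph Laplacian as embedding coordinates.
    """
    # Pairwise squared distances and Gaussian kernel weights
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq / eps)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    d = W.sum(axis=1)
    L = np.eye(len(X)) - (W / np.sqrt(d)[:, None]) / np.sqrt(d)[None, :]
    # eigh returns eigenvalues in ascending order; skip the trivial
    # bottom eigenvector and keep the next n_components as coordinates
    vals, vecs = np.linalg.eigh(L)
    return vecs[:, 1:1 + n_components]
```

In the framework above, such embedding coordinates would be augmented with an estimate of the Riemannian metric so that distances and angles in the original data can be recovered from the embedding.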
Second, we extend the idea of learning the geometry of the data to improve the performance of prediction tasks. From a statistical point of view, this means dealing with data that are locally collinear, but where the global relationship between the covariates is non-linear. We do this by combining the Matérn Gaussian process - a flexible and easily interpretable Bayesian non-parametric regression model - with the Laplace-Beltrami operator, which embodies all the intrinsic geometry of the manifold. This yields a principled approach to learning tasks that respects the intrinsic geometry of the data.
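As background for this construction, here is a minimal sketch of ordinary Euclidean Matérn Gaussian process regression (smoothness nu = 3/2); the intrinsic, manifold version discussed above replaces the Euclidean distance in the kernel with geometry derived from the Laplace-Beltrami operator. Lengthscale, variance, and noise values are hypothetical.

```python
import numpy as np

def matern32(X1, X2, lengthscale=1.0, variance=1.0):
    """Matérn kernel with smoothness nu = 3/2 on Euclidean distances."""
    d = np.sqrt(((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1))
    s = np.sqrt(3.0) * d / lengthscale
    return variance * (1.0 + s) * np.exp(-s)

def gp_posterior_mean(X_train, y_train, X_test, noise=1e-2):
    """Posterior mean of GP regression under the Matérn 3/2 prior."""
    K = matern32(X_train, X_train) + noise * np.eye(len(X_train))
    K_star = matern32(X_test, X_train)
    # Posterior mean: K_* (K + sigma^2 I)^{-1} y
    return K_star @ np.linalg.solve(K, y_train)
```

The appeal of the Matérn family here is that its smoothness and lengthscale parameters are directly interpretable, which carries over to the intrinsic setting.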
Finally, we turn to the problem of setting the hyperparameters used to construct the graph Laplacian, the most widely used non-parametric estimator of the Laplace-Beltrami operator. Specifically, we study the problem of setting epsilon, the kernel bandwidth, when constructing the graph Laplacian for Euclidean data - a parameter that, according to our results, has a material impact in both the unsupervised and semi-supervised learning contexts. We exploit the connection between manifold geometry and the Laplace-Beltrami operator to obtain the hyperparameters for which the graph Laplacian best encodes the geometry of the data.
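The role of epsilon can be illustrated with a short sketch (not the selection method developed here): the random-walk graph Laplacian is constructed for several candidate bandwidths, and its first non-trivial eigenvalue is tracked to show how strongly the estimator depends on the choice. The data and bandwidth grid are hypothetical.

```python
import numpy as np

def graph_laplacian(X, eps):
    """Random-walk graph Laplacian, L = I - D^{-1} W, for a
    Gaussian kernel with bandwidth eps on points X of shape (n, d)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / eps)
    return np.eye(len(X)) - W / W.sum(axis=1, keepdims=True)

# Sweep a grid of bandwidths: too small a value disconnects the graph,
# too large a value blurs out the local geometry of the data.
X = np.random.RandomState(1).randn(40, 2)
for eps in [0.01, 0.1, 1.0, 10.0]:
    L = graph_laplacian(X, eps)
    lam1 = np.sort(np.linalg.eigvals(L).real)[1]  # first non-trivial eigenvalue
    print(f"eps={eps:5.2f}  lambda_1={lam1:.4f}")
```

The criterion proposed above goes further than such a sweep: it scores each candidate epsilon by how well the resulting graph Laplacian recovers the geometry of the data.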