A Caltech Library Service

Geometric structure of graph Laplacian embeddings

García Trillos, Nicolás and Hoffmann, Franca and Hosseini, Bamdad (2021) Geometric structure of graph Laplacian embeddings. Journal of Machine Learning Research, 22 . pp. 1-55. ISSN 1533-7928.

[img] PDF - Published Version
Creative Commons Attribution.

[img] PDF - Submitted Version
See Usage Policy.


Use this Persistent URL to link to this item:


We analyze the spectral clustering procedure for identifying coarse structure in a data set x₁,…,x_n, and in particular study the geometry of graph Laplacian embeddings which form the basis for spectral clustering algorithms. More precisely, we assume that the data are sampled from a mixture model supported on a manifold M embedded in Rd, and pick a connectivity length-scale ε>0 to construct a kernelized graph Laplacian. We introduce a notion of a well-separated mixture model which only depends on the model itself, and prove that when the model is well separated, with high probability the embedded data set concentrates on cones that are centered around orthogonal vectors. Our results are meaningful in the regime where ε=ε(n) is allowed to decay to zero at a slow enough rate as the number of data points grows. This rate depends on the intrinsic dimension of the manifold on which the data is supported.

Item Type:Article
Related URLs:
URLURL TypeDescription Paper
García Trillos, Nicolás0000-0002-7711-5901
Hoffmann, Franca0000-0002-1182-5521
Additional Information:© 2021 Nicolás García Trillos, Franca Hoffmann, Bamdad Hosseini. License: CC-BY 4.0, see Submitted 8/19; Revised 11/20; Published 3/21. The authors would like to thank Ulrike von Luxburg for pointing them to the paper Schiebinger et al. (2015) which was the starting point of this work. NGT was supported by NSF grant DMS 1912802. FH was partially supported by Caltech's von Karman postdoctoral instructorship. BH is supported in part by a postdoctoral fellowship granted by Natural Sciences and Engineering Research Council of Canada.
Funding AgencyGrant Number
Natural Sciences and Engineering Research Council of Canada (NSERC)UNSPECIFIED
Subject Keywords:Unsupervised learning, spectral clustering, graph Laplacian, mixture models, continuum limit
Record Number:CaltechAUTHORS:20200331-074327697
Persistent URL:
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:102184
Deposited By: Tony Diaz
Deposited On:31 Mar 2020 16:09
Last Modified:06 Jul 2021 17:57

Repository Staff Only: item control page