A Caltech Library Service

Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity

Schäfer, Florian and Sullivan, T. J. and Owhadi, Houman (2021) Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity. Multiscale Modeling and Simulation, 19 (2). pp. 688-730. ISSN 1540-3459. doi:10.1137/19M129526X.

[img] PDF - Published Version
See Usage Policy.

[img] PDF - Submitted Version
See Usage Policy.


Use this Persistent URL to link to this item:


Dense kernel matrices Θ∈R^(N×N) obtained from point evaluations of a covariance function G at locations {x_i}_(1≤i≤N)⊂ ℝ^d arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions of elliptic boundary value problems and homogeneously distributed sampling points, we show how to identify a subset S⊂{1,…,N}², with #S=O(N log(N)log^d(N/ϵ)), such that the zero fill-in incomplete Cholesky factorization of the sparse matrix Θ_(i,j)1_((i,j)∈S) is an ϵ-approximation of Θ. This factorization can provably be obtained in complexity O(N log(N) log^d(N/ϵ)) in space and O(N log²(N) log^(2d))(N/ϵ)) in time, improving upon the state of the art for general elliptic operators; we further present numerical evidence that d can be taken to be the intrinsic dimension of the data set rather than that of the ambient space. The algorithm only needs to know the spatial configuration of the x_i and does not require an analytic representation of G. Furthermore, this factorization straightforwardly provides an approximate sparse PCA with optimal rate of convergence in the operator norm. Hence, by using only subsampling and the incomplete Cholesky factorization, we obtain, at nearly linear complexity, the compression, inversion, and approximate PCA of a large class of covariance matrices. By inverting the order of the Cholesky factorization we also obtain a solver for elliptic PDE with complexity O (N log^d(N/ϵ)) in space and O (N log^(2d)(N/ϵ)) in time, improving upon the state of the art for general elliptic operators.

Item Type:Article
Related URLs:
URLURL TypeDescription Paper
Owhadi, Houman0000-0002-5677-1600
Additional Information:© 2021 Florian Schäfer, T. J. Sullivan, Houman Owhadi. Received by the editors October 24, 2019; accepted for publication (in revised form) October 9, 2020; published electronically April 15, 2021. The first and third authors gratefully acknowledge support by the Air Force Office of Scientific Research and the DARPA EQUiPS Program (award FA9550-16-1-0054, Computational Information Games) and the Air Force Office of Scientific Research (award FA9550-18-1-0271, Games for Computation and Learning). The second author has been supported by the Freie Universitat Berlin within the Excellence Initiative of the German Research Foundation. This collaboration has been facilitated by the Statistical and Applied Mathematical Sciences Institute through the National Science Foundation award DMS-1127914. We would like to thank C. Oates and P. Schröder for helpful discussions, and C. Scovel for many helpful comments and suggestions on an earlier version of this paper (June 2017, arXiv:1706.02205). We would like to thank the two anonymous referees for their constructive feedback, which helped to improve the paper.
Funding AgencyGrant Number
Air Force Office of Scientific Research (AFOSR)FA9550-16-1-0054
Air Force Office of Scientific Research (AFOSR)FA9550-18-1-0271
Deutsche Forschungsgemeinschaft (DFG)UNSPECIFIED
Issue or Number:2
Classification Code:AMS subject classifications: 65F30, 42C40, 65F50, 65N55, 65N75, 60G42, 68Q25, 68W40
Record Number:CaltechAUTHORS:20170710-083743909
Persistent URL:
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:78882
Deposited By: Ruth Sustaita
Deposited On:10 Jul 2017 16:52
Last Modified:20 Aug 2021 17:13

Repository Staff Only: item control page