CaltechAUTHORS
  A Caltech Library Service

Clustering affine subspaces: hardness and algorithms

Lee, Euiwoong and Schulman, Leonard J. (2013) Clustering affine subspaces: hardness and algorithms. In: Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms. SIAM , Philadelphia, PA, pp. 810-827. ISBN 978-1-61197-251-1. https://resolver.caltech.edu/CaltechAUTHORS:20161121-172931655

[img] PDF - Published Version
See Usage Policy.

500kB

Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20161121-172931655

Abstract

We study a generalization of the famous k-center problem where each object is an affine subspace of dimension Δ, and give either the first or significantly improved algorithms and hardness results for many combinations of parameters. This generalization from points (Δ = 0) is motivated by the analysis of incomplete data, a pervasive challenge in statistics: incomplete data objects in ℝd can be modeled as affine subspaces. We give three algorithmic results for different values of k, under the assumption that all subspaces are axis-parallel, the main case of interest because of the correspondence to missing entries in data tables. 1) k = 1: Two polynomial time approximation schemes which runs in poly (Δ, 1/∊)nd. 2) k = 2: O(Δ1/4)-approximation algorithm which runs in poly(n, d, Δ) 3) General k: Polynomial time approximation scheme which runs in We also prove nearly matching hardness results; in both the general (not necessarily axis-parallel) case (for k ≥ 2) and in the axis-parallel case (for k ≥ 3), the running time of an approximation algorithm with any approximation ratio cannot be polynomial in even one of k and Δ, unless P = NP. Furthermore, assuming that the 3-SAT problem cannot be solved sub-exponentially, the dependence on both k and Δ must be exponential in the general case (in the axis-parallel case, only the dependence on k drops to . The simplicity of the first and the third algorithm suggests that they might be actually used in statistical applications. The second algorithm, which demonstrates a theoretical gap between the axis-parallel and general case for k = 2, displays a strong connection between geometric clustering and classical coloring problems on graphs and hypergraphs, via a new Helly-type theorem.


Item Type:Book Section
Related URLs:
URLURL TypeDescription
http://dx.doi.org/http://dx.doi.org/10.1137/1.9781611973105.58DOIPaper
http://epubs.siam.org/doi/abs/10.1137/1.9781611973105.58PublisherPaper
ORCID:
AuthorORCID
Schulman, Leonard J.0000-0001-9901-2797
Additional Information:© 2013 SIAM. Supported in part by the Samsung Foundation. Supported in part by the NSF.
Funders:
Funding AgencyGrant Number
Samsung FoundationUNSPECIFIED
NSFUNSPECIFIED
DOI:10.1137/1.9781611973105.58
Record Number:CaltechAUTHORS:20161121-172931655
Persistent URL:https://resolver.caltech.edu/CaltechAUTHORS:20161121-172931655
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:72226
Collection:CaltechAUTHORS
Deposited By: Kristin Buxton
Deposited On:22 Nov 2016 03:19
Last Modified:11 Nov 2021 04:58

Repository Staff Only: item control page