CaltechAUTHORS
A Caltech Library Service

# Analysis of incomplete data and an intrinsic-dimension Helly theorem

Gao, Jie and Langberg, Michael and Schulman, Leonard J. (2006) Analysis of incomplete data and an intrinsic-dimension Helly theorem. In: SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm. SIAM , Philadelphia, PA, pp. 464-473. ISBN 0-89871-605-5. https://resolver.caltech.edu/CaltechAUTHORS:20161025-171347433 PDF - Published Version See Usage Policy. 248kB

Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20161025-171347433

## Abstract

The analysis of incomplete data is a long-standing challenge in practical statistics. When, as is typical, data objects are represented by points in ℝ^d, incomplete data objects correspond to affine subspaces (lines or Δ-flats). With this motivation we study the problem of finding the minimum intersection radius r(L) of a set of lines or Δ-flats L: the least r such that there is a ball of radius r intersecting every flat in L. Known algorithms for finding the minimum enclosing ball for a point set (or clustering by several balls) do not easily extend to higher-dimensional flats, primarily because "distances" between flats do not satisfy the triangle inequality. In this paper we show how to restore geometry (i.e., a substitute for the triangle inequality) to the problem, through a new analog of Helly's theorem. This "intrinsic-dimension" Helly theorem states: for any family L of Δ-dimensional convex sets in a Hilbert space, there exist Δ + 2 sets L' ⊆ L such that r(L) ≤ 2r(L'). Based upon this we present an algorithm that computes a (1 + ε)-core set L' ⊆ L,|L'| = O(Δ^4/ε^2), such that the ball centered at a point c with radius (1 + ε)r(L') intersects every element of L. The running time of the algorithm is O(nΔ+1dpoly(1/ε)). For the case of lines or line segments (Δ = 1), the (expected) running time of the algorithm can be improved to O(nd poly(1/ε)). We note that the size of the core set depends only on the dimension of the input objects and is independent of the input size n and the dimension d of the ambient space.

Item Type:Book Section
Related URLs:
URLURL TypeDescription
http://dl.acm.org/citation.cfm?id=1109609PublisherArticle
ORCID:
AuthorORCID
Langberg, Michael0000-0002-7470-0718
Schulman, Leonard J.0000-0001-9901-2797
Additional Information:© 2006 SIAM. M. Langberg is supported in part by NSF grant CCF-0346991. L. Schulman is supported in part by an NSF ITR and the Okawa Foundation.
Funders:
Funding AgencyGrant Number
NSFCCF-0346991
Okawa FoundationUNSPECIFIED
Record Number:CaltechAUTHORS:20161025-171347433
Persistent URL:https://resolver.caltech.edu/CaltechAUTHORS:20161025-171347433
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:71486
Collection:CaltechAUTHORS
Deposited By: Kristin Buxton
Deposited On:26 Oct 2016 18:50