Gao, Jie and Langberg, Michael and Schulman, Leonard J. (2008) Analysis of Incomplete Data and an IntrinsicDimension Helly Theorem. Discrete and Computational Geometry, 40 (4). pp. 537560. ISSN 01795376. http://resolver.caltech.edu/CaltechAUTHORS:GAOdcg08

PDF
 Submitted Version
See Usage Policy. 174Kb 
Use this Persistent URL to link to this item: http://resolver.caltech.edu/CaltechAUTHORS:GAOdcg08
Abstract
The analysis of incomplete data is a longstanding challenge in practical statistics. When, as is typical, data objects are represented by points in R^d , incomplete data objects correspond to affine subspaces (lines or Δflats).With this motivation we study the problem of finding the minimum intersection radius r(L) of a set of lines or Δflats L: the least r such that there is a ball of radius r intersecting every flat in L. Known algorithms for finding the minimum enclosing ball for a point set (or clustering by several balls) do not easily extend to higher dimensional flats, primarily because “distances” between flats do not satisfy the triangle inequality. In this paper we show how to restore geometry (i.e., a substitute for the triangle inequality) to the problem, through a new analog of Helly’s theorem. This “intrinsicdimension” Helly theorem states: for any family L of Δdimensional convex sets in a Hilbert space, there exist Δ + 2 sets L' ⊆ L such that r(L) ≤ 2r(L'). Based upon this we present an algorithm that computes a (1+ε)core set L' ⊆ L, L' = O(Δ^4/ε), such that the ball centered at a point c with radius (1 +ε)r(L') intersects every element of L. The running time of the algorithm is O(n^(Δ+1)dpoly(Δ/ε)). For the case of lines or line segments (Δ = 1), the (expected) running time of the algorithm can be improved to O(ndpoly(1/ε)).We note that the size of the core set depends only on the dimension of the input objects and is independent of the input size n and the dimension d of the ambient space.
Item Type:  Article  

Related URLs: 
 
ORCID: 
 
Additional Information:  © 2008 Springer Science. Received: 12 April 2007/Revised: 12 June 2008/Accepted: 7 July 2008/Published online: 23 September 2008. Work was done when J. Gao was with Center for the Mathematics of Information, California Institute of Technology. Work was done when M. Langberg was a postdoctoral scholar at the California Institute of Technology. Research supported in part by NSF grant CCF0346991. Research of L.J. Schulman supported in part by an NSF ITR and the Okawa Foundation.  
Funders: 
 
Subject Keywords:  Clustering; kcenter; Core set; Incomplete data; Helly theorem; Approximation; Inference.  
Issue or Number:  4  
Record Number:  CaltechAUTHORS:GAOdcg08  
Persistent URL:  http://resolver.caltech.edu/CaltechAUTHORS:GAOdcg08  
Official Citation:  Gao, J., Langberg, M. & Schulman, L.J. Discrete Comput Geom (2008) 40: 537. https://doi.org/10.1007/s0045400891075  
Usage Policy:  No commercial reproduction, distribution, display or performance rights in this work are provided.  
ID Code:  13555  
Collection:  CaltechAUTHORS  
Deposited By:  Ruth Sustaita  
Deposited On:  17 Jun 2009 19:03  
Last Modified:  06 Sep 2019 22:21 
Repository Staff Only: item control page