CaltechAUTHORS
  A Caltech Library Service

The Effectiveness of Lloyd-Type Methods for the k-Means Problem

Ostrovsky, Rafail and Rabani, Yuval and Schulman, Leonard J. and Swamy, Chaitanya (2006) The Effectiveness of Lloyd-Type Methods for the k-Means Problem. In: 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06). IEEE, Piscataway, NJ, pp. 165-176. ISBN 0-7695-2720-5. https://resolver.caltech.edu/CaltechAUTHORS:20170511-131811663



Abstract

We investigate variants of Lloyd's heuristic for clustering high dimensional data in an attempt to explain its popularity (a half century after its introduction) among practitioners, and in order to suggest improvements in its application. We propose and justify a clusterability criterion for data sets. We present variants of Lloyd's heuristic that quickly lead to provably near-optimal clustering solutions when applied to well-clusterable instances. This is the first performance guarantee for a variant of Lloyd's heuristic. The provision of a guarantee on output quality does not come at the expense of speed: some of our algorithms are candidates for being faster in practice than currently used variants of Lloyd's method. In addition, our other algorithms are faster on well-clusterable instances than recently proposed approximation algorithms, while maintaining similar guarantees on clustering quality. Our main algorithmic contribution is a novel probabilistic seeding process for the starting configuration of a Lloyd-type iteration.
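The abstract names two ingredients, a probabilistic seeding step and a Lloyd-type iteration, without giving their details. The following is a minimal sketch of how such a pipeline might be assembled, not the authors' algorithm. It assumes a seeding rule in which each new center is sampled with probability proportional to its squared distance from the nearest center already chosen; that rule, and the helper names `sq_dist`, `kmeans_cost`, `seed_centers`, and `lloyd`, are illustrative assumptions rather than definitions taken from the paper.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_cost(points, centers):
    """k-means objective: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def seed_centers(points, k, rng):
    """Pick k starting centers at random (assumed rule, see lead-in).

    The first center is uniform; each later one is drawn with probability
    proportional to its squared distance from the nearest chosen center.
    """
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(weights) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def lloyd(points, centers, max_iters=100):
    """Standard Lloyd iteration: assign to nearest center, then recompute means."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iters):
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[j].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(xs) / len(cl) for xs in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


if __name__ == "__main__":
    rng = random.Random(0)
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]
    start = seed_centers(data, 2, rng)
    final = lloyd(data, start)
    print("seed cost:", kmeans_cost(data, start))
    print("final cost:", kmeans_cost(data, final))
```

Separating seeding from iteration mirrors the abstract's framing: the iteration itself is unchanged, and only the starting configuration is randomized. Each Lloyd step cannot increase the k-means cost, so the final cost is never worse than the cost of the seed.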


Item Type: Book Section
Related URLs:
URL | URL Type | Description
https://doi.org/10.1109/FOCS.2006.75 | DOI | Paper
http://ieeexplore.ieee.org/document/4031353/ | Publisher | Paper
ORCID:
Author | ORCID
Schulman, Leonard J. | 0000-0001-9901-2797
Additional Information: © 2006 IEEE. Supported in part by IBM Faculty Award, Xerox Innovation Group Award, a gift from Teradata, Intel equipment grant, and NSF Cybertrust grant no. 0430254. Supported in part by ISF 52/03, BSF 2002282, and the Fund for the Promotion of Research at the Technion. Supported in part by NSF CCF-0515342, NSA H98230-06-1-0074, and NSF ITR CCR-0326554.
Funders:
Funding Agency | Grant Number
IBM | UNSPECIFIED
Xerox | UNSPECIFIED
Teradata | UNSPECIFIED
Intel | UNSPECIFIED
NSF | CNS-0430254
Israel Science Foundation | 52/03
Binational Science Foundation (USA-Israel) | 2002282
Technion | UNSPECIFIED
NSF | CCF-0515342
National Security Agency | H98230-06-1-0074
NSF | CCR-0326554
DOI: 10.1109/FOCS.2006.75
Record Number: CaltechAUTHORS:20170511-131811663
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20170511-131811663
Official Citation: R. Ostrovsky, Y. Rabani, L. J. Schulman and C. Swamy, "The Effectiveness of Lloyd-Type Methods for the k-Means Problem," 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06), Berkeley, CA, 2006, pp. 165-176. doi: 10.1109/FOCS.2006.75
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 77376
Collection: CaltechAUTHORS
Deposited By: INVALID USER
Deposited On: 12 May 2017 23:43
Last Modified: 15 Nov 2021 17:30
