A Caltech Library Service

Data reduction for weighted and outlier-resistant clustering

Feldman, Dan and Schulman, Leonard J. (2012) Data reduction for weighted and outlier-resistant clustering. In: Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics , New York, NY, pp. 1343-1354. ISBN 9781611972108.

Full text is not posted in this repository. Consult Related URLs below.

Use this Persistent URL to link to this item:


Statistical data frequently includes outliers; these can distort the results of estimation procedures and optimization problems. For this reason, loss functions which deemphasize the effect of outliers are widely used by statisticians. However, there are relatively few algorithmic results about clustering with outliers. For instance, the k-median with outliers problem uses a loss function fc_1,...,c_k(x) which is equal to the minimum of a penalty h, and the least distance between the data point x and a center c_i. The loss-minimizing choice of {c_1,..., c_k} is an outlier-resistant clustering of the data. This problem is also a natural special case of the k-median with penalties problem considered by [Charikar, Khuller, Mount and Narasimhan SODA'01]. The essential challenge that arises in these optimization problems is data reduction for the weighted k-median problem. We solve this problem, which was previously solved only in one dimension ([Har-Peled FSTTCS'06], [Feldman, Fiat and Sharir FOCS'06]). As a corollary, we also achieve improved data reduction for the k-line-median problem.

Item Type:Book Section
Related URLs:
URLURL TypeDescription
Schulman, Leonard J.0000-0001-9901-2797
Additional Information:© 2012 SIAM. Work supported in part by the NSF.
Funding AgencyGrant Number
Record Number:CaltechAUTHORS:20120524-112316051
Persistent URL:
Official Citation:Data reduction for weighted and outlier-resistant clustering Dan Feldman, Leonard J. Schulman Pages: 1343-1354
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:31632
Deposited By: Ruth Sustaita
Deposited On:24 May 2012 20:04
Last Modified:09 Mar 2020 13:19

Repository Staff Only: item control page