Graph Clustering With Missing Data: Convex Algorithms and Analysis
Abstract
We consider the problem of finding clusters in an unweighted graph, when the graph is partially observed. We analyze two programs, one which works for dense graphs and one which works for both sparse and dense graphs, but requires some a priori knowledge of the total cluster size, that are based on the convex optimization approach for low-rank matrix recovery using nuclear norm minimization. For the commonly used Stochastic Block Model, we obtain explicit bounds on the parameters of the problem (size and sparsity of clusters, the amount of observed data) and the regularization parameter characterize the success and failure of the programs. We corroborate our theoretical findings through extensive simulations. We also run our algorithm on a real data set obtained from crowdsourcing an image classification task on the Amazon Mechanical Turk, and observe significant performance improvement over traditional methods such as k-means.
Additional Information
© 2014 Neural Information Processing Systems. The authors thank the anonymous reviewers for their insightful comments. This work was supported in part by the National Science Foundation under grants CCF-0729203, CNS-0932428 and CIF-1018927, by the Office of Naval Research under the MURI grant N00014-08-1-0747, and by a grant from Qualcomm Inc. The first author is also supported by the Schlumberger Foundation Faculty for the Future Program Grant.Attached Files
Published - 5309-graph-clustering-with-missing-data-convex-algorithms-and-analysis.pdf
Files
Name | Size | Download all |
---|---|---|
md5:e292668bb6b81a0f9b5ac3ae0399b90c
|
517.8 kB | Preview Download |
Additional details
- Eprint ID
- 65864
- Resolver ID
- CaltechAUTHORS:20160401-170949281
- NSF
- CCF-0729203
- NSF
- CNS-0932428
- NSF
- CIF-1018927
- Office of Naval Research (ONR)
- N00014-08-1-0747
- Qualcomm Inc.
- Schlumberger Foundation
- Created
-
2016-04-04Created from EPrint's datestamp field
- Updated
-
2019-10-03Created from EPrint's last_modified field
- Series Name
- Advances in Neural Information Processing Systems
- Series Volume or Issue Number
- 27