Gomes, Ryan and Welling, Max and Perona, Pietro (2008) Memory bounded inference in topic models. In: ICML '08 Proceedings of the 25th international conference on Machine learning. ACM, New York, NY, pp. 344-351. ISBN 978-1-60558-205-4. https://resolver.caltech.edu/CaltechAUTHORS:20161026-173114406
Abstract
What type of algorithms and statistical techniques support learning from very large datasets over long stretches of time? We address this question through a memory bounded version of a variational EM algorithm that approximates inference in a topic model. The algorithm alternates two phases: "model building" and "model compression" in order to always satisfy a given memory constraint. The model building phase expands its internal representation (the number of topics) as more data arrives through Bayesian model selection. Compression is achieved by merging data-items in clumps and only caching their sufficient statistics. Empirically, the resulting algorithm is able to handle datasets that are orders of magnitude larger than the standard batch version.
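The compression idea in the abstract (merging data items into clumps and caching only their sufficient statistics) can be illustrated with a minimal sketch. This is not the authors' algorithm: the `Clump` class, the `compress` function, and the merge criterion (L1 distance between normalized word-count vectors) are all hypothetical stand-ins chosen for illustration, and the paper's variational EM and Bayesian model selection steps are not reproduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Clump:
    """Hypothetical clump: cached sufficient statistics for a group of documents."""
    counts: List[float]  # summed word counts over the merged documents
    n_docs: int          # number of documents merged into this clump


def merge(a: Clump, b: Clump) -> Clump:
    """Merging clumps only requires adding their sufficient statistics."""
    return Clump([x + y for x, y in zip(a.counts, b.counts)], a.n_docs + b.n_docs)


def _distance(a: Clump, b: Clump) -> float:
    """L1 distance between normalized count vectors (an assumed criterion)."""
    total_a = sum(a.counts) or 1.0
    total_b = sum(b.counts) or 1.0
    return sum(abs(x / total_a - y / total_b) for x, y in zip(a.counts, b.counts))


def compress(clumps: List[Clump], max_clumps: int) -> List[Clump]:
    """Greedily merge the closest pair until at most max_clumps remain."""
    if max_clumps < 1:
        raise ValueError("max_clumps must be at least 1")
    clumps = list(clumps)
    while len(clumps) > max_clumps:
        pairs = ((i, j) for i in range(len(clumps)) for j in range(i + 1, len(clumps)))
        i, j = min(pairs, key=lambda p: _distance(clumps[p[0]], clumps[p[1]]))
        merged = merge(clumps[i], clumps[j])
        clumps = [c for k, c in enumerate(clumps) if k not in (i, j)] + [merged]
    return clumps
```

In this sketch the memory bound is expressed as a maximum number of clumps. Because each merge adds counts, the total word counts and document count are preserved while the number of stored items shrinks.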
| Item Type: | Book Section |
|---|---|
| Additional Information: | Copyright 2008 by the author(s)/owner(s). We thank the anonymous reviewers for their helpful comments. This material is based on work supported by the National Science Foundation under grant numbers 0447903 and 0535278, the Office of Naval Research under grant numbers 00014-06-1-0734 and 00014-06-1-0795, and The National Institutes of Health Predoctoral Training in Integrative Neuroscience grant number T32 GM007737. |
| DOI: | 10.1145/1390156.1390200 |
| Record Number: | CaltechAUTHORS:20161026-173114406 |
| Persistent URL: | https://resolver.caltech.edu/CaltechAUTHORS:20161026-173114406 |
| Official Citation: | Ryan Gomes, Max Welling, and Pietro Perona. 2008. Memory bounded inference in topic models. In Proceedings of the 25th international conference on Machine learning (ICML '08). ACM, New York, NY, USA, 344-351. DOI=http://dx.doi.org/10.1145/1390156.1390200 |
| Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided. |
| ID Code: | 71520 |
| Collection: | CaltechAUTHORS |
| Deposited By: | INVALID USER |
| Deposited On: | 27 Oct 2016 16:36 |
| Last Modified: | 11 Nov 2021 04:46 |