CaltechAUTHORS
  A Caltech Library Service

On Thompson Sampling with Langevin Algorithms

Mazumdar, Eric and Pacchiano, Aldo and Ma, Yi-an and Bartlett, Peter L. and Jordan, Michael I. (2020) On Thompson Sampling with Langevin Algorithms. . (Unpublished) https://resolver.caltech.edu/CaltechAUTHORS:20210903-220351518

[img] PDF - Submitted Version
See Usage Policy.

993kB

Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20210903-220351518

Abstract

Thompson sampling for multi-armed bandit problems is known to enjoy favorable performance in both theory and practice. However, it suffers from a significant limitation computationally, arising from the need for samples from posterior distributions at every iteration. We propose two Markov Chain Monte Carlo (MCMC) methods tailored to Thompson sampling to address this issue. We construct quickly converging Langevin algorithms to generate approximate samples that have accuracy guarantees, and we leverage novel posterior concentration rates to analyze the regret of the resulting approximate Thompson sampling algorithm. Further, we specify the necessary hyperparameters for the MCMC procedure to guarantee optimal instance-dependent frequentist regret while having low computational complexity. In particular, our algorithms take advantage of both posterior concentration and a sample reuse mechanism to ensure that only a constant number of iterations and a constant amount of data is needed in each round. The resulting approximate Thompson sampling algorithm has logarithmic regret and its computational complexity does not scale with the time horizon of the algorithm.


Item Type:Report or Paper (Discussion Paper)
Related URLs:
URLURL TypeDescription
http://arxiv.org/abs/2002.10002arXivDiscussion Paper
ORCID:
AuthorORCID
Mazumdar, Eric0000-0002-1815-269X
Ma, Yi-an0000-0001-6074-6638
Bartlett, Peter L.0000-0002-8760-3140
Jordan, Michael I.0000-0001-8935-817X
Record Number:CaltechAUTHORS:20210903-220351518
Persistent URL:https://resolver.caltech.edu/CaltechAUTHORS:20210903-220351518
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:110727
Collection:CaltechAUTHORS
Deposited By: George Porter
Deposited On:07 Sep 2021 15:55
Last Modified:07 Sep 2021 19:41

Repository Staff Only: item control page