CaltechAUTHORS
  A Caltech Library Service

Denumerable-Armed Bandits

Banks, Jeffrey S. and Sundaram, Rangarajan K. (1992) Denumerable-Armed Bandits. Econometrica, 60 (5). pp. 1071-1096. ISSN 1468-0262. doi:10.2307/2951539. https://resolver.caltech.edu/CaltechAUTHORS:20160525-072126405

[img] PDF - Published Version
See Usage Policy.

34MB

Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20160525-072126405

Abstract

This paper studies the class of denumerable-armed (i.e. finite- or countably infinitearmed) bandit problems with independent arms and geometric discounting over an infinite horizon, in which each arm generates rewards according to one of a finite number of distributions, or "types." The number of types in the support of an arm, as also the types themselves, are allowed to vary across the arms. We derive certain continuity and curvature properties of the dynamic allocation (or Gittins) index of Gittins and Jones (1974), and provide necessary and sufficient conditions under which the Gittins-Jones result identifying all optimal strategies for finite-armed bandits may be extended to infinite-armed bandits. We then establish our central result: at each point in time, the arm selected by an optimal strategy will, with strictly positive probability, remain an optimal selection forever. More specifically, for every such arm, there exists (at least) one type of that arm such that, when conditioned on that type being the arm's "true" type, the arm will survive forever and continuously with nonzero probability. When the reward distributions of an arm satisfy the monotone likelihood ratio property (MLRP), the survival prospects of an arm improve when conditioned on types generating higher expected rewards; however, we show how this need not be the case in the absence of MLRP. Implications of these results are derived for the theories of job search and matching, as well as other applications of the bandit paradigm.


Item Type:Article
Related URLs:
URLURL TypeDescription
http://dx.doi.org/10.2307/2951539DOIArticle
http://www.jstor.org/stable/2951539JSTORArticle
Additional Information:© 1992 The Econometric Society. This paper has benefited immeasurably from the detailed comments and suggestions offered by Andy McLennan as a (then anonymous) referee. We are also very grateful to Martin Hellwig and two other anonymous referees for their careful reading of earlier drafts and their suggestions; as also to seminar audiences at Buffalo, Caltech, Carnegie-Mellon, Columbia, George Mason, Harvard, Hoover Institution, McMaster, Michigan, Johns Hopkins, Rochester, Toronto, UCLA, UC-Riverside, Washington-St. Louis, Western Ontario, and Yale, for their input. In particular, we would like to thank Prajit Dutta, Mahmoud El-Gamal, Nicholas Kiefer, David Levine, and Bill Zame. The first author gratefully acknowledges financial support provided by the Sloan Foundation and the NSF. The final draft of this paper was completed when the second author was visiting the California Institute of Technology, and he would like to thank them for their hospitality. With transparent modifications, all of our results remain valid if, instead of reward densities, we had discrete reward distributions (i.e., those with finite or countable support).
Funders:
Funding AgencyGrant Number
Alfred P. Sloan FoundationUNSPECIFIED
NSFUNSPECIFIED
Subject Keywords:Bandits, Gittins index, survival, monotone likelihood ratio property, stationary bandits, job search
Issue or Number:5
DOI:10.2307/2951539
Record Number:CaltechAUTHORS:20160525-072126405
Persistent URL:https://resolver.caltech.edu/CaltechAUTHORS:20160525-072126405
Official Citation:Banks, Jeffrey S., and Sundaram Rangarajan K. "Denumerable-Armed Bandits." Econometrica 60, no. 5 (1992): 1071-096.
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:67326
Collection:CaltechAUTHORS
Deposited By: Ruth Sustaita
Deposited On:26 May 2016 20:58
Last Modified:11 Nov 2021 00:30

Repository Staff Only: item control page