Chen, Zaiwei and Zhang, Kaiqing and Mazumdar, Eric and Ozdaglar, Asuman and Wierman, Adam (2023) A Finite-Sample Analysis of Payoff-Based Independent Learning in Zero-Sum Stochastic Games. (Unpublished) https://resolver.caltech.edu/CaltechAUTHORS:20230316-204015123
PDF - Submitted Version. Creative Commons Attribution. 734kB
Abstract
We study two-player zero-sum stochastic games, and propose a form of independent learning dynamics called Doubly Smoothed Best-Response dynamics, which integrates a discrete and doubly smoothed variant of the best-response dynamics into temporal-difference (TD)-learning and minimax value iteration. The resulting dynamics are payoff-based, convergent, rational, and symmetric among players. Our main results provide finite-sample guarantees. In particular, we prove the first-known Õ(1/ϵ²) sample complexity bound for payoff-based independent learning dynamics, up to a smoothing bias. In the special case where the stochastic game has only one state (i.e., matrix games), we provide a sharper Õ(1/ϵ) sample complexity. Our analysis uses a novel coupled Lyapunov drift approach to capture the evolution of multiple sets of coupled and stochastic iterates, which might be of independent interest.
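For readers unfamiliar with the terminology, the following is a minimal illustrative sketch of payoff-based smoothed best-response learning in the single-state (matrix game) case. It is not the authors' algorithm as specified in the paper. The function names (`softmax`, `sample`, `run_dynamics`), the constant step sizes `alpha` and `beta`, and the temperature `tau` are all assumptions chosen for illustration. Each player sees only its own realized payoff, keeps a running estimate `q` of its action values, and moves its policy a small step toward the softmax of that estimate. The softmax in place of an exact best response and the incremental policy step are one loose reading of the two kinds of smoothing.

```python
import math
import random


def softmax(values, tau):
    """Smoothed best response: softmax of `values` with temperature `tau`."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - m) / tau) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sample(policy, rng):
    """Draw an action index from a probability vector."""
    return rng.choices(range(len(policy)), weights=policy, k=1)[0]


def run_dynamics(payoff, steps=20000, tau=0.1, alpha=0.05, beta=0.01, seed=0):
    """Run illustrative payoff-based dynamics on a zero-sum matrix game.

    `payoff[i][j]` is the row player's reward; the column player receives
    its negative. All hyperparameters are assumptions, not values from
    the paper.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Both players start from uniform policies and zero value estimates.
    pi = [[1.0 / n_rows] * n_rows, [1.0 / n_cols] * n_cols]
    q = [[0.0] * n_rows, [0.0] * n_cols]
    for _ in range(steps):
        actions = [sample(pi[0], rng), sample(pi[1], rng)]
        r = payoff[actions[0]][actions[1]]
        rewards = [r, -r]
        for i in range(2):
            # Policy step: move a fraction beta toward the softmax of q.
            target = softmax(q[i], tau)
            pi[i] = [p + beta * (t - p) for p, t in zip(pi[i], target)]
            # Value step: update only the entry of the action just played,
            # using only this player's own realized payoff.
            a = actions[i]
            q[i][a] += alpha * (rewards[i] - q[i][a])
    return pi


if __name__ == "__main__":
    matching_pennies = [[1, -1], [-1, 1]]
    print(run_dynamics(matching_pennies))
```

Neither player in this sketch observes the opponent's action or policy, which is the sense in which such dynamics are called payoff-based and independent. Extending it to multiple states would additionally require value-function updates of the kind the abstract attributes to TD-learning and minimax value iteration, which are omitted here.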
Item Type: | Report or Paper (Discussion Paper) |
---|---|
Additional Information: | Attribution 4.0 International (CC BY 4.0) |
Record Number: | CaltechAUTHORS:20230316-204015123 |
Persistent URL: | https://resolver.caltech.edu/CaltechAUTHORS:20230316-204015123 |
Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided. |
ID Code: | 120097 |
Collection: | CaltechAUTHORS |
Deposited By: | George Porter |
Deposited On: | 16 Mar 2023 22:56 |
Last Modified: | 16 Mar 2023 22:56 |