
A Characterization of Stochastic Mirror Descent Algorithms and Their Convergence Properties

Azizan, Navid and Hassibi, Babak (2019) A Characterization of Stochastic Mirror Descent Algorithms and Their Convergence Properties. In: 2019 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, Piscataway, NJ, pp. 5167-5171. ISBN 978-1-5386-4658-8. https://resolver.caltech.edu/CaltechAUTHORS:20190425-082644083

Full text is not posted in this repository. Consult Related URLs below.

Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20190425-082644083

Abstract

Stochastic mirror descent (SMD) algorithms have recently garnered a great deal of attention in optimization, signal processing, and machine learning. They are similar to stochastic gradient descent (SGD), in that they perform updates along the negative gradient of an instantaneous (or stochastically chosen) loss function. However, rather than update the parameter (or weight) vector directly, they update it in a "mirrored" domain whose transformation is given by the gradient of a strictly convex differentiable potential function. SMD was originally conceived to take advantage of the underlying geometry of the problem as a way to improve the convergence rate over SGD. In this paper, we study SMD, for linear models and convex loss functions, through the lens of H∞ estimation theory and come up with a minimax interpretation of the SMD algorithm which is the counterpart of the H∞-optimality of the SGD algorithm for linear models and quadratic loss. In doing so, we identify a fundamental conservation law that SMD satisfies and use it to study the convergence properties of the algorithm. For constant step size SMD, when the linear model is over-parameterized, we give a deterministic proof of convergence for SMD and show that from any initial point, it converges to the closest point in the space of all parameter vectors that interpolate the data, where closest is in the sense of the Bregman divergence of the potential function. This property is referred to as implicit regularization: with an appropriate choice of the potential function one can guarantee convergence to the minimizer of any desired convex regularizer. For vanishing step size SMD, and in the standard stochastic optimization setting, we give a direct and elementary proof of convergence for SMD to the "true" parameter vector which avoids ergodic averaging or appealing to stochastic differential equations.
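
To make the update rule described in the abstract concrete, the following is a minimal Python sketch of SMD for a linear model with squared loss: the gradient step is taken in the mirrored domain defined by the gradient of the potential, then mapped back. The q-norm potential, step size, and synthetic over-parameterized data below are illustrative assumptions, not the paper's setup; for q = 2 the mirror map is the identity and the update reduces to plain SGD.

    import numpy as np

    # Minimal sketch of stochastic mirror descent (SMD) for a linear model with
    # squared loss, using the q-norm potential psi(w) = ||w||_q^q / q. The
    # potential, step size, and synthetic data are illustrative assumptions.

    def mirror_map(w, q):
        # grad psi(w) for psi(w) = ||w||_q^q / q (the "mirrored" coordinates)
        return np.sign(w) * np.abs(w) ** (q - 1)

    def inverse_mirror_map(z, q):
        # inverse of grad psi: recovers w from z = grad psi(w)
        return np.sign(z) * np.abs(z) ** (1.0 / (q - 1))

    def smd(X, y, q=3.0, eta=0.01, epochs=500, seed=0):
        rng = np.random.default_rng(seed)
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(epochs):
            for i in rng.permutation(n):
                # gradient of the instantaneous squared loss 0.5*(x_i^T w - y_i)^2
                grad = (X[i] @ w - y[i]) * X[i]
                # gradient step in the mirrored domain, then map back
                w = inverse_mirror_map(mirror_map(w, q) - eta * grad, q)
        return w

    # Over-parameterized toy problem (more parameters than measurements): SMD
    # converges to a parameter vector that interpolates the data.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((20, 50))
    y = X @ rng.standard_normal(50)
    w_hat = smd(X, y, q=3.0)
    print("max interpolation error:", np.max(np.abs(X @ w_hat - y)))

In this sketch, which interpolating solution SMD reaches depends on the potential: it is the one closest to the initialization in the Bregman divergence of psi, which is the implicit-regularization property discussed in the abstract.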


Item Type: Book Section
Related URLs:
    https://doi.org/10.1109/ICASSP.2019.8682271 (URL Type: DOI; Description: Article)
ORCID:
    Azizan, Navid: 0000-0002-4299-2963
Additional Information: © 2019 IEEE.
Subject Keywords: Stochastic gradient descent, mirror descent, minimax optimality, convergence, implicit regularization
Record Number: CaltechAUTHORS:20190425-082644083
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20190425-082644083
Official Citation: N. Azizan and B. Hassibi, "A Characterization of Stochastic Mirror Descent Algorithms and Their Convergence Properties," ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, United Kingdom, 2019, pp. 5167-5171. doi: 10.1109/ICASSP.2019.8682271
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 94961
Collection: CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On: 25 Apr 2019 16:00
Last Modified: 03 Oct 2019 21:09
