CaltechAUTHORS
A Caltech Library Service

A Study of Generalization of Stochastic Mirror Descent Algorithms on Overparameterized Nonlinear Models

Azizan, Navid and Lale, Sahin and Hassibi, Babak (2020) A Study of Generalization of Stochastic Mirror Descent Algorithms on Overparameterized Nonlinear Models. In: 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, Piscataway, NJ, pp. 3132-3136. ISBN 9781509066315. https://resolver.caltech.edu/CaltechAUTHORS:20200417-131039768

Full text is not posted in this repository. Consult Related URLs below.

Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20200417-131039768

Abstract

We study the convergence, implicit regularization, and generalization of stochastic mirror descent (SMD) algorithms on overparameterized nonlinear models, where the number of model parameters exceeds the number of training data points. Due to overparameterization, the training loss has infinitely many global minima, which define a manifold of interpolating solutions. To understand the generalization performance of SMD algorithms, it is important to characterize which of these global minima they converge to. In this work, we first show theoretically that in the overparameterized nonlinear setting, if the initialization is close enough to the manifold of global minima (which is usually the case under high overparameterization) and the step size is sufficiently small, SMD converges to a global minimum. We further prove that this global minimum is approximately the closest one to the initialization in Bregman divergence, demonstrating the approximate implicit regularization of SMD. We then empirically confirm that these theoretical results hold in practice. Finally, we provide an extensive study of the generalization of SMD algorithms. Our experiments show that, on the CIFAR-10 dataset, SMD with an ℓ₁₀-norm potential (as a surrogate for ℓ∞) consistently generalizes better than SGD (which corresponds to an ℓ₂-norm potential), which in turn consistently outperforms SMD with an ℓ₁-norm potential.
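As a reference for the quantities the abstract mentions, here is a minimal sketch of the SMD update and the Bregman divergence, written in standard mirror-descent notation (a strongly convex potential ψ, step size η, and per-sample loss L_i; these symbols are our choice, not taken from the paper):

\[
  \nabla\psi(w_i) = \nabla\psi(w_{i-1}) - \eta \, \nabla L_i(w_{i-1}),
  \qquad
  D_\psi(w, w') = \psi(w) - \psi(w') - \nabla\psi(w')^\top (w - w').
\]

For the q-norm potential \(\psi(w) = \tfrac{1}{q}\lVert w \rVert_q^q\), the mirror map acts entrywise as \([\nabla\psi(w)]_j = \operatorname{sign}(w_j)\,\lvert w_j\rvert^{q-1}\); with \(q = 2\) the mirror map is the identity and SMD reduces to SGD. Because the ℓ∞ norm is not differentiable, a large finite q (q = 10 in the experiments above) serves as a smooth surrogate, since ℓ_q norms approach ℓ∞ as q → ∞.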


Item Type: Book Section
Related URLs:
  Article (DOI): https://doi.org/10.1109/icassp40776.2020.9053864
ORCID:
  Azizan, Navid: 0000-0002-4299-2963
  Lale, Sahin: 0000-0002-7191-346X
  Hassibi, Babak: 0000-0002-1375-5838
Additional Information: © 2020 IEEE.
Subject Keywords: Stochastic mirror descent, nonlinear models, convergence, implicit regularization, generalization
DOI: 10.1109/icassp40776.2020.9053864
Record Number: CaltechAUTHORS:20200417-131039768
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20200417-131039768
Official Citation: N. Azizan, S. Lale and B. Hassibi, "A Study of Generalization of Stochastic Mirror Descent Algorithms on Overparameterized Nonlinear Models," ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020, pp. 3132-3136.
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 102604
Collection: CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On: 17 Apr 2020 20:16
Last Modified: 23 Dec 2022 00:50
