Azizan, Navid and Lale, Sahin and Hassibi, Babak (2020) A Study of Generalization of Stochastic Mirror Descent Algorithms on Overparameterized Nonlinear Models. In: 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE , Piscataway, NJ, pp. 3132-3136. ISBN 9781509066315. https://resolver.caltech.edu/CaltechAUTHORS:20200417-131039768
Full text is not posted in this repository. Consult Related URLs below.
Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20200417-131039768
Abstract
We study the convergence, the implicit regularization, and the generalization of stochastic mirror descent (SMD) algorithms in overparameterized nonlinear models, where the number of model parameters exceeds the number of training data points. Due to overparameterization, the training loss has infinitely many global minima, which define a manifold of interpolating solutions. To understand the generalization performance of SMD algorithms, it is important to characterize which global minima they converge to. In this work, we first theoretically show that, in the overparameterized nonlinear setting, if the initialization is close enough to the manifold of global minima — which is usually the case under high overparameterization — then SMD with a sufficiently small step size converges to a global minimum. We further prove that this global minimum is approximately the closest one to the initialization in Bregman divergence, demonstrating the approximate implicit regularization of SMD. We then empirically confirm that these theoretical results hold in practice. Finally, we provide an extensive study of the generalization of SMD algorithms. In our experiments on the CIFAR-10 dataset, SMD with an ℓ₁₀ norm potential (as a surrogate for ℓ∞) consistently generalizes better than SGD (corresponding to an ℓ₂ norm potential), which in turn consistently outperforms SMD with an ℓ₁ norm potential.
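The SMD update for a q-norm potential ψ(w) = ‖w‖_q^q / q takes a gradient step in the dual (mirror) space and maps back via the inverse mirror map. The following is a minimal NumPy sketch on a toy overparameterized *linear* regression problem — not the paper's nonlinear models — and the choice q = 3 here is just to keep the toy dynamics well-conditioned; the paper's experiments use ℓ₁, ℓ₂, and ℓ₁₀ potentials.

```python
import numpy as np

def smd_step(z, grad, lr, q):
    """One stochastic mirror descent step with potential
    psi(w) = ||w||_q^q / q. The mirror map is grad psi(w) =
    sign(w)|w|^(q-1); we update in the dual space and invert it."""
    z = z - lr * grad                               # gradient step in dual space
    w = np.sign(z) * np.abs(z) ** (1.0 / (q - 1))   # inverse mirror map
    return z, w

# Toy setup (illustrative, not from the paper): overparameterized
# linear regression with d > n, so infinitely many interpolating solutions.
rng = np.random.default_rng(0)
n, d = 10, 50
X = rng.standard_normal((n, d)) / np.sqrt(d)
y = rng.standard_normal(n)

q, lr = 3, 0.01
w = np.full(d, 0.1)                       # initialization near the origin
z = np.sign(w) * np.abs(w) ** (q - 1)     # dual variable consistent with w

init_loss = np.mean((X @ w - y) ** 2)
for _ in range(300):
    for i in rng.permutation(n):          # one data point per step (stochastic)
        grad = (X[i] @ w - y[i]) * X[i]   # gradient of the per-sample squared loss
        z, w = smd_step(z, grad, lr, q)
final_loss = np.mean((X @ w - y) ** 2)
print(init_loss, final_loss)              # loss shrinks toward interpolation
```

With q = 2 the mirror map is the identity and the loop reduces to plain SGD, matching the paper's observation that SGD is the special case of SMD with an ℓ₂ norm potential.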
| Item Type: | Book Section |
|---|---|
| Related URLs: | |
| ORCID: | |
| Additional Information: | © 2020 IEEE. |
| Subject Keywords: | Stochastic mirror descent, nonlinear models, convergence, implicit regularization, generalization |
| DOI: | 10.1109/icassp40776.2020.9053864 |
| Record Number: | CaltechAUTHORS:20200417-131039768 |
| Persistent URL: | https://resolver.caltech.edu/CaltechAUTHORS:20200417-131039768 |
| Official Citation: | N. Azizan, S. Lale and B. Hassibi, "A Study of Generalization of Stochastic Mirror Descent Algorithms on Overparameterized Nonlinear Models," ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020, pp. 3132-3136 |
| Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided. |
| ID Code: | 102604 |
| Collection: | CaltechAUTHORS |
| Deposited By: | Tony Diaz |
| Deposited On: | 17 Apr 2020 20:16 |
| Last Modified: | 23 Dec 2022 00:50 |