CaltechAUTHORS
  A Caltech Library Service

Stochastic Mirror Descent in Average Ensemble Models

Kargin, Taylan and Salehi, Fariborz and Hassibi, Babak (2022) Stochastic Mirror Descent in Average Ensemble Models. (Unpublished) https://resolver.caltech.edu/CaltechAUTHORS:20221222-234253993

PDF - Submitted Version (1MB)
Creative Commons Attribution.

Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20221222-234253993

Abstract

The stochastic mirror descent (SMD) algorithm is a general class of training algorithms that includes the celebrated stochastic gradient descent (SGD) as a special case. It uses a mirror potential to influence the implicit bias of the training algorithm. In this paper, we explore the performance of the SMD iterates on mean-field ensemble models. Our results generalize earlier ones obtained for SGD on such models. The evolution of the distribution of parameters is mapped to a continuous-time process in the space of probability distributions. Our main result gives a nonlinear partial differential equation to which the continuous-time process converges in the asymptotic regime of large networks. The impact of the mirror potential appears through a multiplicative term that is equal to the inverse of its Hessian and that can be interpreted as defining a gradient flow over an appropriately defined Riemannian manifold. We provide numerical simulations that allow us to study and characterize the effect of the mirror potential on the performance of networks trained with SMD for some binary classification problems.
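As a rough illustration of how a mirror potential enters the update, the following is a minimal sketch of one generic SMD step, not code from the paper. It assumes the q-norm potential psi(w) = (1/q) * sum(|w_i|^q) with q > 1, chosen here only because its gradient has a closed-form inverse. The function names (`mirror_map`, `inverse_mirror_map`, `smd_step`) are hypothetical. With q = 2 the mirror map is the identity and the step reduces to plain SGD.

```python
import math


def mirror_map(w, q):
    """Gradient of the assumed potential psi(w) = (1/q) * sum(|w_i|^q), elementwise."""
    return [math.copysign(abs(x) ** (q - 1), x) for x in w]


def inverse_mirror_map(z, q):
    """Inverse of mirror_map: recovers w from z = grad psi(w). Requires q > 1."""
    return [math.copysign(abs(v) ** (1.0 / (q - 1)), v) for v in z]


def smd_step(w, grad, lr, q):
    """One SMD step: grad psi(w_new) = grad psi(w) - lr * grad.

    `grad` is a stochastic gradient of the loss at w (for example, from one sample).
    """
    z = mirror_map(w, q)                            # move to the mirror (dual) domain
    z = [zi - lr * gi for zi, gi in zip(z, grad)]   # gradient step in the dual domain
    return inverse_mirror_map(z, q)                 # map back to parameter space


# With q = 2 the update coincides with an ordinary SGD step.
w_sgd = smd_step([1.0, -2.0], [0.5, 0.5], lr=0.1, q=2)

# With q = 3 the same gradient produces a different, potential-dependent step.
w_q3 = smd_step([1.0, -2.0], [0.5, 0.5], lr=0.1, q=3)
```

Changing q changes only the two mapping functions, which is one way to see why the choice of potential can steer which solution the iterates favor.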


Item Type: Report or Paper (Discussion Paper)
Related URLs:
URL | URL Type | Description
http://arxiv.org/abs/2210.15323 | arXiv | Discussion Paper
ORCID:
Author | ORCID
Kargin, Taylan | 0000-0001-6744-654X
Hassibi, Babak | 0000-0002-1375-5838
Additional Information: Attribution 4.0 International (CC BY 4.0).
DOI: 10.48550/arXiv.2210.15323
Record Number: CaltechAUTHORS:20221222-234253993
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20221222-234253993
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 118604
Collection: CaltechAUTHORS
Deposited By: George Porter
Deposited On: 23 Dec 2022 20:35
Last Modified: 02 Jun 2023 01:29
