CaltechAUTHORS
  A Caltech Library Service

On the distance between two neural networks and the stability of learning

Bernstein, Jeremy and Vahdat, Arash and Yue, Yisong and Liu, Ming-Yu (2020) On the distance between two neural networks and the stability of learning. In: Advances in Neural Information Processing Systems 33 (NeurIPS 2020). Neural Information Processing Foundation, La Jolla, CA, pp. 1-12. ISBN 9781713829546. https://resolver.caltech.edu/CaltechAUTHORS:20221222-180007021

Full text is not posted in this repository. Consult Related URLs below.

Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20221222-180007021

Abstract

This paper relates parameter distance to gradient breakdown for a broad class of nonlinear compositional functions. The analysis leads to a new distance function called deep relative trust and a descent lemma for neural networks. Since the resulting learning rule seems to require little to no learning rate tuning, it may unlock a simpler workflow for training deeper and more complex neural networks. The Python code used in this paper is here: https://github.com/jxbz/fromage.


Item Type: Book Section
Related URLs:
URL | URL Type | Description
https://proceedings.neurips.cc/paper/2020/hash/f4b31bee138ff5f7b84ce1575a738f95-Abstract.html | Publisher | Article
https://resolver.caltech.edu/CaltechAUTHORS:20200214-105602886 | Related Item | Discussion Paper
ORCID:
Author | ORCID
Bernstein, Jeremy | 0000-0001-9110-7476
Yue, Yisong | 0000-0001-9127-1989
Liu, Ming-Yu | 0000-0002-2951-2398
Additional Information: The authors would like to thank Rumen Dangovski, Dillon Huff, Jeffrey Pennington, Florian Schaefer and Joel Tropp for useful conversations. They made heavy use of a codebase built by Jiahui Yu. They are grateful to Sivakumar Arayandi Thottakara, Jan Kautz, Sabu Nadarajan and Nithya Natesan for infrastructure support. JB was supported by an NVIDIA fellowship, and this work was funded in part by NASA.
Funders:
Funding Agency | Grant Number
NVIDIA Corporation | UNSPECIFIED
NASA | UNSPECIFIED
Record Number: CaltechAUTHORS:20221222-180007021
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20221222-180007021
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 118580
Collection: CaltechAUTHORS
Deposited By: George Porter
Deposited On: 22 Dec 2022 22:35
Last Modified: 22 Dec 2022 22:35
