CaltechAUTHORS
  A Caltech Library Service

Learning by Turning: Neural Architecture Aware Optimisation

Liu, Yang and Bernstein, Jeremy and Meister, Markus and Yue, Yisong (2021) Learning by Turning: Neural Architecture Aware Optimisation. (Unpublished) https://resolver.caltech.edu/CaltechAUTHORS:20210225-132711583

PDF - Submitted Version (1MB). See Usage Policy.

Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20210225-132711583

Abstract

Descent methods for deep networks are notoriously capricious: they require careful tuning of step size, momentum and weight decay, and which method will work best on a new benchmark is a priori unclear. To address this problem, this paper conducts a combined study of neural architecture and optimisation, leading to a new optimiser called Nero: the neuronal rotator. Nero trains reliably without momentum or weight decay, works in situations where Adam and SGD fail, and requires little to no learning rate tuning. In addition, Nero's memory footprint is roughly the square root of that of Adam or LAMB. Nero combines two ideas: (1) projected gradient descent over the space of balanced networks; (2) neuron-specific updates, where the step size sets the angle through which each neuron's hyperplane turns. The paper concludes by discussing how this geometric connection between architecture and optimisation may impact theories of generalisation in deep learning.
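
The two ideas named in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not define "balanced", so the code assumes it means that each neuron's incoming weight vector has zero mean and unit norm, and it rescales each neuron's gradient by its instantaneous norm, a simplification that may differ from the paper's actual update rule. The function names `project_balanced` and `rotate_step` are hypothetical.

```python
import numpy as np


def project_balanced(W):
    """Project each row of W (one neuron's incoming weights) to zero mean, unit norm.

    Assumption: this is what "balanced" means; the abstract does not say.
    """
    W = W - W.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    return W / np.maximum(norms, 1e-12)


def rotate_step(W, G, lr):
    """One hypothetical neuron-wise projected gradient step.

    Each row of the gradient G is rescaled to unit norm, so lr alone
    controls how far each unit-norm weight vector can turn; the result
    is then projected back onto the assumed constraint set.
    """
    g_norm = np.linalg.norm(G, axis=1, keepdims=True)
    G_unit = G / np.maximum(g_norm, 1e-12)
    return project_balanced(W - lr * G_unit)


# Example: 4 neurons with 8 inputs each, random stand-in gradients.
rng = np.random.default_rng(0)
W = project_balanced(rng.normal(size=(4, 8)))
G = rng.normal(size=(4, 8))
W_new = rotate_step(W, G, lr=0.01)

# Angle (in radians) through which each neuron's weight vector turned.
cosines = np.clip(np.sum(W * W_new, axis=1), -1.0, 1.0)
angles = np.arccos(cosines)
```

Under these assumptions, and for a step size below 1, each weight vector turns through an angle of at most arcsin(lr), which is one way to read the claim that the step size sets the turning angle. The sketch keeps no per-weight optimiser state, although that alone does not reproduce the memory comparison made in the abstract.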


Item Type: Report or Paper (Discussion Paper)
Related URLs:
  URL: http://arxiv.org/abs/2102.07227
  URL Type: arXiv
  Description: Discussion Paper
ORCID:
  Liu, Yang: 0000-0002-8155-9134
  Bernstein, Jeremy: 0000-0001-9110-7476
  Meister, Markus: 0000-0003-2136-6506
  Yue, Yisong: 0000-0001-9127-1989
Record Number: CaltechAUTHORS:20210225-132711583
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20210225-132711583
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 108202
Collection: CaltechAUTHORS
Deposited By: George Porter
Deposited On: 26 Feb 2021 15:34
Last Modified: 26 Feb 2021 15:34