CaltechAUTHORS
  A Caltech Library Service

Tensor Regression Networks

Kossaifi, Jean and Lipton, Zachary C. and Kolbeinsson, Arinbjörn and Khanna, Aran and Furlanello, Tommaso and Anandkumar, Anima (2020) Tensor Regression Networks. Journal of Machine Learning Research, 21 . pp. 1-21. ISSN 1533-7928. https://resolver.caltech.edu/CaltechAUTHORS:20190327-085728859

PDF - Published Version (780 kB): Creative Commons Attribution.
PDF - Submitted Version (791 kB): See Usage Policy.


Abstract

Convolutional neural networks typically consist of many convolutional layers followed by one or more fully connected layers. While convolutional layers map between high-order activation tensors, the fully connected layers operate on flattened activation vectors. Despite empirical success, this approach has notable drawbacks. Flattening followed by fully connected layers discards multilinear structure in the activations and requires many parameters. We address these problems by incorporating tensor algebraic operations that preserve multilinear structure at every layer. First, we introduce Tensor Contraction Layers (TCLs) that reduce the dimensionality of their input while preserving their multilinear structure using tensor contraction. Next, we introduce Tensor Regression Layers (TRLs), which express outputs through a low-rank multilinear mapping from a high-order activation tensor to an output tensor of arbitrary order. We learn the contraction and regression factors end-to-end, and produce accurate nets with fewer parameters. Additionally, our layers regularize networks by imposing low-rank constraints on the activations (TCL) and regression weights (TRL). Experiments on ImageNet show that, applied to VGG and ResNet architectures, TCLs and TRLs reduce the number of parameters compared to fully connected layers by more than 65% while maintaining or increasing accuracy. In addition to the space savings, our approach's ability to leverage topological structure can be crucial for structured data such as MRI. In particular, we demonstrate significant performance improvements over comparable architectures on three tasks associated with the UK Biobank dataset.
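To make the two operations in the abstract concrete, here is a minimal NumPy sketch of a Tensor Contraction Layer (each non-batch mode contracted with a learned factor matrix) and a Tensor Regression Layer (a Tucker-factorized weight tensor mapping the activation tensor to the output). This is an illustrative toy, not the authors' implementation; all shapes, ranks, and function names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def mode_dot(t, m, mode):
    """Contract tensor t with matrix m (shape (new_dim, old_dim))
    along `mode`, replacing old_dim with new_dim."""
    t = np.moveaxis(t, mode, -1)
    t = t @ m.T
    return np.moveaxis(t, -1, mode)

def tcl(x, factors):
    """Tensor Contraction Layer: x is (batch, d1, ..., dN); each
    factor U_k has shape (r_k, d_k). Output is (batch, r1, ..., rN),
    so the multilinear structure of x is preserved."""
    for k, u in enumerate(factors, start=1):   # skip the batch mode
        x = mode_dot(x, u, k)
    return x

def trl(x, core, factors, bias):
    """Tensor Regression Layer with Tucker-structured weights:
    W = core x_1 U1 ... x_N UN, then y = <x, W> + b.
    core: (r1, ..., rN, n_out); each U_k: (d_k, r_k)."""
    w = core
    for k, u in enumerate(factors):
        w = mode_dot(w, u, k)                  # expand rank r_k back to d_k
    # generalized inner product over all non-batch modes of x
    y = np.tensordot(x, w,
                     axes=(list(range(1, x.ndim)), list(range(w.ndim - 1))))
    return y + bias

# toy forward pass: (batch=2, 4x5x6) activations -> TCL to 3x3x3 -> 10 outputs
x = rng.standard_normal((2, 4, 5, 6))
tcl_factors = [rng.standard_normal((3, d)) for d in (4, 5, 6)]
h = tcl(x, tcl_factors)                        # shape (2, 3, 3, 3)
core = rng.standard_normal((2, 2, 2, 10))      # Tucker core, ranks (2, 2, 2)
trl_factors = [rng.standard_normal((3, 2)) for _ in range(3)]
y = trl(h, core, trl_factors, np.zeros(10))    # shape (2, 10)
```

Note the parameter saving the abstract refers to: a fully connected layer from the flattened 3x3x3 activations to 10 outputs would need 270 weights, while the Tucker-structured TRL above stores a 2x2x2x10 core plus three 3x2 factors (98 weights), with the rank constraint acting as a regularizer.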


Item Type: Article
Related URLs:
- https://www.jmlr.org/papers/v21/18-503.html (Publisher): Article
- https://arxiv.org/abs/1707.08308 (arXiv): Discussion Paper
Additional Information: © 2020 Jean Kossaifi, Zachary C. Lipton, Arinbjörn Kolbeinsson, Aran Khanna, Tommaso Furlanello and Anima Anandkumar. License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v21/18-503.html. Submitted 7/18; Published 7/20. This research has been conducted using the UK Biobank Resource under Application Number 18545. The authors would like to thank the editor and anonymous reviewers for the constructive feedback which helped improve this manuscript.
Subject Keywords: Machine Learning, Tensor Methods, Tensor Regression Networks, Low-Rank Regression, Tensor Regression Layers, Deep Learning, Tensor Contraction
Record Number: CaltechAUTHORS:20190327-085728859
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20190327-085728859
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 94168
Collection: CaltechAUTHORS
Deposited By: George Porter
Deposited On: 28 Mar 2019 22:05
Last Modified: 20 Aug 2020 17:46
