Computing the Information Content of Trained Neural Networks

Bernstein, Jeremy and Yue, Yisong (2021) Computing the Information Content of Trained Neural Networks. (Unpublished) https://resolver.caltech.edu/CaltechAUTHORS:20210304-095340677

Abstract

How much information does a learning algorithm extract from the training data and store in a neural network's weights? Too much, and the network would overfit to the training data. Too little, and the network would not fit to anything at all. Naïvely, the amount of information the network stores should scale in proportion to the number of trainable weights. This raises the question: how can neural networks with vastly more weights than training data still generalise? A simple resolution to this conundrum is that the number of weights is usually a bad proxy for the actual amount of information stored. For instance, typical weight vectors may be highly compressible. Then another question occurs: is it possible to compute the actual amount of information stored? This paper derives both a consistent estimator and a closed-form upper bound on the information content of infinitely wide neural networks. The derivation is based on an identification between neural information content and the negative log probability of a Gaussian orthant. This identification yields bounds that analytically control the generalisation behaviour of the entire solution space of infinitely wide networks. The bounds have a simple dependence on both the network architecture and the training data. Corroborating the findings of Valle-Pérez et al. (2019), who conducted a similar analysis using approximate Gaussian integration techniques, the bounds are found to be both non-vacuous and correlated with the empirical generalisation behaviour at finite width.
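To make the phrase "negative log probability of a Gaussian orthant" concrete, the following is a minimal sketch and not the estimator or bound derived in the paper. It assumes a hypothetical two-dimensional Gaussian with unit variances and correlation `rho`, and treats the orthant as the set of vectors whose signs match a chosen pair of `signs`; the function names, the covariance, and the sign pattern are all illustrative assumptions. The Monte Carlo estimate is checked against the standard closed form for the positive quadrant of a bivariate Gaussian, P = 1/4 + arcsin(rho) / (2π).

```python
import math
import random


def orthant_information_bits(rho, signs=(1, 1), n_samples=100_000, seed=0):
    """Monte Carlo estimate of -log2 P(sign(z) == signs), where
    z ~ N(0, [[1, rho], [rho, 1]]). Illustrative only (hypothetical setup)."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(n_samples):
        # Draw a correlated pair from two independent standard normals.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + scale * rng.gauss(0.0, 1.0)
        # Count the sample if both coordinates carry the requested signs.
        if z1 * signs[0] > 0 and z2 * signs[1] > 0:
            hits += 1
    return -math.log2(hits / n_samples)


def exact_quadrant_bits(rho):
    """Closed form for the positive quadrant in two dimensions:
    P = 1/4 + arcsin(rho) / (2 * pi), reported as -log2 P."""
    return -math.log2(0.25 + math.asin(rho) / (2.0 * math.pi))


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.8):
        print(rho, orthant_information_bits(rho), exact_quadrant_bits(rho))
```

In this toy setting, a larger correlation between the two coordinates makes the matching-sign orthant more probable, so its negative log probability is smaller. Plain Monte Carlo of this kind becomes impractical as the dimension grows, because the orthant probability typically shrinks exponentially.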


Item Type: Report or Paper (Discussion Paper)
Related URLs:
  URL                                URL Type    Description
  http://arxiv.org/abs/2103.01045    arXiv       Discussion Paper
ORCID:
  Author               ORCID
  Bernstein, Jeremy    0000-0001-9110-7476
  Yue, Yisong          0000-0001-9127-1989
Record Number: CaltechAUTHORS:20210304-095340677
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20210304-095340677
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 108309
Collection: CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On: 04 Mar 2021 21:33
Last Modified: 04 Mar 2021 21:33