CaltechAUTHORS
  A Caltech Library Service

Second Order Derivatives for Network Pruning: Optimal Brain Surgeon

Hassibi, Babak and Stork, David G. (1993) Second Order Derivatives for Network Pruning: Optimal Brain Surgeon. In: Advances in Neural Information Processing Systems 5 (NIPS 1992). Morgan Kaufmann, San Mateo, CA, pp. 164-171. ISBN 1558602747. https://resolver.caltech.edu/CaltechAUTHORS:20150219-075206704



Abstract

We investigate the use of information from all second order derivatives of the error function to perform network pruning (i.e., removing unimportant weights from a trained network) in order to improve generalization, simplify networks, reduce hardware or storage requirements, increase the speed of further training, and in some cases enable rule extraction. Our method, Optimal Brain Surgeon (OBS), is significantly better than magnitude-based methods and Optimal Brain Damage [Le Cun, Denker and Solla, 1990], which often remove the wrong weights. OBS permits the pruning of more weights than other methods (for the same error on the training set), and thus yields better generalization on test data. Crucial to OBS is a recursion relation for calculating the inverse Hessian matrix H^(-1) from training data and structural information of the net. OBS permits a 90%, a 76%, and a 62% reduction in weights over backpropagation with weight decay on three benchmark MONK's problems [Thrun et al., 1991]. Of OBS, Optimal Brain Damage, and magnitude-based methods, only OBS deletes the correct weights from a trained XOR network in every case. Finally, whereas Sejnowski and Rosenberg [1987] used 18,000 weights in their NETtalk network, we used OBS to prune a network to just 1560 weights, yielding better generalization.
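
The abstract names two ingredients without spelling them out: a recursion for H^(-1) and a rule for choosing which weight to delete. The sketch below is one way to illustrate both. It is not the authors' code. The formulas follow the commonly cited form of the OBS method from the full paper, not from the abstract itself. The function names `inverse_hessian` and `obs_step`, the scalar-output assumption, and the small constant `alpha` used to initialize the recursion are all assumptions made for illustration.

```python
import numpy as np

def inverse_hessian(grads, alpha=1e-6):
    """Build H^-1 recursively, one training sample at a time.

    Assumes H = alpha*I + (1/P) * sum_k x_k x_k^T, where row k of `grads`
    is x_k, the gradient of a scalar network output with respect to the
    n weights for training sample k. Each rank-one term is folded in
    with the Sherman-Morrison identity, so H itself is never inverted.
    """
    P, n = grads.shape
    H_inv = np.eye(n) / alpha                    # H_0^-1 = alpha^-1 * I
    for x in grads:
        Hx = H_inv @ x                           # H_inv is symmetric
        H_inv = H_inv - np.outer(Hx, Hx) / (P + x @ Hx)
    return H_inv

def obs_step(w, H_inv):
    """Delete the weight with the smallest saliency and adjust the rest.

    Saliency of weight q:  L_q = w_q^2 / (2 * [H^-1]_qq)
    Weight update:         dw  = -(w_q / [H^-1]_qq) * H^-1[:, q]
    """
    saliency = w**2 / (2.0 * np.diag(H_inv))
    q = int(np.argmin(saliency))
    w_new = w - (w[q] / H_inv[q, q]) * H_inv[:, q]
    w_new[q] = 0.0                               # zero by construction; guard against rounding
    return w_new, q, saliency[q]
```

In this sketch, pruning would alternate `obs_step` with a recomputation of H^(-1) until the smallest saliency exceeds a tolerance the user chooses. Unlike a magnitude-based rule, the update changes all remaining weights to compensate for the deleted one, which is the behavior the abstract contrasts with magnitude-based methods and Optimal Brain Damage.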


Item Type:Book Section
Related URLs:
URL: http://papers.nips.cc/paper/647-second-order-derivatives-for-network-pruning-optimal-brain-surgeon
URL Type: Organization
Description: Article
Additional Information:© 1993 Morgan Kaufmann. The first author was supported in part by grants AFOSR 91-0060 and DAAL03-91-C-0010 to T. Kailath, who in turn provided constant encouragement. Deep thanks go to Greg Wolff (Ricoh) for assistance with simulations and analysis, and Jerome Friedman (Stanford) for pointers to relevant statistics literature.
Funders:
Air Force Office of Scientific Research (AFOSR), Grant Number: 91-0060
Army Research Office (ARO), Grant Number: DAAL03-91-C-0010
Record Number:CaltechAUTHORS:20150219-075206704
Persistent URL:https://resolver.caltech.edu/CaltechAUTHORS:20150219-075206704
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:54983
Collection:CaltechAUTHORS
Deposited By: Shirley Slattery
Deposited On:27 Feb 2015 00:14
Last Modified:03 Oct 2019 08:02
