Optimal Brain Surgeon: Extensions and performance comparison.
We extend Optimal Brain Surgeon (OBS), a second-order method for pruning networks, to allow for general error measures, and explore an implementation with reduced computation and storage based on a dominant eigenspace decomposition. Simulations on nonlinear, noisy pattern classification problems show that OBS does lead to improved generalization and performs favorably in comparison with Optimal Brain Damage (OBD). We find that the retraining steps required in OBD may lead to inferior generalization, a result that can be interpreted as due to injecting noise back into the system. A common technique is to stop training of a large network at the minimum validation error. We find that the test error can be reduced even further by means of OBS (but not OBD) pruning. Our results justify the t → o approximation used in OBS and indicate why retraining in a highly pruned network may lead to inferior performance.
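For readers unfamiliar with the basic OBS step that the abstract builds on, the following is a minimal sketch, not the extended method of this paper. It assumes the network has been trained to a local minimum of the error, that the inverse Hessian `H_inv` is available explicitly, and that the error is well approximated by a quadratic near that minimum. The function name `obs_prune_step` and the toy values are hypothetical and chosen only for illustration. The step selects the weight q with the smallest saliency L_q = w_q² / (2 [H⁻¹]_qq), deletes it, and adjusts the remaining weights by δw = −(w_q / [H⁻¹]_qq) H⁻¹ e_q, with no retraining.

```python
import numpy as np

def obs_prune_step(w, H_inv):
    """One basic OBS step (illustrative sketch, hypothetical helper).

    w     : 1-D array of weights at a local minimum of the error
    H_inv : inverse Hessian of the error with respect to w
    Returns the adjusted weights, the pruned index q, and its saliency.
    """
    diag = np.diag(H_inv)
    # Saliency: predicted error increase if weight q is deleted
    # and the remaining weights are optimally readjusted.
    saliency = w**2 / (2.0 * diag)
    q = int(np.argmin(saliency))
    # Adjustment of all weights; its q-th component equals -w[q].
    delta_w = -(w[q] / diag[q]) * H_inv[:, q]
    w_new = w + delta_w
    w_new[q] = 0.0  # force an exact zero despite round-off
    return w_new, q, saliency[q]

# Toy quadratic error surface with an assumed 2x2 Hessian.
H = np.array([[2.0, 0.5],
              [0.5, 1.0]])
w = np.array([1.0, 0.1])
w_new, q, L_q = obs_prune_step(w, np.linalg.inv(H))

# For a purely quadratic error, the actual increase matches the saliency.
dw = w_new - w
actual_increase = 0.5 * dw @ H @ dw
```

In this toy case the second weight is pruned and the first is nudged to compensate, which is the difference from OBD: OBD uses only the diagonal of the Hessian to rank weights and relies on retraining rather than on the closed-form adjustment.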
Copyright 1994. Thanks to T. Kailath for support of B.H. through grants AFOSR 91-0060 and DAAL03-91-C-0010.