Judd, Stephen J. (1988) Neural Network Design and the Complexity of Learning. California Institute of Technology . (Unpublished) http://resolver.caltech.edu/CaltechCSTR:1988.cs-tr-88-20
See Usage Policy.
Use this Persistent URL to link to this item: http://resolver.caltech.edu/CaltechCSTR:1988.cs-tr-88-20
We formalize a notion of learning that characterizes the training of feed-forward networks. In the field of learning theory, it stands as a new model specialized for the type of learning problems that arise in connectionist networks. The formulation is similar to Valiant's [Va184] in that we ask what can be feasibly learned from examples and stored in a particular data structure. One can view the data structure resulting from Valiant-type learning as a 'sentence' in a language described by grammatical syntax rules. Neither the words nor their interrelationships are known a priori. Our learned data structure is more particular than Valiant's in that it must be a particular 'sentence'. The position and relationships of each 'word' are fully specified in advance and the learning system need only discover what the missing words are. This corresponds to the problem of finding retrieval functions for each node in a given network. We prove this problem NP-complete and thus demonstrate that learning in networks has no efficient general solution. Corollaries to the main theorem demonstrate the NP-completeness of several sub-cases. While the intractability of the problem precludes its solution in all these cases, we sketch some alternative definitions of the problem in a search for tractable sub-cases. One broad class of subcases is formed by placing constraints on the network architecture; we study one type in particular. The focus of these constraints is on families of 'shallow' architectures which are defined to have bounded depth and unbounded width. We introduce a perspective on shallow networks, called the Support Cone Interaction (SCI) graph, which is helpful in distinguishing tractable from intractable subcases: When the SC1 graph has tree-width O (log n), learning can be accomplished in polynomial time; when its tree-width is n omega (1) we find the problem NP-complete even if the SC1 graph is a simple 2-dimensional planar grid.
|Item Type:||Report or Paper (Technical Report)|
|Group:||Computer Science Technical Reports|
|Usage Policy:||You are granted permission for individual, educational, research and non-commercial reproduction, distribution, display and performance of this work in any format.|
|Deposited By:||Imported from CaltechCSTR|
|Deposited On:||25 Apr 2001|
|Last Modified:||26 Dec 2012 14:02|
Repository Staff Only: item control page