Welcome to the new version of CaltechAUTHORS. Login is currently restricted to library staff. If you notice any issues, please email coda@library.caltech.edu
Published July 1986 | Published
Journal Article Open

The complexity of information extraction


How difficult are decision problems based on natural data, such as pattern recognition? To answer this question, decision problems are characterized by introducing four measures defined on a Boolean function f of N variables: the implementation cost C(f), the randomness R(f), the deterministic entropy H(f), and the complexity K(f). The highlights and main results are roughly as follows, 1) C(f) approx R(f) H(f) approx K(f), all measured in bits. 2) Decision problems based on natural data are partially random (in the Kolmogorov sense) and have low entropy with respect to their dimensionality, and the relations between the four measures translate to lower and upper bounds on the cost of solving these problems. 3) Allowing small errors in the implementation of f saves a lot in the low entropy case but saves nothing in the high-entropy case. If f is partially structured, the implementation cost is reduced substantially.

Additional Information

© 1986 IEEE. Reprinted with permission. Manuscript received August 8, 1984; revised October 23, 1985. This paper was presented in part at the IEEE International Symposium on Information Theory, Brighton, England, June 1985. I gratefully acknowledge Dr. D. Psaltis, who contributed significantly to this work. I also wish to thank Drs. R. McEliece, C. Mead, and E. Posner for their help. This paper is dedicated to the memory of Herbert J. Ryser.

Attached Files

Published - ABUieeetit86.pdf


Files (2.4 MB)
Name Size Download all
2.4 MB Preview Download

Additional details

August 22, 2023
October 16, 2023