CaltechAUTHORS
  A Caltech Library Service

The complexity of information extraction

Abu-Mostafa, Yaser S. (1986) The complexity of information extraction. IEEE Transactions on Information Theory, 32 (4). pp. 513-525. ISSN 0018-9448. http://resolver.caltech.edu/CaltechAUTHORS:ABUieeetit86

[img]
Preview
PDF
See Usage Policy.

2296Kb

Use this Persistent URL to link to this item: http://resolver.caltech.edu/CaltechAUTHORS:ABUieeetit86

Abstract

How difficult are decision problems based on natural data, such as pattern recognition? To answer this question, decision problems are characterized by introducing four measures defined on a Boolean function f of N variables: the implementation cost C(f), the randomness R(f), the deterministic entropy H(f), and the complexity K(f). The highlights and main results are roughly as follows, 1) C(f) approx R(f) H(f) approx K(f), all measured in bits. 2) Decision problems based on natural data are partially random (in the Kolmogorov sense) and have low entropy with respect to their dimensionality, and the relations between the four measures translate to lower and upper bounds on the cost of solving these problems. 3) Allowing small errors in the implementation of f saves a lot in the low entropy case but saves nothing in the high-entropy case. If f is partially structured, the implementation cost is reduced substantially.


Item Type:Article
Additional Information:© Copyright 1986 IEEE. Reprinted with permission. Manuscript received August 8, 1984; revised October 23, 1985. This paper was presented in part at the IEEE International Symposium on Information Theory, Brighton, England, June 1985. I gratefully acknowledge Dr. D. Psaltis, who contributed significantly to this work. I also wish to thank Drs. R. McEliece, C. Mead, and E. Posner for their help. This paper is dedicated to the memory of Herbert J. Ryser.
Record Number:CaltechAUTHORS:ABUieeetit86
Persistent URL:http://resolver.caltech.edu/CaltechAUTHORS:ABUieeetit86
Alternative URL:http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=22755&arnumber=1057209&count=19&index=16
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:7009
Collection:CaltechAUTHORS
Deposited By: Archive Administrator
Deposited On:05 Jan 2007
Last Modified:26 Dec 2012 09:27

Repository Staff Only: item control page