Bellman strikes again!: the growth rate of sample complexity with dimension for the nearest neighbor classifier
- Other:
- Haussler, David
Abstract
The finite-sample performance of a nearest neighbor classifier is analyzed for a two-class pattern recognition problem. An exact integral expression is derived for the $m$-sample risk $R_m$, given that a reference $m$-sample of labeled points, drawn independently from Euclidean $n$-space according to a fixed probability distribution, is available to the classifier. For a family of smooth distributions, it is shown that the $m$-sample risk $R_m$ has a complete asymptotic expansion $R_m \sim R_\infty + \sum_{k=1}^{\infty} c_{2k}\, m^{-2k/n}$, where $R_\infty$ denotes the nearest neighbor risk in the infinite-sample limit. Explicit definitions of the expansion coefficients are given in terms of the underlying distribution. Because the convergence of $R_m$ to $R_\infty$ slows dramatically as $n$ increases, this analysis provides an analytic validation of Bellman's curse of dimensionality. Numerical simulations corroborating the formal results are included. The rates of convergence for less restrictive families of distributions are also discussed.
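To make the dimension dependence concrete, the following minimal sketch looks only at the leading (k = 1) correction term of the expansion, which scales as m^(-2/n). It is an illustration under stated assumptions, not the paper's computation: the coefficient is set to 1 (the true coefficients depend on the underlying distribution), and the function names `leading_term` and `samples_for_leading_term` are hypothetical.

```python
import math


def leading_term(m: float, n: int) -> float:
    """Leading correction m**(-2/n), with the coefficient assumed to be 1."""
    return m ** (-2.0 / n)


def samples_for_leading_term(eps: float, n: int) -> float:
    """Sample size m at which m**(-2/n) equals eps, i.e. m = eps**(-n/2)."""
    return eps ** (-n / 2.0)


if __name__ == "__main__":
    eps = 0.1  # illustrative target size for the leading correction term
    for n in (2, 4, 10, 20):
        m = samples_for_leading_term(eps, n)
        # Sanity check: plugging m back in recovers eps.
        assert math.isclose(leading_term(m, n), eps, rel_tol=1e-9)
        print(f"n = {n:2d}: m is about {m:.3g}")
```

Under these assumptions the required sample size grows as eps^(-n/2), that is, exponentially in the dimension n for a fixed target eps, which is the behavior the abstract attributes to the curse of dimensionality.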
Additional Information
© 1992 ACM.
Additional details
- Eprint ID
- 73195
- DOI
- 10.1145/130385.130396
- Resolver ID
- CaltechAUTHORS:20170103-173502617
- Created
- 2017-01-04 (created from EPrint's datestamp field)
- Updated
- 2021-11-11 (created from EPrint's last_modified field)