Decision tree design from a communication theory standpoint
- Creators
- Goodman, Rodney M.
- Smyth, Padhraic
Abstract
A communication theory approach to decision tree design based on a top-down mutual information algorithm is presented. It is shown that this algorithm is equivalent to a form of Shannon-Fano prefix coding, and several fundamental bounds relating decision-tree parameters are derived. The bounds are used in conjunction with a rate-distortion interpretation of tree design to explain several phenomena previously observed in practical decision-tree design. A termination rule for the algorithm called the delta-entropy rule is proposed that improves its robustness in the presence of noise. Simulation results are presented, showing that the tree classifiers derived by the algorithm compare favourably to the single nearest neighbour classifier.
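As a rough illustration of the kind of top-down, mutual-information-driven split selection the abstract refers to, the following minimal Python sketch picks, at a single node, the discrete feature with the highest mutual information with the class label. It is not the authors' implementation: the function names (`entropy`, `mutual_information`, `best_split`) and the toy data are hypothetical, and the delta-entropy termination rule is not reproduced because the abstract does not define it.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def mutual_information(feature_values, labels):
    """I(C; X) = H(C) - H(C | X) for one discrete feature X and class C."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # H(C | X): entropy of the labels within each feature value, weighted by frequency.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def best_split(rows, labels):
    """Return (feature_index, mutual_information) for the most informative feature."""
    n_features = len(rows[0])
    gains = [mutual_information([r[j] for r in rows], labels) for j in range(n_features)]
    best = max(range(n_features), key=gains.__getitem__)
    return best, gains[best]


# Hypothetical toy data: feature 0 determines the class, feature 1 is uninformative.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = ["a", "a", "b", "b"]
index, gain = best_split(rows, labels)
```

A full tree builder would presumably apply `best_split` recursively to each resulting subset of the data until some stopping criterion is met; the choice of that criterion is left open here.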
Additional Information
© 1988 IEEE. Manuscript received March 23, 1987; revised January 27, 1988. This work was supported in part by Pacific Bell. This paper was presented in part at the 1986 IEEE International Symposium on Information Theory, Ann Arbor, MI, October 1986.
Attached Files
Published - 00021221.pdf
Files
Name | Size |
---|---|
00021221.pdf (md5:3e459a16fdb2813d5e16664e9d26a2ca) | 1.4 MB |
Additional details
- Eprint ID
- 93822
- Resolver ID
- CaltechAUTHORS:20190314-130609598
- Funders
- Pacific Bell
- Created
- 2019-03-14 (created from EPrint's datestamp field)
- Updated
- 2021-11-16 (created from EPrint's last_modified field)