Improving decision tree performance through induction and cluster-based stratified sampling

Gill, Abdul A., Smith, George D. and Bagnall, Anthony J. (2004) Improving decision tree performance through induction and cluster-based stratified sampling. In: Intelligent Data Engineering and Automated Learning – IDEAL 2004. Lecture Notes in Computer Science, 3177. Springer, pp. 339-344. ISBN 978-3-540-22881-3



It is generally recognised that recursive partitioning, as used in the construction of classification trees, is inherently unstable, particularly for small data sets. Classification accuracy and, by implication, tree structure, are sensitive to changes in the training data. Successful approaches to counteract this effect include multiple classifiers, e.g. boosting, bagging or windowing. The downside of these multiple classification models, however, is the plethora of trees that result, often making it difficult to extract the classifier in a meaningful manner. We show that, by using some very weak knowledge in the sampling stage, when the data set is partitioned into the training and test sets, a more consistent and improved performance is achieved by a single decision tree classifier.
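
The general idea of a cluster-based stratified train/test split can be sketched as follows. This is only an illustrative sketch, not the authors' procedure: the abstract does not say which clustering method or what "weak knowledge" the paper uses, so plain k-means is swapped in here as an assumption. The function names (`kmeans_labels`, `cluster_stratified_split`) and parameters (`k`, `test_fraction`, `seed`) are hypothetical. The sketch groups the records into clusters, then draws the same fraction of test records from every cluster, so that the training and test sets cover the same regions of the data.

```python
import random
from collections import defaultdict


def kmeans_labels(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in for the clustering step).

    `points` is a list of equal-length numeric tuples.
    Returns one cluster index per point.
    """
    rng = random.Random(seed)
    # Initialise the centres from k distinct randomly chosen records.
    centres = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centre.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centres[c])),
            )
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave the centre of an empty cluster unchanged
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_stratified_split(points, k=3, test_fraction=0.3, seed=0):
    """Return (train_indices, test_indices), stratified by cluster."""
    labels = kmeans_labels(points, k, seed=seed)
    by_cluster = defaultdict(list)
    for i, lab in enumerate(labels):
        by_cluster[lab].append(i)

    rng = random.Random(seed)
    train, test = [], []
    for indices in by_cluster.values():
        # Sample the same fraction of test records from every cluster.
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)
```

Calling `cluster_stratified_split(points, k=3, test_fraction=0.3)` on a list of feature tuples returns two disjoint index lists; a single decision tree would then be induced from the training indices and evaluated on the test indices. Because small clusters are rounded individually, the overall test proportion is only approximately `test_fraction`.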

Item Type: Book Section
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Data Science and Statistics
Depositing User: Vishal Gautam
Date Deposited: 20 Jul 2011 20:55
Last Modified: 23 Oct 2022 23:58
DOI: 10.1007/978-3-540-28651-6_50
