Richards, G., Brazier, K. J. and Wang, W. (2006) Feature salience definition and estimation and its use in feature subset selection. Intelligent Data Analysis, 10 (1). pp. 3-21. ISSN 1088-467X
Full text not available from this repository.
Abstract
In this paper we describe novel feature subset selection methods based on the estimation of feature salience, i.e. the quantification of the relative importance of individual features, in the presence of other features, for determining the classes of records in a dataset. We present a definition of what we mean by feature salience and a method for estimating it. Five synthetic datasets were used to demonstrate the utility of the salience estimation technique. It was found that the estimation technique produced good approximations to the calculated saliencies in most cases. The use of feature salience as the basis of three methods of feature subset selection is described. These methods were evaluated on real-world datasets by constructing classifiers using all features and comparing these with classifiers constructed using only a selected subset of features. It was found that the results compared well with other state-of-the-art techniques and that the methods were simpler to implement and significantly faster to execute. On average, applying our best feature subset selection method resulted in trees that used only 49% of the features used by trees constructed with the full set of features. This reduction in the number of features used was associated with a 1% improvement in classifier accuracy.
| Item Type: | Article |
|---|---|
| Faculty \ School: | Faculty of Science > School of Computing Sciences |
| UEA Research Groups: | Faculty of Science > Research Groups > Data Science and Statistics |
| Depositing User: | Vishal Gautam |
| Date Deposited: | 01 Jun 2011 19:34 |
| Last Modified: | 09 Jan 2024 01:23 |
| URI: | https://ueaeprints.uea.ac.uk/id/eprint/22400 |
| DOI: | 10.3233/IDA-2006-10102 |