A hybrid classification method: Discrete canonical variate analysis using a genetic algorithm

Kemsley, E. K. ORCID: https://orcid.org/0000-0003-0669-3883 (2001) A hybrid classification method: Discrete canonical variate analysis using a genetic algorithm. Chemometrics and Intelligent Laboratory Systems, 55 (1-2). pp. 39-51. ISSN 0169-7439

Full text not available from this repository.


This paper describes a novel, hybrid multivariate classification method: discrete canonical variate analysis (DCVA), which is integrated in the present implementation with a genetic algorithm (GA). DCVA transforms a multivariate data set into a set of discrete scores of lower dimensionality, intended specifically to act as classifiers of observations into one of several pre-defined groups. The condition for selecting the DCVA loadings is maximization of the ratio of the between-groups to within-groups variance of the scores, but unlike conventional CVA, there is a non-linear, discontinuous relationship between the scores and loadings. The performance of the DCVA method is compared with that of two competing classification methods, artificial neural networks (ANNs) and Mahalanobis distance-based linear discriminant analysis (LDA), using six example problems. In all cases, internal (leave-one-out) cross-validation was used, and classification success rates were retained from both the training and test segments. Of the methods studied, DCVA clearly performed the best in training, producing the highest mean success rates for four of the six example data sets. For the test segments, DCVA produced the best performance for two of the data sets, and equalled that of LDA and ANNs for a third. However, LDA produced the best performance for the remaining three data sets. This suggests that DCVA, like other search-based methods, has a greater tendency to overfit.
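
As a rough illustration of the selection criterion and validation scheme named in the abstract, the Python sketch below computes a between-groups to within-groups ratio for a single set of one-dimensional scores and enumerates leave-one-out splits. It is not the author's implementation: the function names `variance_ratio` and `leave_one_out_splits` are hypothetical, the ratio is computed from sums of squares (which differs from a ratio of variances only by constant degrees-of-freedom factors), and the discretization step and genetic algorithm search that define DCVA are not shown because the abstract does not specify them.

```python
from collections import defaultdict


def variance_ratio(scores, labels):
    """Between-groups / within-groups sum of squares for 1-D scores.

    Illustrative only: a search method would try to make this ratio
    large by adjusting the loadings that generate the scores.
    """
    groups = defaultdict(list)
    for score, label in zip(scores, labels):
        groups[label].append(score)

    grand_mean = sum(scores) / len(scores)
    between = 0.0
    within = 0.0
    for members in groups.values():
        group_mean = sum(members) / len(members)
        # Spread of the group mean about the grand mean, weighted by size.
        between += len(members) * (group_mean - grand_mean) ** 2
        # Spread of the group's scores about their own mean.
        within += sum((s - group_mean) ** 2 for s in members)

    # Zero within-group spread is reported as an infinite ratio.
    return float("inf") if within == 0 else between / within


def leave_one_out_splits(n):
    """Yield (training_indices, test_index) pairs for n observations."""
    for held_out in range(n):
        training = [i for i in range(n) if i != held_out]
        yield training, held_out


if __name__ == "__main__":
    scores = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
    well_separated = ["a", "a", "a", "b", "b", "b"]
    interleaved = ["a", "b", "a", "b", "a", "b"]
    print(variance_ratio(scores, well_separated))
    print(variance_ratio(scores, interleaved))
    print(sum(1 for _ in leave_one_out_splits(len(scores))))
```

Scores that separate the groups cleanly give a much larger ratio than scores in which the groups are interleaved, which is the property the criterion rewards. In leave-one-out cross-validation, each observation is held out once as the test segment while the model is fitted to the rest.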

Item Type: Article
Additional Information: Funding Information: This work was funded by the Biotechnology and Biological Sciences Research Council. The author thanks A. Parr for providing some of the example data, and P.K. Hopke and D.L. Massart for making the olive oil data (of which example data set F is a subset) available in the public domain [20].
Uncontrolled Keywords: canonical variate analysis (CVA), classification, genetic algorithm (GA), non-linear, analytical chemistry, software, process chemistry and technology, spectroscopy, computer science applications
Depositing User: LivePure Connector
Date Deposited: 06 Feb 2023 11:31
Last Modified: 06 Feb 2023 11:31
URI: https://ueaeprints.uea.ac.uk/id/eprint/91025
DOI: 10.1016/S0169-7439(00)00114-3
