Classification techniques with minimal labelling effort and application to medical reports

Saad, Fathi H., Bell, G. Duncan and de la Iglesia, Beatriz ORCID: https://orcid.org/0000-0003-2675-5826 (2008) Classification techniques with minimal labelling effort and application to medical reports. International Journal of Data Mining and Bioinformatics, 2 (3). pp. 268-287. ISSN 1748-5673

Full text not available from this repository.

Abstract

There are a number of approaches to classifying text documents. Here, we use Partially Supervised Classification (PSC) and argue that it is an effective and efficient approach for real-world problems. PSC uses a two-step strategy to cut down on the labelling effort, and a number of methods have been proposed for each step. We evaluate various methods using real-world medical documents. The results show that using Expectation-Maximisation (EM) to build the classifier yields better results than using a Support Vector Machine (SVM). We also show experimentally that careful selection of a subset of features to represent the documents can improve performance.

Item Type: Article
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Medicine and Health Sciences > Research Centres > Business and Local Government Data Research Centre (former - to 2023)
Faculty of Science > Research Groups > Data Science and Statistics
Faculty of Science > Research Groups > Norwich Epidemiology Centre
Faculty of Medicine and Health Sciences > Research Groups > Norwich Epidemiology Centre
Depositing User: Vishal Gautam
Date Deposited: 10 Mar 2011 10:54
Last Modified: 11 Mar 2024 00:53
URI: https://ueaeprints.uea.ac.uk/id/eprint/22377
DOI: 10.1504/IJDMB.2008.022638
