Impact of Feature Selection on Non-technical Loss Detection

Ghori, Khawaja MoyeezUllah, Abbasi, Rabeeh Ayaz, Awais, Muhammad ORCID: https://orcid.org/0000-0001-6421-9245, Imran, Muhammad, Ullah, Atta and Szathmary, Laszlo (2020) Impact of Feature Selection on Non-technical Loss Detection. In: Proceedings - 2020 6th Conference on Data Science and Machine Learning Applications, CDMA 2020. The Institute of Electrical and Electronics Engineers (IEEE), pp. 19-24. ISBN 9781728127460

Full text not available from this repository.

Abstract

Over the years, many countries have faced huge financial deficits due to Non-Technical Loss (NTL) in the power sector. Electricity can be used illegally in many ways, such as bypassing and reversing meters. There have been many attempts to reduce NTL using manual and automated techniques. Manual NTL detection is not proving fruitful, as it incurs heavy costs and has a low hit ratio. Because of these shortcomings, automated detection of NTL using machine learning classifiers is gaining attention in the research community. Datasets containing NTL belong to the class imbalance domain, where regular consumers (negative class) outweigh irregular consumers (positive class). Many techniques have been proposed to identify the right number of representative records, but selecting the right features for deciding NTL is an equally important task to which the literature has contributed little. In this paper, we propose the Incremental Feature Selection (IFS) algorithm, which first uses feature importance to identify the most relevant features for NTL detection; these features are then used to test three classifiers, namely CatBoost, Decision Tree (DT) Classifier and K-Nearest Neighbors (KNN). In this way, we have not only identified the most relevant features for NTL detection in a real dataset but have also reduced the overall computation time of the classifiers. Moreover, our proposed framework is evaluated on three performance metrics used in the imbalance domain. The results show that, using the most relevant features identified by the IFS algorithm, the three classifiers achieve the same or slightly better efficiency than when using all features.
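
The abstract does not spell out the steps of IFS, so the following is only a minimal sketch of one plausible reading of "rank features by importance, then add them incrementally", not the authors' implementation. Everything in it is an assumption made for illustration: the synthetic imbalanced data, the importance score (a standardized difference of class means, standing in for a model-based feature importance), the small KNN classifier, the F1 score as the imbalance-aware metric, and all function names.

```python
import random
from statistics import mean, pstdev


def make_data(n=300, seed=0):
    """Hypothetical imbalanced data: 10% positives; features 0-1 informative, 2-4 noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 10 == 0 else 0
        row = [rng.gauss(2.0 * label, 1.0), rng.gauss(-1.5 * label, 1.0)]
        row += [rng.gauss(0.0, 1.0) for _ in range(3)]
        X.append(row)
        y.append(label)
    return X, y


def importance(X, y, j):
    """Stand-in importance: gap between class means of feature j, in units of its spread."""
    pos = [r[j] for r, t in zip(X, y) if t == 1]
    neg = [r[j] for r, t in zip(X, y) if t == 0]
    spread = pstdev([r[j] for r in X]) or 1.0
    return abs(mean(pos) - mean(neg)) / spread


def knn_predict(train_X, train_y, row, feats, k=3):
    """Majority vote of the k nearest training rows, using only the features in feats."""
    dists = sorted(
        (sum((a[f] - row[f]) ** 2 for f in feats), t)
        for a, t in zip(train_X, train_y)
    )
    votes = sum(t for _, t in dists[:k])
    return 1 if votes * 2 > k else 0


def f1_score(y_true, y_pred):
    """F1 of the positive (irregular consumer) class; 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def incremental_feature_selection(train_X, train_y, test_X, test_y):
    """Add features in descending importance; keep one only if it improves F1."""
    ranking = sorted(
        range(len(train_X[0])),
        key=lambda j: importance(train_X, train_y, j),
        reverse=True,
    )
    selected, best = [], -1.0
    for j in ranking:
        trial = selected + [j]
        preds = [knn_predict(train_X, train_y, r, trial) for r in test_X]
        score = f1_score(test_y, preds)
        if score > best:
            selected, best = trial, score
    return ranking, selected, best


X, y = make_data()
# Hold out every third record for evaluation; rank features on the rest only.
test_idx = [i for i in range(len(X)) if i % 3 == 0]
train_idx = [i for i in range(len(X)) if i % 3 != 0]
ranking, selected, best_f1 = incremental_feature_selection(
    [X[i] for i in train_idx], [y[i] for i in train_idx],
    [X[i] for i in test_idx], [y[i] for i in test_idx],
)
print("ranking:", ranking, "selected:", selected, "F1:", round(best_f1, 3))
```

In this sketch a feature is kept only when it strictly improves the score, so the selected subset can be much smaller than the full feature set. That mirrors the reduction in computation time mentioned above, although the paper's actual stopping rule may differ.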

Item Type: Book Section
Uncontrolled Keywords: artificial intelligence, computer science applications, information systems, information systems and management, /dk/atira/pure/subjectarea/asjc/1700/1702
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Data Science and AI
Depositing User: LivePure Connector
Date Deposited: 17 Oct 2023 00:49
Last Modified: 10 Dec 2024 01:13
URI: https://ueaeprints.uea.ac.uk/id/eprint/93333
DOI: 10.1109/cdma47397.2020.00009
