Multiple Imputation Ensembles for Time Series (MIE-TS)

Aleryani, Aliya, Bostrom, Aaron ORCID: https://orcid.org/0000-0002-7300-6038, Wang, Wenjia and Iglesia, Beatriz ORCID: https://orcid.org/0000-0003-2675-5826 (2023) Multiple Imputation Ensembles for Time Series (MIE-TS). ACM Transactions on Knowledge Discovery from Data, 17 (3). pp. 1-28.


Abstract

Time series classification has become an interesting field of research, thanks to the extensive studies conducted in the past two decades. Time series may have missing data, which can affect both the representation and the modeling of time series. Thus, recovering missing data using appropriate time series-based imputation methods is an essential step. Multiple imputation is a data recovery method that produces multiple imputed datasets. The method has proved useful for reflecting the uncertainty inherent in missing data; however, it is under-researched in time series problems. In this article, we propose two multiple imputation approaches for time series. The first is a multiple imputation method based on interpolation. The second is a multiple imputation and ensemble method. First, we simulate missing consecutive sub-sequences under a Missing Completely at Random mechanism; then, we apply single and multiple imputation methods. The imputed data are used to build bagging and stacking ensembles. We build ensembles using standard classification algorithms as well as time series classifiers. The standard classifiers include Random Forest, Support Vector Machines, K-Nearest Neighbour, C4.5, and PART, while TSCHIEF, Proximity Forest, Time Series Forest, RISE, and BOSS are chosen as time series classifiers. Our findings show that the combination of multiple imputation and ensembles improves the performance of the majority of classifiers tested in this study, often above the performance obtained from the complete data, even as the proportion of missing data increases. This may be because the diversity injected by multiple imputation has a very favourable and stabilising effect on classifier performance, which is an important finding.
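The pipeline outlined in the abstract (mask a consecutive sub-sequence completely at random, impute it several times, train one classifier per imputed copy, combine the predictions) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the function names are hypothetical, the multiple imputation is plain linear interpolation with Gaussian noise added to the imputed points, the base classifier is a 1-nearest-neighbour rule, and the combination is a simple majority vote rather than the bagging and stacking ensembles the authors evaluate.

```python
import random
from collections import Counter

def mask_subsequence(series, gap_len, rng):
    """Replace one consecutive block with None; the start position is drawn
    independently of the values, which mimics an MCAR mechanism."""
    start = rng.randrange(0, len(series) - gap_len + 1)
    return [None if start <= i < start + gap_len else v
            for i, v in enumerate(series)]

def interpolate_linear(series):
    """Fill None gaps by linear interpolation between the nearest observed
    neighbours; gaps at either edge copy the nearest observed value."""
    observed = [i for i, v in enumerate(series) if v is not None]
    if not observed:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((j for j in observed if j < i), default=None)
        right = min((j for j in observed if j > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = series[left] + w * (series[right] - series[left])
    return filled

def multiple_impute(series, m, noise_sd, rng):
    """Return m imputed copies: interpolation plus Gaussian noise on the
    imputed points only (an assumed scheme, not the article's method)."""
    base = interpolate_linear(series)
    return [[b if v is not None else b + rng.gauss(0.0, noise_sd)
             for v, b in zip(series, base)]
            for _ in range(m)]

def nearest_neighbour_label(train_x, train_y, query):
    """1-NN classification using squared Euclidean distance."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    return train_y[dists.index(min(dists))]

def ensemble_predict(incomplete_train, labels, query, m, noise_sd, rng):
    """Build m imputed training sets, classify the query with one 1-NN per
    set, and return the majority vote."""
    imputed = [multiple_impute(s, m, noise_sd, rng) for s in incomplete_train]
    votes = [nearest_neighbour_label([copies[k] for copies in imputed],
                                     labels, query)
             for k in range(m)]
    return Counter(votes).most_common(1)[0][0]

# Toy usage: two training series, each with a 3-point gap.
rng = random.Random(0)
rising = [float(t) for t in range(10)]
falling = [float(9 - t) for t in range(10)]
train = [mask_subsequence(rising, 3, rng), mask_subsequence(falling, 3, rng)]
print(ensemble_predict(train, ["up", "down"], rising, m=5, noise_sd=0.1, rng=rng))
```

Because each imputed copy differs slightly in the gap region, the m base classifiers see slightly different training sets; that variation is the kind of diversity the abstract suggests may stabilise ensemble performance.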

Item Type: Book Section
Additional Information: Funding Information: We acknowledge support from Grant Number ES/L011859/1, from The Business and Local Government Data Research Centre, funded by the Economic and Social Research Council to provide economic, scientific and social researchers and business analysts with secure data services. Publisher Copyright: © 2023 Association for Computing Machinery.
Uncontrolled Keywords: ensemble methods, missing data, multiple imputation, time series, computer science (all)
Faculty \ School: Faculty of Science > School of Computing Sciences
Depositing User: LivePure Connector
Date Deposited: 21 Jan 2025 00:36
Last Modified: 21 Jan 2025 00:36
URI: https://ueaeprints.uea.ac.uk/id/eprint/98272
DOI: 10.1145/3551643
