A multi-variate time series clustering approach based on intermediate fusion: A case study in air pollution data imputation

Alahamade, Wedad, Lake, Iain ORCID: https://orcid.org/0000-0003-4407-5357, Reeves, Claire E. ORCID: https://orcid.org/0000-0003-4071-1926 and De La Iglesia, Beatriz ORCID: https://orcid.org/0000-0003-2675-5826 (2022) A multi-variate time series clustering approach based on intermediate fusion: A case study in air pollution data imputation. Neurocomputing, 490. pp. 229-245. ISSN 0925-2312

PDF (Accepted_Manuscript) - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Abstract

Multivariate Time Series (MVTS) clustering is an essential task, especially for large and complex datasets, but it has received limited attention in the literature. We are motivated by a real-world problem: the need to cluster air pollution data to produce plausible imputations for missing measurements of some pollutants. Our main focus is UK air quality assessment; the study uses data collected from automatic monitoring stations over a four-year period (2015–2018). In this work, we propose an MVTS clustering method followed by an imputation method for the whole Time Series (TS). We compare two approaches to clustering the stations: univariate TS clustering using the Shape-Based Distance (SBD) for individual pollutants, and MVTS clustering using a fused similarity that combines the SBD for all the pollutants. We run a k-means algorithm to produce clusters with each approach on the same dataset. Our analysis shows that MVTS clustering produces the best clusters, as measured by various quality indexes and by the imputations they support: it reduces the average error between imputed and real values, based on the Root Mean Squared Error (RMSE) and its standard deviation (Std).
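
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The SBD below follows the common definition (one minus the maximum normalised cross-correlation over all shifts). The fusion rule, a plain average of per-pollutant SBD values, is an assumption, since the abstract does not say how the distances are combined. The function names and the pollutant keys in the usage note are hypothetical, and the series are assumed to be complete and equally sampled.

```python
import numpy as np


def sbd(x, y):
    """Shape-Based Distance: 1 - max normalised cross-correlation over all shifts."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    if denom == 0.0:
        # Assumption: an all-zero series is treated as uncorrelated with anything.
        return 1.0
    cross_corr = np.correlate(x, y, mode="full")  # one value per relative shift
    return 1.0 - cross_corr.max() / denom


def fused_distance(station_a, station_b):
    """Average the per-pollutant SBD between two stations (assumed fusion rule).

    Each station is a dict mapping a pollutant label to its time series.
    Only pollutants measured at both stations contribute.
    """
    shared = sorted(set(station_a) & set(station_b))
    if not shared:
        raise ValueError("stations share no pollutant")
    return float(np.mean([sbd(station_a[p], station_b[p]) for p in shared]))


def rmse(imputed, observed):
    """Root Mean Squared Error between imputed and observed values."""
    imputed = np.asarray(imputed, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((imputed - observed) ** 2)))
```

For example, `fused_distance({"pollutant_a": a1, "pollutant_b": b1}, {"pollutant_a": a2, "pollutant_b": b2})` returns a single distance between two stations. In practice each series is usually z-normalised before computing the SBD, and the resulting pairwise distances would then feed the clustering step.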

Item Type: Article
Faculty \ School: Faculty of Science > School of Computing Sciences
Faculty of Science > School of Environmental Sciences
University of East Anglia Research Groups/Centres > Theme - ClimateUEA
UEA Research Groups: Faculty of Science > Research Groups > Environmental Social Sciences
University of East Anglia Schools > Faculty of Science > Tyndall Centre for Climate Change Research
Faculty of Science > Research Centres > Tyndall Centre for Climate Change Research
Faculty of Science > Research Groups > Centre for Ocean and Atmospheric Sciences
Faculty of Medicine and Health Sciences > Research Centres > Norwich Institute for Healthy Aging
Faculty of Medicine and Health Sciences > Research Centres > Business and Local Government Data Research Centre
Faculty of Science > Research Groups > Data Science and Statistics
Faculty of Science > Research Groups > Norwich Epidemiology Centre
Faculty of Medicine and Health Sciences > Research Groups > Norwich Epidemiology Centre
Depositing User: LivePure Connector
Date Deposited: 08 Oct 2021 00:56
Last Modified: 20 Apr 2023 22:37
URI: https://ueaeprints.uea.ac.uk/id/eprint/81608
DOI: 10.1016/j.neucom.2021.09.079
