Clustering Time Series from Mixture Polynomial Models with Discretised Data

Bagnall, AJ, Janacek, GJ and Zhang, M (2003) Clustering Time Series from Mixture Polynomial Models with Discretised Data. Working Paper. University of East Anglia.

Abstract

Clustering time series is an active research area with applications in many fields. One common feature of time series is the likely presence of outliers. These uncharacteristic data can significantly affect the quality of the clusters formed. This paper evaluates a method of overcoming the detrimental effects of outliers. We describe some of the alternative approaches to clustering time series, then specify a particular class of model for experimentation with k-means clustering and a correlation-based distance metric. For data derived from this class of model, we demonstrate that discretising the data into a binary series of above and below the median improves the clustering when the data has outliers. More specifically, we show that, firstly, discretisation does not significantly affect the accuracy of the clusters when there are no outliers and, secondly, it significantly increases the accuracy in the presence of outliers, even when the probability of an outlier is very low.
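
The discretisation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, the treatment of values equal to the median (mapped to 0), and the use of 1 minus the Pearson correlation as the distance are all assumptions made for illustration, since the abstract does not specify them. The k-means step is omitted.

```python
from statistics import median


def clip_at_median(series):
    """Map each value to 1 if it lies above the series median, else 0.

    Assumption: values equal to the median are mapped to 0.
    """
    m = median(series)
    return [1 if x > m else 0 for x in series]


def correlation(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def correlation_distance(a, b):
    """Assumed distance: 1 - correlation (0 for perfectly correlated series)."""
    return 1.0 - correlation(a, b)


# Hypothetical toy data: a linear trend, and a copy with a single large outlier.
clean = [float(t) for t in range(20)]
with_outlier = list(clean)
with_outlier[3] = 1000.0

# Distance between the two series before and after discretisation.
raw_distance = correlation_distance(clean, with_outlier)
clipped_distance = correlation_distance(
    clip_at_median(clean), clip_at_median(with_outlier)
)

print(raw_distance, clipped_distance)
```

In this toy case the single outlier makes the raw series negatively correlated, so their distance exceeds 1, while the discretised series differ in only two positions and stay close. A series that is constant after discretisation has zero variance, so this sketch of the correlation would fail on it.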

Item Type: Monograph (Working Paper)
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Data Science and Statistics
Depositing User: Vishal Gautam
Date Deposited: 22 Jul 2011 11:31
Last Modified: 13 Jul 2022 00:01
URI: https://ueaeprints.uea.ac.uk/id/eprint/22537
