Bagnall, AJ, Janacek, GJ and Zhang, M (2003) Clustering Time Series from Mixture Polynomial Models with Discretised Data. Working Paper. University of East Anglia.
Abstract
Clustering time series is an active research area with applications in many fields. One common feature of time series is the likely presence of outliers. These uncharacteristic data can significantly affect the quality of the clusters formed. This paper evaluates a method of overcoming the detrimental effects of outliers. We describe some of the alternative approaches to clustering time series, then specify a particular class of model for experimentation with k-means clustering and a correlation-based distance metric. For data derived from this class of model we demonstrate that discretising the data into a binary series of above and below the median improves the clustering when the data contain outliers. More specifically, we show, first, that discretisation does not significantly affect the accuracy of the clusters when there are no outliers and, second, that it significantly increases the accuracy in the presence of outliers, even when the probability of an outlier is very low.
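The discretisation step described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names `clip_at_median` and `correlation_distance`, the choice of one minus the Pearson correlation as the distance, the rule that values equal to the median count as "below", and the toy linear series are all assumptions made for illustration.

```python
import numpy as np


def clip_at_median(series):
    """Return a 0/1 series: 1 where a value lies above the series median, else 0.

    Assumption: values exactly equal to the median are treated as "below".
    """
    series = np.asarray(series, dtype=float)
    return (series > np.median(series)).astype(int)


def correlation_distance(a, b):
    """One possible correlation-based distance: 1 minus the Pearson correlation."""
    return 1.0 - np.corrcoef(a, b)[0, 1]


# Hypothetical data: a linear trend, and a copy with a single large outlier.
t = np.arange(20, dtype=float)
clean = 2.0 * t + 1.0
with_outlier = clean.copy()
with_outlier[3] = 1000.0

# Distance between the two series on the raw values and on the binary series.
raw_distance = correlation_distance(clean, with_outlier)
clipped_distance = correlation_distance(
    clip_at_median(clean), clip_at_median(with_outlier)
)

print(f"raw distance:     {raw_distance:.3f}")
print(f"clipped distance: {clipped_distance:.3f}")
```

In this toy case the single outlier pushes the raw distance above 1, because the correlation turns negative, while the distance between the binary series is 0.2: the outlier changes the clipped series at only two positions. A k-means style procedure using such a distance would therefore be less likely to separate the two series on the discretised data.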
Item Type: | Monograph (Working Paper) |
---|---|
Faculty \ School: | Faculty of Science > School of Computing Sciences |
UEA Research Groups: | Faculty of Science > Research Groups > Data Science and AI |
Depositing User: | Vishal Gautam |
Date Deposited: | 22 Jul 2011 11:31 |
Last Modified: | 21 Jul 2024 01:33 |
URI: | https://ueaeprints.uea.ac.uk/id/eprint/22537 |