Milner, B. P. and Shao, Xu (2003) Low bit-rate feature vector compression using transform coding and non-uniform bit allocation. In: IEEE International Conference on Acoustics Speech and Signal Processing, 2003-04-06 - 2003-04-10.
Full text not available from this repository.
Abstract
The paper presents a novel method for the low bit-rate compression of a feature vector stream, with particular application to distributed speech recognition. The scheme operates by grouping feature vectors into non-overlapping blocks and applying a transformation to give a more compact matrix representation. Both Karhunen-Loeve and discrete cosine transforms are considered. Following transformation, higher-order columns of the matrix can be removed without loss in recognition performance. The number of bits allocated to the remaining elements in the matrix is determined automatically using a measure of their relative information content. Analysis of the amplitude distribution of the elements indicates that non-linear quantisation is more appropriate than linear quantisation. Comparative results, based on both spectral distortion and speech recognition accuracy, confirm this. Speech recognition tests using the ETSI Aurora database demonstrate that compression to bit rates of 2400 bps, 1200 bps and 800 bps has very little effect on recognition accuracy. For example, at a bit rate of 1200 bps, recognition accuracy is 98.0% compared to 98.6% with no compression.
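The block transform, column truncation and bit allocation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`dct_matrix`, `transform_block`, `allocate_bits`) are hypothetical, the block length and number of retained columns are arbitrary, and the greedy variance-based allocation is a standard transform-coding heuristic used here as a stand-in, since the abstract does not specify the paper's information measure. Quantisation itself is omitted.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis: row k is the k-th basis vector of length n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (t + 0.5) * k / n)
                     for t in range(n)])
    return rows

def transform_block(block, keep):
    """Transform one block of n frames (each a list of d features) along time.

    Returns a d x keep matrix: row i holds the first `keep` temporal DCT
    coefficients of feature i; higher-order columns are dropped.
    """
    n, d = len(block), len(block[0])
    basis = dct_matrix(n)
    return [[sum(basis[k][t] * block[t][i] for t in range(n))
             for k in range(keep)]
            for i in range(d)]

def allocate_bits(variances, total_bits):
    """Assumed heuristic: give each bit to the element with the largest
    remaining variance, then quarter that variance (roughly 6 dB per bit)."""
    remaining = list(variances)
    bits = [0] * len(variances)
    for _ in range(total_bits):
        j = max(range(len(remaining)), key=remaining.__getitem__)
        bits[j] += 1
        remaining[j] /= 4.0
    return bits

# Toy block: 8 frames of 2 features (a ramp and a constant), keep 3 columns.
block = [[float(t), 1.0] for t in range(8)]
coeffs = transform_block(block, keep=3)
# Element variances would normally be estimated from training data.
bits = allocate_bits([16.0, 4.0, 1.0], total_bits=6)
```

In this toy block the constant feature collapses into a single non-zero coefficient in the first column, which is the kind of energy compaction that makes it reasonable to discard higher-order columns.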
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Faculty \ School: | Faculty of Science > School of Computing Sciences |
UEA Research Groups: | Faculty of Science > Research Groups > Interactive Graphics and Audio; Faculty of Science > Research Groups > Smart Emerging Technologies |
Depositing User: | Vishal Gautam |
Date Deposited: | 04 Jul 2011 08:25 |
Last Modified: | 22 Apr 2023 02:47 |
URI: | https://ueaeprints.uea.ac.uk/id/eprint/22224 |
DOI: | 10.1109/ICASSP.2003.1202311 |