Transform-based feature vector compression for distributed speech recognition

Milner, Ben P. and Shao, Xu (2002) Transform-based feature vector compression for distributed speech recognition. In: 7th International Conference on Spoken Language Processing (ICSLP-2002), 2002-09-16 - 2002-09-20.

Full text not available from this repository.


Distributed speech recognition (DSR) has recently become an active area of research. A central problem in DSR is compressing the feature vector stream produced on the terminal device to a bit rate low enough for transmission across low-bandwidth channels. This work proposes a compression technique that first transforms a block of feature vectors into a more compact matrix representation. Columns of the resulting matrix that correspond to faster temporal variation can be removed without loss in recognition performance. The number of bits allocated to the remaining coefficients in the matrix is determined automatically from a measure of the information present. Experiments show that the transform-based compression gives good recognition accuracy at bit rates of 4800, 2400 and 1200 bps. For example, at 1200 bps the recognition performance is 98.03%, compared with 98.57% with uncompressed speech.

Item Type: Conference or Workshop Item (Paper)
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Interactive Graphics and Audio
Faculty of Science > Research Groups > Smart Emerging Technologies
Depositing User: Vishal Gautam
Date Deposited: 28 Jul 2011 10:13
Last Modified: 22 Apr 2023 02:47
