Improving Phoneme Recognition of Telephone Quality Speech

Huang, Q. and Cox, S. J. (2004) Improving Phoneme Recognition of Telephone Quality Speech. In: IEEE International Conference on Acoustics, Speech and Signal Processing, 2004-05-17 - 2004-05-21.



In some speech understanding applications, training transcriptions are unavailable, and hence the vocabulary is unknown, but the task is to recognise key words and phrases within an utterance rather than to attempt a complete, accurate transcription. An example of such a task is call routing, where transcriptions of training utterances are very expensive to produce and are therefore often unavailable. In such cases, phoneme recognition rather than word recognition is appropriate. However, phoneme recognition of spontaneous speech spoken by a large multi-accent population over telephone connections is very inaccurate. To improve accuracy, we describe a technique in which we segment the waveform into subword-like units and use clustering and an iteratively refined language model to correct the errors in the recognised phonemes. The method was shown to work well on telephone quality spontaneous speech, raising the phoneme accuracy from 28.1% after the first iteration to 47.3% after three iterations.
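
The abstract reports results as phoneme accuracy without defining it. A commonly used definition is (N - S - D - I) / N, where N is the number of reference phonemes and S, D and I are the substitutions, deletions and insertions in a minimum-cost alignment of the recognised sequence against the reference. The sketch below illustrates that definition only; it assumes unit edit costs, and whether the paper scored its results in exactly this way is an assumption. The function names and the example phoneme strings are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences (unit costs assumed)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference phonemes so far
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference phoneme
                curr[j - 1] + 1,         # insertion of a hypothesis phoneme
                prev[j - 1] + (r != h),  # match (cost 0) or substitution (cost 1)
            ))
        prev = curr
    return prev[-1]


def phoneme_accuracy(ref, hyp):
    """Accuracy = (N - S - D - I) / N; with unit costs, S + D + I is the edit distance."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return (len(ref) - edit_distance(ref, hyp)) / len(ref)


# Hypothetical example: one substitution (ae -> ah) and one insertion (s).
reference = ["k", "ae", "t"]
recognised = ["k", "ah", "t", "s"]
print(round(phoneme_accuracy(reference, recognised), 3))  # prints 0.333
```

Because insertions are penalised, this measure can fall below zero when the recogniser outputs many more phonemes than the reference contains.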

Item Type: Conference or Workshop Item (Paper)
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Interactive Graphics and Audio
Faculty of Science > Research Groups > Smart Emerging Technologies
Depositing User: Vishal Gautam
Date Deposited: 14 Jun 2011 17:16
Last Modified: 22 Apr 2023 02:46
DOI: 10.1109/ICASSP.2004.1326018
