Cox, Stephen (1995) Predictive Speaker Adaptation in Speech Recognition. Computer Speech and Language, 9 (1). pp. 1-17.
Full text not available from this repository.
Abstract
A major problem with most speaker adaptation schemes is that they rely on the speaker providing at least one example of each acoustic unit (word, phone, triphone, etc.) in the vocabulary in order to adapt the appropriate model. Rapid adaptation is therefore difficult to achieve, and some sounds may never be adapted because they are never heard. This paper presents a technique for adapting all the speech models to a new speaker's voice when the speaker has provided only an incomplete set of the vocabulary. The technique uses the training set to estimate correlations between sounds. Given some sounds from a new speaker at recognition time, these correlations are used to estimate the unheard sounds, and those estimates are used to adapt the speech models. The technique was applied to a database of 104 speakers speaking the English alphabet. When speakers spoke half of the vocabulary for enrollment prior to recognition, the technique gave a 78% decrease in error.
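The idea in the abstract, predicting unheard sounds from heard ones through relationships learned on the training set, can be illustrated with a small sketch. This is not the paper's estimator: it assumes each sound is summarised by a single scalar feature per speaker and uses ordinary least-squares regression as a stand-in for the correlation-based prediction. The function names `fit_predictors` and `predict_unheard`, the synthetic data, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_predictors(train, heard, unheard):
    """Fit a least-squares linear map from heard-sound features to unheard ones.

    train: array (n_speakers, n_sounds), one scalar feature per sound
           (an assumed simplification; real speech models use vectors).
    Returns W of shape (len(heard) + 1, len(unheard)); the last row is a bias.
    """
    ones = np.ones((train.shape[0], 1))
    X = np.hstack([train[:, heard], ones])  # heard features plus bias column
    Y = train[:, unheard]                   # targets: features of unheard sounds
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

def predict_unheard(W, heard_values):
    """Estimate the unheard-sound features for one new speaker."""
    x = np.append(heard_values, 1.0)        # same bias term as in training
    return x @ W

# Synthetic training set (assumption): every sound's feature is a fixed base
# value plus a speaker-dependent shift driven by three latent speaker factors.
rng = np.random.default_rng(0)
base = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
loadings = np.array([
    [1.0, 0.2, 0.1, 0.7, -0.4, 0.9],
    [0.1, 1.0, 0.3, -0.5, 0.8, 0.2],
    [0.2, 0.1, 1.0, 0.3, 0.6, -0.7],
])
train = base + rng.normal(size=(40, 3)) @ loadings

heard = [0, 1, 2]     # sounds the new speaker supplies at enrollment
unheard = [3, 4, 5]   # sounds that must be predicted

W = fit_predictors(train, heard, unheard)

# A new speaker who only utters the heard half of the vocabulary.
new_speaker = base + np.array([0.5, -1.0, 0.3]) @ loadings
predicted = predict_unheard(W, new_speaker[heard])
print(predicted, new_speaker[unheard])
```

In this noise-free toy setting the unheard features are an exact linear function of the heard ones, so the predictions match the true values closely. With real speech data the relationship would be noisy, and the predicted values would serve only as estimates for adapting the models of the unheard sounds.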
Item Type: | Article |
---|---|
Faculty \ School: | Faculty of Science > School of Computing Sciences |
UEA Research Groups: | Faculty of Science > Research Groups > Interactive Graphics and Audio; Faculty of Science > Research Groups > Smart Emerging Technologies |
Depositing User: | Stephen Cox |
Date Deposited: | 17 Mar 2011 10:16 |
Last Modified: | 22 Apr 2023 23:56 |
URI: | https://ueaeprints.uea.ac.uk/id/eprint/26458 |
DOI: | 10.1006/csla.1995.0001 |