Predictive Speaker Adaptation in Speech recognition

Cox, Stephen (1995) Predictive Speaker Adaptation in Speech recognition. Computer Speech and Language, 9 (1). pp. 1-17.

Full text not available from this repository.


A major problem with most speaker adaptation schemes is that they rely on the speaker providing at least one example of each acoustic unit (word, phone, triphone, etc.) in the vocabulary in order to adapt the appropriate model. Rapid adaptation is therefore difficult to achieve, and some sounds may never be adapted because they are never heard. This paper presents a technique for adapting all the speech models to a new speaker's voice when the speaker has provided examples of only part of the vocabulary. The technique uses the training set to estimate correlations between sounds. Given some sounds from a new speaker at recognition time, these correlations are used to estimate the unheard sounds, and those estimates are then used to adapt the speech models. The technique was applied to a database of 104 speakers speaking the English alphabet. When speakers spoke half of the vocabulary for enrollment prior to recognition, the technique gave a 78% decrease in error.
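
The idea of predicting unheard sounds from heard ones can be illustrated with a minimal sketch. This is not the paper's actual method, whose details are not given in this record. It assumes, purely for illustration, that each sound is summarised by a single scalar parameter per speaker and that a simple least-squares line is fitted for every pair of sounds across the training speakers. The names `fit_line` and `predict_unheard`, the weighting by squared correlation, and the toy data are all hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares line y = a + b*x over training speakers.

    Returns (a, b, r2), where r2 is the squared correlation.
    """
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        # No usable correlation: fall back to the training mean.
        return my, 0.0, 0.0
    b = sxy / sxx
    return my - b * mx, b, (sxy * sxy) / (sxx * syy)


def predict_unheard(train, heard):
    """Estimate parameters of sounds the new speaker has not spoken.

    train: {sound: [value per training speaker]}, same speaker order
           for every sound (hypothetical layout).
    heard: {sound: value} observed from the new speaker.
    Returns {sound: estimate} for every sound absent from `heard`.
    """
    predictions = {}
    for target, ys in train.items():
        if target in heard:
            continue
        num = den = 0.0
        for source, x_new in heard.items():
            a, b, r2 = fit_line(train[source], ys)
            # Assumption: trust strongly correlated sounds more.
            num += r2 * (a + b * x_new)
            den += r2
        predictions[target] = num / den if den > 0 else mean(ys)
    return predictions


# Toy data: three "letters", four training speakers.
train = {"B": [1, 2, 3, 4], "D": [2, 4, 6, 8], "E": [5, 5, 5, 5]}
heard = {"B": 2.5}  # the new speaker has only said "B"
estimates = predict_unheard(train, heard)
```

In this toy case "D" is perfectly correlated with "B" in the training data, so its estimate follows the heard value of "B", while "E" shows no variation across speakers and falls back to its training mean. In a real recogniser the estimates would then be used to adapt the corresponding speech models, a step this sketch leaves out.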

Item Type: Article
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Interactive Graphics and Audio
Faculty of Science > Research Groups > Smart Emerging Technologies
Depositing User: Stephen Cox
Date Deposited: 17 Mar 2011 10:16
Last Modified: 22 Apr 2023 23:56
DOI: 10.1006/csla.1995.0001
