Predictive Speaker Adaptation in Speech Recognition

Cox, Stephen (1995) Predictive Speaker Adaptation in Speech Recognition. Computer Speech and Language, 9 (1). pp. 1-17.

    Abstract

    A major problem with most speaker adaptation schemes is that they rely on the speaker providing at least one example of each acoustic unit (word, phone, triphone, etc.) in the vocabulary in order to adapt the corresponding model. Rapid adaptation is therefore difficult to achieve, and some sounds may never be adapted because they are never heard. This paper presents a technique for adapting all the speech models to a new speaker's voice when the speaker has provided only an incomplete subset of the vocabulary. The technique uses the training set to estimate correlations between sounds. Given some sounds from a new speaker at recognition time, these correlations are used to estimate the unheard sounds, and these estimates are then used to adapt the speech models. The technique was applied to a database of 104 speakers speaking the English alphabet. When speakers spoke half of the vocabulary for enrollment prior to recognition, the technique gave a 78% decrease in error.
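
The idea in the abstract, predicting a new speaker's unheard sounds from the sounds they did provide by using correlations learned on the training set, can be illustrated with a minimal sketch. It assumes that each sound is summarised by a fixed-length feature vector per speaker and that the relationship is captured by a ridge-regularised linear regression. These assumptions, the function names `fit_predictor`, `predict_unheard` and `synthetic_demo`, and the synthetic data are all hypothetical; this is not the paper's actual formulation or its data.

```python
import numpy as np


def fit_predictor(heard_train, unheard_train, ridge=1e-3):
    """Fit a linear map from heard-sound features to unheard-sound features.

    heard_train:   (n_speakers, d_heard) array, one row per training speaker
    unheard_train: (n_speakers, d_unheard) array, same speakers
    ridge:         small regulariser (an assumption, keeps the solve stable)
    """
    mu_h = heard_train.mean(axis=0)
    mu_u = unheard_train.mean(axis=0)
    H = heard_train - mu_h
    U = unheard_train - mu_u
    # Solve (H^T H + ridge * I) W = H^T U for the regression weights W.
    gram = H.T @ H + ridge * np.eye(H.shape[1])
    W = np.linalg.solve(gram, H.T @ U)
    return mu_h, mu_u, W


def predict_unheard(heard_new, mu_h, mu_u, W):
    """Estimate unheard-sound features for new speakers from their heard sounds."""
    return mu_u + (heard_new - mu_h) @ W


def synthetic_demo(seed=0):
    """Compare the prediction with a speaker-independent baseline on toy data."""
    rng = np.random.default_rng(seed)
    # 104 speakers as in the abstract; every other number here is made up.
    n_speakers, d_latent, d_heard, d_unheard = 104, 2, 6, 4
    speaker = rng.normal(size=(n_speakers, d_latent))
    A = rng.normal(size=(d_latent, d_heard))
    B = rng.normal(size=(d_latent, d_unheard))
    heard = speaker @ A + 0.1 * rng.normal(size=(n_speakers, d_heard))
    unheard = speaker @ B + 0.1 * rng.normal(size=(n_speakers, d_unheard))

    train, test = slice(0, 80), slice(80, None)
    mu_h, mu_u, W = fit_predictor(heard[train], unheard[train])
    predicted = predict_unheard(heard[test], mu_h, mu_u, W)

    # Baseline: use the training-set mean for every new speaker (no adaptation).
    baseline_mse = float(np.mean((unheard[test] - mu_u) ** 2))
    predicted_mse = float(np.mean((unheard[test] - predicted) ** 2))
    return baseline_mse, predicted_mse
```

Calling `synthetic_demo()` returns the mean squared error of the unadapted baseline and of the predicted features for the held-out toy speakers. In a real system the predicted features would then be used to adapt the models for the unheard sounds, a step this sketch leaves out.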

    Item Type: Article
    Faculty \ School: Faculty of Science > School of Computing Sciences
    Depositing User: Stephen Cox
    Date Deposited: 17 Mar 2011 10:16
    Last Modified: 25 Jul 2018 08:12
    URI: https://ueaeprints.uea.ac.uk/id/eprint/26458
    DOI: 10.1006/csla.1995.0001