Speaker Normalisation in the MFCC Domain

Cox, Stephen (2000) Speaker Normalisation in the MFCC Domain. In: 6th International Conference on Spoken Language Processing, 2000-10-16 - 2000-10-20.

Full text not available from this repository.


Several recent publications have shown that vocal tract normalisation (VTN) is a successful method for improving the accuracy of speaker-independent recognisers. We argue that VTN can be implemented in the filterbank domain and propose a model to achieve this. We show how the model can be implemented directly in the MFCC domain, where it may be viewed as a constrained version of maximum likelihood linear regression (MLLR). The parameter estimates produced by the model are in accord with our ideas about how it should operate to perform VTN. Results on a phoneme recognition task show a small improvement in accuracy.
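
The relationship between a filterbank-domain transform and an MFCC-domain transform can be illustrated with a minimal sketch. The paper's actual model is not reproduced here (the full text is not available from this record), so everything below is an assumption made for illustration: the linear-interpolation warp, the single warp factor `alpha`, the use of a square orthonormal DCT with no cepstral truncation, and all function names are hypothetical. The sketch only shows that a linear map W applied to log filterbank energies corresponds to the linear map D W D^T applied to the cepstra, which has the same form as an MLLR-style matrix but is determined by one parameter.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix (n x n); its inverse is its transpose."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * k * (j + 0.5) / n)
                     for j in range(n)])
    return rows

def warp_matrix(n, alpha):
    """Hypothetical filterbank-domain warp (an assumption, not the paper's model).

    Output channel i reads the input at position alpha * i using linear
    interpolation, clamped at the top channel.
    """
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        pos = min(alpha * i, n - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        W[i][lo] += 1.0 - frac
        W[i][hi] += frac
    return W

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def matvec(A, x):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def mfcc_domain_transform(n, alpha):
    """Cepstral-domain matrix A = D W D^T equivalent to warping with W."""
    D = dct_matrix(n)
    return matmul(matmul(D, warp_matrix(n, alpha)), transpose(D))

if __name__ == "__main__":
    n, alpha = 8, 0.9
    log_fbank = [0.5, 1.0, 2.0, 1.5, 0.7, 0.3, 0.2, 0.1]  # made-up values
    D = dct_matrix(n)
    cepstra = matvec(D, log_fbank)
    # Transforming the cepstra directly ...
    direct = matvec(mfcc_domain_transform(n, alpha), cepstra)
    # ... should match warping the filterbank vector and then taking the DCT.
    via_fbank = matvec(D, matvec(warp_matrix(n, alpha), log_fbank))
    print(max(abs(a - b) for a, b in zip(direct, via_fbank)))
```

With truncated cepstra, as used in practical MFCC front ends, the DCT is no longer invertible and the equivalence would hold only approximately; the sketch sidesteps this by keeping all coefficients.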

Item Type: Conference or Workshop Item (Paper)
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Interactive Graphics and Audio; Faculty of Science > Research Groups > Smart Emerging Technologies
Depositing User: Vishal Gautam
Date Deposited: 25 Aug 2011 15:46
Last Modified: 20 Jun 2023 14:35
URI: https://ueaeprints.uea.ac.uk/id/eprint/21751
