On Estimation of a Speaker's Confusion Matrix from Sparse Data

Cox, Stephen (2008) On Estimation of a Speaker's Confusion Matrix from Sparse Data. In: 9th Annual Conference of the International Speech Communication Association (INTERSPEECH), 2008-09-22 - 2008-09-26.

Full text not available from this repository.


Confusion matrices have been widely used to increase the accuracy of speech recognisers, but usually a mean confusion matrix, averaged over many speakers, is used. However, analysis shows that confusion matrices for individual speakers vary considerably, so there is benefit in estimating a confusion matrix for each speaker. Unfortunately, there is rarely enough data from a single speaker to make reliable estimates. We present a technique for estimating the elements of a speaker's confusion matrix given only sparse data from that speaker. It uses non-negative matrix factorisation to find structure within confusion matrices, and this structure is exploited to make improved estimates. Results show that, under certain conditions, this technique can give estimates that are as good as those obtained with twice the number of utterances available from the speaker.
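
The general idea in the abstract can be illustrated with a minimal sketch. Because the full text is not available here, this is not the paper's algorithm: it assumes each training speaker's row-normalised confusion matrix is flattened into one row of a non-negative matrix, factorises that matrix with standard multiplicative-update NMF, and then smooths a new speaker's sparse matrix by projecting it onto the learned basis. The function names (`nmf`, `project`, `row_normalise`, `estimate_confusion`), the rank, and the iteration counts are all hypothetical choices made for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factorise non-negative V (speakers x K*K) as W @ H.

    Uses multiplicative updates for the squared-error objective.
    This is an assumed, generic NMF variant, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def project(v, H, n_iter=200, eps=1e-9):
    """Find non-negative weights w so that w @ H approximates v, with H fixed."""
    w = np.full(H.shape[0], 1.0 / H.shape[0])
    for _ in range(n_iter):
        w *= (v @ H.T) / (w @ H @ H.T + eps)
    return w

def row_normalise(counts, eps=1e-9):
    """Turn a matrix of counts into row-wise relative frequencies."""
    return counts / (counts.sum(axis=1, keepdims=True) + eps)

def estimate_confusion(sparse_counts, H):
    """Smooth a speaker's sparse confusion counts using the basis H (assumed scheme)."""
    K = sparse_counts.shape[0]
    v = row_normalise(sparse_counts).ravel()
    w = project(v, H)
    # Reconstruct from the basis, then renormalise each row to sum to one.
    return row_normalise((w @ H).reshape(K, K))
```

In this sketch, `H` would be learned once from speakers with plentiful data, and `estimate_confusion` would then be called with the count matrix of a speaker for whom only a few utterances are available. Rows with no observations at all still receive an estimate, because the basis vectors carry structure shared across speakers.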

Item Type: Conference or Workshop Item (Paper)
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Interactive Graphics and Audio; Faculty of Science > Research Groups > Smart Emerging Technologies
Depositing User: Nicola Talbot
Date Deposited: 14 Mar 2011 08:52
Last Modified: 22 Apr 2023 02:44
URI: https://ueaeprints.uea.ac.uk/id/eprint/26019
