Najafian, Maryam, de Marco, Andrea, Cox, Stephen and Russell, Martin (2014) Supervised and Unsupervised Adaptation to Regional Accented Speech using Limited Data for Automatic Speech Recognition. In: Proceedings of Interspeech 2014. UNSPECIFIED, Singapore.
Full text not available from this repository.

Abstract
This paper is concerned with automatic speech recognition (ASR) for accented speech. Given a small amount of speech from a new speaker, is it better to apply speaker adaptation to the baseline, or to use accent identification (AID) to identify the speaker’s accent and select an accent-dependent acoustic model? Three accent-based model selection methods are investigated: using the ‘true’ accent model, and unsupervised model selection using i-Vector and phonotactic-based AID. All three methods outperform the unadapted baseline. Most significantly, AID-based model selection using 43s of speech performs better than unsupervised speaker adaptation, even if the latter uses five times more adaptation data. Combining unsupervised AID-based model selection and speaker adaptation gives an average relative reduction in ASR error rate of up to 47%.
Item Type: | Book Section |
---|---|
Faculty \ School: | Faculty of Science > School of Computing Sciences |
UEA Research Groups: | Faculty of Science > Research Groups > Interactive Graphics and Audio Faculty of Science > Research Groups > Smart Emerging Technologies |
Depositing User: | Pure Connector |
Date Deposited: | 08 Oct 2014 08:48 |
Last Modified: | 20 Jun 2023 14:57 |
URI: | https://ueaeprints.uea.ac.uk/id/eprint/50433 |
DOI: |