Supervised and Unsupervised Adaptation to Regional Accented Speech using Limited Data for Automatic Speech Recognition

Najafian, Maryam, de Marco, Andrea, Cox, Stephen and Russell, Martin (2014) Supervised and Unsupervised Adaptation to Regional Accented Speech using Limited Data for Automatic Speech Recognition. In: Proceedings of Interspeech 2014. UNSPECIFIED, Singapore.

Full text not available from this repository.

Abstract

This paper is concerned with automatic speech recognition (ASR) for accented speech. Given a small amount of speech from a new speaker, is it better to apply speaker adaptation to the baseline, or to use accent identification (AID) to identify the speaker’s accent and select an accent-dependent acoustic model? Three accent-based model selection methods are investigated: using the ‘true’ accent model, and unsupervised model selection using i-Vector and phonotactic-based AID. All three methods outperform the unadapted baseline. Most significantly, AID-based model selection using 43s of speech performs better than unsupervised speaker adaptation, even if the latter uses five times more adaptation data. Combining unsupervised AID-based model selection and speaker adaptation gives an average relative reduction in ASR error rate of up to 47%.

Item Type: Book Section
Faculty \ School: Faculty of Science > School of Computing Sciences
Depositing User: Pure Connector
Date Deposited: 08 Oct 2014 08:48
Last Modified: 19 Jul 2020 23:54
URI: https://ueaeprints.uea.ac.uk/id/eprint/50433
