The Effect of Real-Time Constraints on Automatic Speech Animation

Websdale, Danny, Taylor, Sarah and Milner, Ben (2018) The Effect of Real-Time Constraints on Automatic Speech Animation. In: Proceedings of Interspeech 2018. UNSPECIFIED, pp. 2479-2483.

Abstract

Machine learning has previously been applied successfully to speech-driven facial animation. To account for carry-over and anticipatory coarticulation, a common approach is to predict the facial pose using a symmetric window of acoustic speech that includes both past and future context. Using future context limits this approach for animating the faces of characters in real-time and networked applications, such as online gaming. An acceptable latency for conversational speech is 200ms, and network transmission times will typically consume a significant part of this. Consequently, we consider asymmetric windows by investigating the extent to which decreasing the future context affects the quality of predicted animation using both deep neural networks (DNNs) and bi-directional LSTM recurrent neural networks (BiLSTMs). Specifically, we investigate future contexts from 170ms (fully-symmetric) to 0ms (fully-asymmetric …
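
To make the idea of symmetric versus asymmetric context windows concrete, the following is a minimal sketch of how such windows might be assembled from a sequence of acoustic feature frames. It is not the authors' implementation. The 10 ms frame shift, the edge-padding strategy, and the names `FRAME_SHIFT_MS`, `ms_to_frames` and `asymmetric_windows` are all assumptions introduced for illustration; the abstract does not specify them.

```python
from typing import List, Sequence

# Assumed hop between acoustic frames; not stated in the abstract.
FRAME_SHIFT_MS = 10


def ms_to_frames(context_ms: int, frame_shift_ms: int = FRAME_SHIFT_MS) -> int:
    """Convert a context length in milliseconds to a whole number of frames."""
    return context_ms // frame_shift_ms


def asymmetric_windows(
    frames: Sequence[Sequence[float]], past: int, future: int
) -> List[List[float]]:
    """Stack `past` earlier frames, the current frame, and `future` later frames.

    Indices outside the utterance are clamped to the first or last frame
    (edge padding, an assumption made for this sketch).
    """
    last = len(frames) - 1
    windows = []
    for t in range(len(frames)):
        window: List[float] = []
        for offset in range(-past, future + 1):
            idx = min(max(t + offset, 0), last)
            window.extend(frames[idx])
        windows.append(window)
    return windows


# Toy utterance: 5 frames of 1-dimensional "features" (hypothetical data).
frames = [[0.0], [1.0], [2.0], [3.0], [4.0]]

# Symmetric window: equal past and future context.
symmetric = asymmetric_windows(frames, past=2, future=2)

# Fully asymmetric window: past context only, no lookahead.
causal = asymmetric_windows(frames, past=2, future=0)
```

Under the assumed 10 ms frame shift, `ms_to_frames(170)` gives 17 frames of lookahead, so a predictor fed such windows would have to wait 170 ms before producing each pose. That delay alone would use most of a 200 ms latency budget. Setting `future=0` removes the lookahead delay entirely, at whatever cost to animation quality the paper measures.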

Item Type: Book Section
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Interactive Graphics and Audio
Faculty of Science > Research Groups > Smart Emerging Technologies
Faculty of Science > Research Groups > Data Science and AI
Depositing User: LivePure Connector
Date Deposited: 12 Aug 2019 11:30
Last Modified: 10 Dec 2024 01:11
URI: https://ueaeprints.uea.ac.uk/id/eprint/71940
DOI: 10.21437/Interspeech.2018-2066
