A deep learning approach for generalized speech animation

Taylor, Sarah; Kim, Taehwan; Yue, Yisong; Mahler, Moshe; Krahe, James; Garcia Rodriguez, Anastasio; Hodgins, Jessica; Matthews, Iain

doi:10.1145/3072959.3073699

Taylor, Sarah, Kim, Taehwan, Yue, Yisong, Mahler, Moshe, Krahe, James, Garcia Rodriguez, Anastasio, Hodgins, Jessica and Matthews, Iain (2017) A deep learning approach for generalized speech animation. ACM Transactions on Graphics, 36 (4). ISSN 0730-0301

PDF (Accepted manuscript) - Accepted Version

Abstract

We introduce a simple and effective deep learning approach to automatically generate natural-looking speech animation that synchronizes to input speech. Our approach uses a sliding window predictor that learns arbitrary nonlinear mappings from phoneme label input sequences to mouth movements in a way that accurately captures natural motion and visual coarticulation effects. Our deep learning approach enjoys several attractive properties: it runs in real time, requires minimal parameter tuning, generalizes well to novel input speech sequences, is easily edited to create stylized and emotional speech, and is compatible with existing animation retargeting approaches. One important focus of our work is to develop an effective approach for speech animation that can be easily integrated into existing production pipelines. We provide a detailed description of our end-to-end approach, including machine learning design decisions. Generalized speech animation results are demonstrated over a wide range of animation clips on a variety of characters and voices, including singing and foreign language input. Our approach can also generate on-demand speech animation in real time from user speech input.
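
To make the sliding window idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes one phoneme label per frame, a single scalar mouth parameter per frame, an odd window width, and frame-wise averaging of overlapping window outputs. The names `sliding_window_predict` and `mean_predictor` are hypothetical, and `mean_predictor` is a toy stand-in for a trained network.

```python
from typing import Callable, List, Sequence

# Hypothetical types: a window of integer phoneme labels goes in,
# one scalar mouth parameter per frame of that window comes out.
Predictor = Callable[[Sequence[int]], List[float]]


def sliding_window_predict(
    labels: Sequence[int],
    predictor: Predictor,
    width: int,
    pad_label: int = 0,
) -> List[float]:
    """Slide a window over per-frame labels and average overlapping outputs.

    Assumption for this sketch: the predictor returns `width` values, one
    for each frame covered by the window it was given.
    """
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd number")

    n = len(labels)
    half = width // 2
    # Pad both ends so that every frame can sit at the centre of a window.
    padded = [pad_label] * half + list(labels) + [pad_label] * half

    sums = [0.0] * n
    counts = [0] * n
    for center in range(n):
        window = padded[center:center + width]
        outputs = predictor(window)
        for offset, value in enumerate(outputs):
            frame = center - half + offset
            if 0 <= frame < n:  # ignore outputs that fall on padding
                sums[frame] += value
                counts[frame] += 1

    # Each frame is covered by at least the window centred on it.
    return [s / c for s, c in zip(sums, counts)]


def mean_predictor(window: Sequence[int]) -> List[float]:
    """Toy stand-in for a learned model: every frame gets the window mean."""
    mean = sum(window) / len(window)
    return [mean] * len(window)


if __name__ == "__main__":
    frames = [3, 3, 3, 3, 3]
    print(sliding_window_predict(frames, mean_predictor, width=3))
```

Because each output frame depends on the labels around it, a predictor used this way can in principle reflect context effects such as coarticulation; replacing the toy predictor with a learned model is where the actual method would differ from this sketch.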

Item Type:	Article
Faculty \ School:	Faculty of Science > School of Computing Sciences
UEA Research Groups:	Faculty of Science > Research Groups > Smart Emerging Technologies (former - to 2025); Faculty of Science > Research Groups > Visual Computing and Signal Processing (former - to 2025)
Depositing User:	Pure Connector
Date Deposited:	22 Sep 2017 05:07
Last Modified:	14 May 2026 11:32
URI:	https://ueaeprints.uea.ac.uk/id/eprint/64948
DOI:	10.1145/3072959.3073699
