Thangthai, Ausdang, Milner, Ben and Taylor, Sarah (2019) Synthesising visual speech using dynamic visemes and deep learning architectures. Computer Speech and Language, 55. pp. 101-119. ISSN 0885-2308
PDF (AAM_Thangthai_etal) - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
This paper proposes and compares a range of methods to improve the naturalness of visual speech synthesis. A feedforward deep neural network (DNN) and many-to-one and many-to-many recurrent neural networks (RNNs) using long short-term memory (LSTM) are considered. Rather than using acoustically derived units of speech, such as phonemes, viseme representations are considered, and we propose using dynamic visemes together with a deep learning framework. The input feature representation to the models is also investigated, and we determine that including wide phoneme and viseme contexts is crucial for predicting realistic lip motions that are sufficiently smooth but not under-articulated. A detailed objective evaluation across a range of system configurations shows that a combined dynamic viseme-phoneme speech unit, paired with a many-to-many encoder-decoder architecture, models visual co-articulation effectively. Subjective preference tests reveal no significant difference between animations produced using this system and those produced using ground truth facial motion taken from the original video. Furthermore, the dynamic viseme system significantly outperforms conventional phoneme-driven speech animation systems.
Item Type: | Article |
---|---|
Uncontrolled Keywords: | deep neural network, dynamic visemes, talking head, visual speech synthesis, software, theoretical computer science, human-computer interaction |
Faculty \ School: | Faculty of Science > School of Computing Sciences |
UEA Research Groups: | Faculty of Science > Research Groups > Interactive Graphics and Audio; Faculty of Science > Research Groups > Smart Emerging Technologies |
Depositing User: | LivePure Connector |
Date Deposited: | 21 Nov 2018 13:30 |
Last Modified: | 20 Apr 2023 04:33 |
URI: | https://ueaeprints.uea.ac.uk/id/eprint/68990 |
DOI: | 10.1016/j.csl.2018.11.003 |