Theobald, Barry-John and Matthews, Iain (2012) Relating objective and subjective performance measures for AAM-based visual speech synthesizers. IEEE Transactions on Audio, Speech and Language Processing, 20 (8). pp. 2378-2387.
Abstract
We compare two approaches for synthesizing visual speech using Active Appearance Models (AAMs): one that takes acoustic features as input, and one that takes a phonetic transcription as input. Both synthesizers are trained on the same data, and performance is measured using both objective and subjective testing. We investigate the impact of likely sources of error in the synthesized visual speech by introducing typical errors into real visual speech sequences and subjectively measuring the perceived degradation. When only a small region (e.g. a single syllable) of ground-truth visual speech is incorrect, we find that the subjective score for the entire sequence is lower than that of sequences generated by our synthesizers. This observation motivates further consideration of an often-ignored issue: to what extent are subjective measures correlated with objective measures of performance? Significantly, we find that the most commonly used objective measures of performance are not necessarily the best indicators of viewer perception of quality. We empirically evaluate alternatives and show that the cost of a dynamic time warp of synthesized visual speech parameters to the respective ground-truth parameters is a better indicator of subjective quality.
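As a rough illustration of the dynamic time warp cost mentioned in the abstract, the sketch below computes the cumulative cost of the best alignment between a synthesized and a ground-truth sequence of parameter vectors. It is not the authors' implementation: the abstract does not state the local distance, step pattern, or any normalization, so the Euclidean frame distance, the three standard step directions, and the unnormalized cost are all assumptions, and the function name `dtw_cost` is hypothetical.

```python
import math


def dtw_cost(synth, truth):
    """Cumulative cost of the cheapest DTW alignment of two vector sequences.

    Assumptions (not taken from the paper): Euclidean distance between
    frames, steps of (1,0), (0,1) and (1,1), and no length normalization.
    """
    n, m = len(synth), len(truth)
    # acc[i][j] holds the minimal cumulative cost of aligning
    # synth[:i] with truth[:j]; row and column 0 are boundary cells.
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(synth[i - 1], truth[j - 1])
            acc[i][j] = local + min(
                acc[i - 1][j],      # synthesized frame repeats a truth frame
                acc[i][j - 1],      # truth frame repeats a synthesized frame
                acc[i - 1][j - 1],  # both sequences advance together
            )
    return acc[n][m]


# Toy two-dimensional parameter trajectories (made-up values).
truth = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
delayed = [[0.0, 0.0], [1.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
wrong = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]

cost_delayed = dtw_cost(delayed, truth)
cost_wrong = dtw_cost(wrong, truth)
```

In this toy case the delayed trajectory differs from the ground truth only in timing, so the warp absorbs the difference, whereas the trajectory with an incorrect frame still incurs a cost. A frame-by-frame error measure would penalize both.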
Item Type: | Article |
---|---|
Uncontrolled Keywords: | visual speech synthesis,visual speech,active appearance models,evaluation |
Faculty \ School: | Faculty of Science > School of Computing Sciences |
UEA Research Groups: | Faculty of Science > Research Groups > Interactive Graphics and Audio |
Depositing User: | Barry-John Theobald |
Date Deposited: | 08 May 2013 08:34 |
Last Modified: | 24 Jan 2024 01:19 |
URI: | https://ueaeprints.uea.ac.uk/id/eprint/41152 |
DOI: | 10.1109/TASL.2012.2202651 |