Theobald, B, Bangham, JA, Matthews, I and Cawley, GC (2001) Visual speech synthesis using statistical models of shape and appearance. In: International Conference on Auditory-Visual Speech Processing (AVSP-2001), 2001-09-07 - 2001-09-09.
Full text not available from this repository.

Abstract
In this paper we present preliminary results of work towards a video-realistic visual speech synthesizer based on statistical models of shape and appearance. A sequence of images corresponding to an utterance is formed by concatenation of synthesis units (in this case triphones) from a pre-recorded inventory. Initial work has concentrated on a compact representation of human faces, accommodating an extensive visual speech corpus without incurring excessive storage costs. The minimal set of control parameters of a combined appearance model is selected according to formal subjective testing. We also present two methods used to build statistical models that account for the perceptually important regions of the face.
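The combined appearance model described in the abstract is commonly built by applying PCA separately to shape (landmark) and appearance (texture) vectors, then applying PCA once more to the concatenated parameter vectors to obtain a single compact parameter set. The sketch below is a minimal, hypothetical illustration of that pipeline; the `pca`, `combined_model` names, the toy data, and the variance threshold are assumptions, not the authors' implementation.

```python
import numpy as np

def pca(X, var_kept=0.95):
    """Return the mean and the principal components retaining var_kept variance.

    Hypothetical helper, not from the paper: PCA via SVD of centred data.
    """
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    var = s ** 2 / (len(X) - 1)
    k = int(np.searchsorted(np.cumsum(var) / var.sum(), var_kept)) + 1
    return mean, Vt[:k]

def combined_model(shapes, textures, w=1.0, var_kept=0.95):
    """Build a combined shape+appearance model.

    w weights shape against texture parameters (an assumed free parameter).
    """
    s_mean, s_basis = pca(shapes, var_kept)
    t_mean, t_basis = pca(textures, var_kept)
    # Project training data into each sub-model, concatenate, re-apply PCA.
    b_s = (shapes - s_mean) @ s_basis.T * w
    b_t = (textures - t_mean) @ t_basis.T
    b = np.hstack([b_s, b_t])
    c_mean, c_basis = pca(b, var_kept)
    return c_mean, c_basis, b

# Toy data: 50 training faces, 20 landmark coordinates, 100 pixels each.
rng = np.random.default_rng(0)
shapes = rng.normal(size=(50, 20))
textures = rng.normal(size=(50, 100))
c_mean, c_basis, b = combined_model(shapes, textures)
print(c_basis.shape[0], "combined parameters summarise", b.shape[1], "sub-model parameters")
```

The second PCA is what yields the "minimal set of control parameters" the abstract refers to; selecting how many to keep by formal subjective testing (rather than a variance threshold) is the paper's contribution.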
| Item Type: | Conference or Workshop Item (Paper) |
|---|---|
| Faculty \ School: | Faculty of Science > School of Computing Sciences |
| UEA Research Groups: | Faculty of Science > Research Groups > Computational Biology; Faculty of Science > Research Groups > Visual Computing and Signal Processing; Faculty of Science > Research Groups > Data Science and AI; Faculty of Science > Research Groups > Centre for Ocean and Atmospheric Sciences |
| Depositing User: | Vishal Gautam |
| Date Deposited: | 25 Aug 2011 14:24 |
| Last Modified: | 29 Jan 2025 00:03 |
| URI: | https://ueaeprints.uea.ac.uk/id/eprint/21910 |