Visual speech synthesis using statistical models of shape and appearance

Theobald, B, Bangham, JA, Matthews, I and Cawley, GC ORCID: https://orcid.org/0000-0002-4118-9095 (2001) Visual speech synthesis using statistical models of shape and appearance. In: International Conference on Auditory-Visual Speech Processing (AVSP-2001), 2001-09-07 - 2001-09-09.

Full text not available from this repository.

Abstract

In this paper, we present preliminary results of work towards a video-realistic visual speech synthesizer based on statistical models of shape and appearance. A sequence of images corresponding to an utterance is formed by concatenating synthesis units (in this case, triphones) drawn from a pre-recorded inventory. Initial work has concentrated on a compact representation of human faces that accommodates an extensive visual speech corpus without incurring excessive storage costs. The minimal set of control parameters for a combined appearance model is selected through formal subjective testing. We also present two methods for building statistical models that account for the perceptually important regions of the face.
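To make the abstract's "combined model of shape and appearance" concrete, the sketch below shows the general PCA-based construction used in active appearance models: separate shape and texture models are built, their parameters are weighted to commensurate scales, and a second PCA yields a single compact parameter set per frame. This is a minimal illustration under assumed synthetic data, not the authors' implementation; all names and the variance threshold are illustrative, and note the paper selects the minimal parameter set by formal subjective testing rather than a variance criterion.

    # Hypothetical sketch of a combined shape-and-appearance model via PCA.
    import numpy as np

    rng = np.random.default_rng(0)

    # Stand-in training data: per-frame shape vectors (flattened 2D landmark
    # coordinates) and appearance vectors (shape-normalised pixel intensities).
    n_frames, n_landmarks, n_pixels = 200, 40, 1000
    shapes = rng.normal(size=(n_frames, 2 * n_landmarks))
    textures = rng.normal(size=(n_frames, n_pixels))

    def pca(X, var_kept=0.95):
        """Mean, principal axes, and per-sample parameters, keeping enough
        modes to explain `var_kept` of the total variance (illustrative)."""
        mean = X.mean(axis=0)
        U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
        var = s**2 / (len(X) - 1)
        k = int(np.searchsorted(np.cumsum(var) / var.sum(), var_kept)) + 1
        params = (X - mean) @ Vt[:k].T
        return mean, Vt[:k], params

    # Independent shape and texture models.
    s_mean, s_axes, b_s = pca(shapes)
    t_mean, t_axes, b_t = pca(textures)

    # Weight shape parameters so shape and texture variances are commensurate,
    # then run a second PCA over the concatenated parameters to obtain one
    # compact set of combined appearance parameters per frame.
    w = np.sqrt(b_t.var() / b_s.var())
    combined = np.hstack([w * b_s, b_t])
    c_mean, c_axes, c = pca(combined)

    print(f"{c.shape[1]} combined parameters describe each frame")

In a synthesizer of the kind described, each frame of a triphone unit is stored as one such combined parameter vector, which is what keeps the pre-recorded inventory compact.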

Item Type: Conference or Workshop Item (Paper)
Faculty \ School: Faculty of Science > School of Computing Sciences

UEA Research Groups: Faculty of Science > Research Groups > Computational Biology
Faculty of Science > Research Groups > Interactive Graphics and Audio
Faculty of Science > Research Groups > Data Science and Statistics
Faculty of Science > Research Groups > Centre for Ocean and Atmospheric Sciences
Related URLs:
Depositing User: Vishal Gautam
Date Deposited: 25 Aug 2011 14:24
Last Modified: 21 Apr 2023 01:50
URI: https://ueaeprints.uea.ac.uk/id/eprint/21910
DOI:
