Theobald, Barry-John and Wilkinson, Nicholas (2007) A Real-Time Speech-Driven Talking Head using Active Appearance Models. In: International Conference on Audio-Visual Speech Processing (AVSP), 2007-08-31 - 2007-09-03, Kasteel Groenendaal.
Full text not available from this repository.

Abstract
In this paper we describe a real-time speech-driven method for synthesising realistic video sequences of a subject enunciating arbitrary phrases. In an offline training phase an active appearance model (AAM) is constructed from hand-labelled images and is used to encode the face of a subject reciting a few training sentences. Canonical correlation analysis (CCA) coupled with linear regression is then used to model the relationship between auditory and visual features; this model is later used to predict visual features from the auditory features of novel utterances. We present results from experiments conducted to determine: 1) the suitability of several auditory features for use in an AAM-based speech-driven talking head; 2) the effect of training-set size on the correlation between the auditory and visual features; 3) the influence of context on the degree of correlation; and 4) the appropriate window size over which the auditory features should be calculated. This approach shows promise, and a longer-term goal is to develop a fully expressive, three-dimensional talking head.
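The audio-to-visual mapping the abstract describes can be sketched with scikit-learn. The following is a minimal sketch, not the paper's implementation: the feature dimensionalities, the number of canonical components, and the random arrays standing in for real audio features and AAM parameters are all illustrative assumptions, as is the choice to regress the AAM parameters directly on the audio canonical variates.

```python
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Stand-in training data (the paper uses features extracted from a few
# recorded training sentences): N aligned audio/video frames, with
# hypothetical dimensionalities for the audio features and AAM parameters.
N, d_audio, d_aam = 1000, 13, 20
A_train = rng.standard_normal((N, d_audio))   # e.g. MFCC-like audio vectors
V_train = rng.standard_normal((N, d_aam))     # AAM appearance parameters

# CCA finds pairs of maximally correlated directions in the two spaces.
cca = CCA(n_components=8)
cca.fit(A_train, V_train)

# Couple CCA with linear regression: regress the visual (AAM) parameters
# on the audio canonical variates.
A_scores = cca.transform(A_train)
reg = LinearRegression().fit(A_scores, V_train)

# Synthesis: map the audio features of a novel utterance to predicted
# AAM parameters, which would then drive the rendered face.
A_novel = rng.standard_normal((50, d_audio))
V_pred = reg.predict(cca.transform(A_novel))
print(V_pred.shape)  # (50, 20)
```

An equivalent formulation would regress between the paired canonical variates and project back into AAM space; regressing directly on the audio variates, as above, keeps the prediction step a single linear map per frame, which suits the real-time constraint.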
| Item Type: | Conference or Workshop Item (Paper) |
| --- | --- |
| Faculty \ School: | Faculty of Science > School of Computing Sciences |
| UEA Research Groups: | Faculty of Science > Research Groups > Interactive Graphics and Audio |
| Related URLs: | |
| Depositing User: | Vishal Gautam |
| Date Deposited: | 19 May 2011 07:58 |
| Last Modified: | 20 Jun 2023 14:32 |
| URI: | https://ueaeprints.uea.ac.uk/id/eprint/21912 |
| DOI: | |