A Real-Time Speech-Driven Talking Head using Active Appearance Models

Theobald, Barry-John and Wilkinson, Nicholas (2007) A Real-Time Speech-Driven Talking Head using Active Appearance Models. In: Proceedings of the International Conference on Audio-visual Speech Processing (AVSP), 2007-08-31 - 2007-09-03.



In this paper we describe a real-time speech-driven method for synthesising realistic video sequences of a subject enunciating arbitrary phrases. In an offline training phase an active appearance model (AAM) is constructed from hand-labelled images and is used to encode the face of a subject reciting a few training sentences. Canonical correlation analysis (CCA) coupled with linear regression is then used to model the relationship between auditory and visual features, which is later used to predict visual features from the auditory features for novel utterances. We present results from experiments conducted: 1) to determine the suitability of several auditory features for use in an AAM-based speech-driven talking head, 2) to determine the effect of the size of the training set on the correlation between the auditory and visual features, 3) to determine the influence of context on the degree of correlation, and 4) to determine the appropriate window size from which the auditory features should be calculated. This approach shows promise, and a longer-term goal is to develop a fully expressive, three-dimensional talking head.
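The core mapping the abstract describes — CCA to find maximally correlated auditory/visual directions, then linear regression to predict visual (AAM) parameters from auditory features — can be sketched as follows. This is a minimal NumPy illustration under assumed inputs (rows are synchronised frames of auditory features `A` and visual features `V`); the paper's actual features and formulation may differ, and all names here are hypothetical.

```python
import numpy as np

def fit_cca_regression(A, V, k=2):
    """Fit a CCA-plus-linear-regression map from auditory features A
    (n_frames x d_a) to visual features V (n_frames x d_v).
    Illustrative sketch, not the authors' exact method."""
    # Center both feature sets
    a_mean, v_mean = A.mean(axis=0), V.mean(axis=0)
    Ac, Vc = A - a_mean, V - v_mean
    # Thin SVDs give whitened (decorrelated, unit-variance) bases
    Ua, Sa, VaT = np.linalg.svd(Ac, full_matrices=False)
    Uv, Sv, VvT = np.linalg.svd(Vc, full_matrices=False)
    # Canonical directions: SVD of the whitened cross-correlation matrix
    U, canon_corr, WT = np.linalg.svd(Ua.T @ Uv)
    # Projection of (centered) auditory features onto top-k canonical variates
    Wa = VaT.T @ np.diag(1.0 / Sa) @ U[:, :k]
    # Linear regression from canonical auditory coordinates to visual features
    B, *_ = np.linalg.lstsq(Ac @ Wa, Vc, rcond=None)
    return a_mean, v_mean, Wa, B

def predict_visual(A_new, a_mean, v_mean, Wa, B):
    """Predict visual features for novel auditory features."""
    return (A_new - a_mean) @ Wa @ B + v_mean
```

Keeping only the top-k canonical variates acts as a regularised, low-dimensional bottleneck between the two feature spaces, which is the usual motivation for pairing CCA with regression rather than regressing one full feature space directly onto the other.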

Item Type: Conference or Workshop Item (Paper)
Faculty \ School: Faculty of Science > School of Computing Sciences
Depositing User: Vishal Gautam
Date Deposited: 19 May 2011 07:58
Last Modified: 20 Aug 2021 23:41
URI: https://ueaeprints.uea.ac.uk/id/eprint/21912
