Discovering Dynamic Visemes

Taylor, Sarah (2013) Discovering Dynamic Visemes. Doctoral thesis, University of East Anglia.

This thesis introduces a set of new, dynamic units of visual speech which are learnt
using computer vision and machine learning techniques. Rather than clustering
phoneme labels as is done traditionally, the visible articulators of a speaker are
tracked and automatically segmented into short, visually intuitive speech gestures
based on the dynamics of the articulators. The segmented gestures are clustered
into dynamic visemes, such that movements relating to the same visual function
appear within the same cluster. Speech animation can then be generated on any
facial model by mapping a phoneme sequence to a sequence of dynamic visemes,
and stitching together an example of each viseme in the sequence. Dynamic visemes
model coarticulation and maintain the dynamics of the original speech, so simple
blending at the concatenation boundaries ensures a smooth transition. The efficacy
of dynamic visemes for computer animation is formally evaluated both objectively
and subjectively, and compared with traditional phoneme-to-static-lip-pose interpolation.
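
The stitching step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes each dynamic viseme example is stored as an array of articulator parameters with one row per video frame, and it uses a plain linear cross-fade as a stand-in for the "simple blending" applied at concatenation boundaries. The function name `stitch_gestures` and the `overlap` parameter are hypothetical.

```python
import numpy as np

def stitch_gestures(segments, overlap=3):
    """Concatenate gesture trajectories with a linear cross-fade at each join.

    segments: list of arrays shaped (frames, params); an assumed representation.
    overlap:  number of frames shared by neighbouring gestures (hypothetical).
    """
    if overlap < 1:
        raise ValueError("overlap must be at least 1 frame")
    out = np.asarray(segments[0], dtype=float)
    # Blend weights rise from 0 to 1 across the overlap region.
    w = np.linspace(0.0, 1.0, overlap)[:, None]
    for seg in segments[1:]:
        seg = np.asarray(seg, dtype=float)
        if len(out) < overlap or len(seg) < overlap:
            raise ValueError("every gesture must span at least `overlap` frames")
        # Fade out the tail of the running trajectory while fading in the next gesture.
        blended = (1.0 - w) * out[-overlap:] + w * seg[:overlap]
        out = np.vstack([out[:-overlap], blended, seg[overlap:]])
    return out

# Toy example: two gestures described by two articulator parameters each.
closed = np.zeros((5, 2))
opened = np.ones((6, 2))
trajectory = stitch_gestures([closed, opened], overlap=3)
```

Because each join shares `overlap` frames, the result has `5 + 6 - 3 = 8` frames here. A production system would likely choose the overlap per boundary and might blend in a different parameter space; those choices are left open in this sketch.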

Item Type: Thesis (Doctoral)
Faculty \ School: Faculty of Science > School of Computing Sciences
Date Deposited: 05 Mar 2014 11:36
Last Modified: 02 May 2018 00:38

