Speech-Driven Conversational Agents using Conditional Flow-VAEs

Taylor, Sarah, Windle, Jonathan, Greenwood, David and Matthews, Iain (2021) Speech-Driven Conversational Agents using Conditional Flow-VAEs. In: CVMP '21: European Conference on Visual Media Production. UNSPECIFIED, pp. 1-9.

Full text not available from this repository.


Automatic control of conversational agents has applications ranging from animation, through human-computer interaction, to robotics. In interactive communication, an agent must move to express its own discourse and also react naturally to incoming speech. In this paper we propose a Flow Variational Autoencoder (Flow-VAE) deep learning architecture for transforming conversational speech to body gesture during both speaking and listening. The model uses a normalising flow to perform variational inference in an autoencoder framework, which yields a more expressive distribution than the Gaussian approximation of conventional variational autoencoders. Our model is non-deterministic, so it can produce a variety of plausible gestures for the same speech. Our evaluation demonstrates that our approach produces expressive body motion that is close to the ground truth while using a fraction of the trainable parameters of the previous state of the art.
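
As a rough illustration of how a normalising flow can turn a Gaussian sample into a sample from a more flexible distribution, the sketch below applies a few planar flow steps in NumPy and tracks the log-density with the change-of-variables formula. It is only a minimal sketch under stated assumptions: the record does not say which flow the paper uses, so the planar flow is an illustrative choice, the speech conditioning and the encoder and decoder are omitted, and the names planar_flow and gaussian_log_prob and all parameter values are hypothetical.

```python
import numpy as np

def planar_flow(z, u, w, b):
    """Apply one planar flow step f(z) = z + u_hat * tanh(w.z + b).

    z has shape (n, d); returns the transformed batch and the
    log|det Jacobian| of the step for each sample.
    """
    # Reparameterise u so that w.u_hat > -1, which keeps the map invertible.
    wu = w @ u
    u_hat = u + (-1.0 + np.log1p(np.exp(wu)) - wu) * w / (w @ w)
    a = np.tanh(z @ w + b)                        # shape (n,)
    z_new = z + np.outer(a, u_hat)                # shape (n, d)
    psi = (1.0 - a ** 2)[:, None] * w             # derivative term, shape (n, d)
    log_det = np.log(np.abs(1.0 + psi @ u_hat))   # shape (n,)
    return z_new, log_det

def gaussian_log_prob(z):
    """Log-density of a standard Gaussian for each row of z."""
    d = z.shape[1]
    return -0.5 * (np.sum(z ** 2, axis=1) + d * np.log(2.0 * np.pi))

rng = np.random.default_rng(0)
d = 2                                   # latent size (illustrative assumption)
z0 = rng.standard_normal((5, d))        # samples from the Gaussian base
log_q = gaussian_log_prob(z0)

z = z0
for _ in range(3):                      # three flow steps with random parameters
    u = rng.standard_normal(d)
    w = rng.standard_normal(d)
    b = rng.standard_normal()
    z, log_det = planar_flow(z, u, w, b)
    log_q = log_q - log_det             # change of variables

print(z.shape, log_q.shape)
```

Drawing a fresh z0 for the same input is what makes a model of this kind non-deterministic: each draw is mapped to a different latent sample, which a decoder could then turn into a different output.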

Item Type: Book Section
Uncontrolled Keywords: speech animation, normalising flows, conversational agents, variational autoencoders
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Interactive Graphics and Audio
Depositing User: LivePure Connector
Date Deposited: 26 Jul 2022 14:30
Last Modified: 20 Jun 2023 15:01
URI: https://ueaeprints.uea.ac.uk/id/eprint/86886
