Speech-Driven Conversational Agents using Conditional Flow-VAEs

Taylor, Sarah; Windle, Jonathan; Greenwood, David; Matthews, Iain

Taylor, Sarah, Windle, Jonathan, Greenwood, David and Matthews, Iain (2021) Speech-Driven Conversational Agents using Conditional Flow-VAEs. In: CVMP '21: European Conference on Visual Media Production. UNSPECIFIED, pp. 1-9.

Full text not available from this repository.

Abstract

Automatic control of conversational agents has applications from animation, through human-computer interaction, to robotics. In interactive communication, an agent must move to express its own discourse, and also react naturally to incoming speech. In this paper we propose a Flow Variational Autoencoder (Flow-VAE) deep learning architecture for transforming conversational speech to body gesture, during both speaking and listening. The model uses a normalising flow to perform variational inference in an autoencoder framework and is a more expressive distribution than the Gaussian approximation of conventional variational autoencoders. Our model is non-deterministic, so can produce variations of plausible gestures for the same speech. Our evaluation demonstrates that our approach produces expressive body motion that is close to the ground truth using a fraction of the trainable parameters compared with previous state of the art.
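As a rough illustration of the idea in the abstract (a normalising flow making the approximate posterior more expressive than a plain Gaussian), the sketch below pushes a reparameterised Gaussian sample through a stack of planar flow steps and tracks its log-density with the change-of-variables formula. It is not the authors' implementation: the planar flow, the function names, and the parameter values are assumptions chosen for brevity. In a conditional model, `mu` and `log_var` would be produced by an encoder that sees the speech features.

```python
import math
import random

def sample_gaussian(mu, log_var, rng):
    """Reparameterised sample z0 = mu + sigma * eps, with its log-density."""
    z, log_q = [], 0.0
    for m, lv in zip(mu, log_var):
        eps = rng.gauss(0.0, 1.0)
        z.append(m + math.exp(0.5 * lv) * eps)
        log_q += -0.5 * (math.log(2.0 * math.pi) + lv + eps * eps)
    return z, log_q

def planar_flow(z, u, w, b):
    """One planar flow step f(z) = z + u * tanh(w.z + b) (assumed flow type)."""
    a = sum(wi * zi for wi, zi in zip(w, z)) + b
    h = math.tanh(a)
    z_new = [zi + ui * h for zi, ui in zip(z, u)]
    # log|det df/dz| = log|1 + (u.w) * (1 - tanh^2(a))|
    uw = sum(ui * wi for ui, wi in zip(u, w))
    log_det = math.log(abs(1.0 + uw * (1.0 - h * h)))
    return z_new, log_det

def flow_posterior_sample(mu, log_var, layers, rng):
    """Sample from the flow-transformed posterior and return its log-density."""
    z, log_q = sample_gaussian(mu, log_var, rng)
    for u, w, b in layers:
        z, log_det = planar_flow(z, u, w, b)
        log_q -= log_det  # change of variables
    return z, log_q

# Hypothetical parameters for a 2-D latent; u.w > -1 keeps each step invertible.
layers = [([0.5, -0.3], [1.0, 0.4], 0.1), ([0.2, 0.6], [-0.5, 0.8], 0.0)]
rng = random.Random(0)
z_a, log_q_a = flow_posterior_sample([0.0, 0.0], [0.0, 0.0], layers, rng)
z_b, log_q_b = flow_posterior_sample([0.0, 0.0], [0.0, 0.0], layers, rng)
```

Drawing twice with the same `mu` and `log_var` gives two different latent samples, which is the sense in which such a model can produce varied gestures for the same speech input.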

Item Type:	Book Section
Uncontrolled Keywords:	speech animation,normalising flows,conversational agents,variational autoencoders
Faculty \ School:	Faculty of Science > School of Computing Sciences
UEA Research Groups:	Faculty of Science > Research Groups > Visual Computing and Signal Processing
Related URLs:	https://dl.acm.org/doi/10.1145/3485441.3...
Depositing User:	LivePure Connector
Date Deposited:	26 Jul 2022 14:30
Last Modified:	25 Sep 2024 10:44
URI:	https://ueaeprints.uea.ac.uk/id/eprint/86886
