Voicing classification of visual speech using convolutional neural networks

Le Cornu, Thomas; Milner, Ben

Le Cornu, Thomas and Milner, Ben (2015) Voicing classification of visual speech using convolutional neural networks. In: FAAVSP - The 1st Joint Conference on Facial Analysis, Animation and Auditory-Visual Speech Processing, 2015-09-11 - 2015-09-13, Austria.

PDF (faavsp_2015_paper) - Accepted Version

Abstract

The application of neural network and convolutional neural network (CNN) architectures is explored for the tasks of voicing classification (classifying frames as being either non-speech, unvoiced, or voiced) and voice activity detection (VAD) of visual speech. Experiments are conducted for both speaker dependent and speaker independent scenarios. A Gaussian mixture model (GMM) baseline system is developed using standard image-based two-dimensional discrete cosine transform (2D-DCT) visual speech features, achieving speaker dependent accuracies of 79% and 94% for voicing classification and VAD respectively. Additionally, a single-layer neural network system trained using the same visual features achieves accuracies of 86% and 97%. A novel technique using convolutional neural networks for visual speech feature extraction and classification is presented. The voicing classification and VAD results using the system are further improved to 88% and 98% respectively. The speaker independent results show the neural network system to outperform both the GMM and CNN systems, achieving accuracies of 63% for voicing classification and 79% for voice activity detection.

Item Type:	Conference or Workshop Item (Paper)
Faculty \ School:	Faculty of Science > School of Computing Sciences; Faculty of Science
UEA Research Groups:	Faculty of Science > Research Groups > Visual Computing and Signal Processing; Faculty of Science > Research Groups > Smart Emerging Technologies (former - to 2025); Faculty of Science > Research Groups > Data Science and AI; Faculty of Science > Research Groups > Cyber Intelligence and Networks
Depositing User:	Pure Connector
Date Deposited:	23 Dec 2015 13:00
Last Modified:	28 Feb 2025 16:33
URI:	https://ueaeprints.uea.ac.uk/id/eprint/55881
