Analysis of Correlation between Audio and Visual Speech Features for Clean Audio Feature Prediction in Noise

Almajai, Ibrahim, Milner, Ben P. and Darch, Jonathan (2006) Analysis of Correlation between Audio and Visual Speech Features for Clean Audio Feature Prediction in Noise. In: ICSLP 9th International Conference on Spoken Language Processing, 2006-09-17 - 2006-09-21.

Full text not available from this repository.

Abstract

The aim of this work is to examine the correlation between audio and visual speech features. The motivation is to find visual features that can provide clean audio feature estimates, which can then be used for speech enhancement when the original audio signal is corrupted by noise. Two audio features (MFCCs and formants) and three visual features (active appearance model, 2-D DCT and cross-DCT) are considered, with correlation measured using multiple linear regression. The correlation is then exploited through the development of a maximum a posteriori (MAP) prediction of audio features solely from the visual features. Experiments reveal that audio features representing broad spectral information have higher correlation with visual features than those representing finer spectral detail. The accuracy of prediction follows the trend found in the correlation measurements.
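
As a rough illustration of how correlation can be measured with multiple linear regression, the sketch below fits a single audio feature from a visual feature vector by least squares and reports the multiple correlation coefficient R. It is not the authors' implementation: the function name `multiple_correlation`, the use of NumPy, and the synthetic data are all assumptions made for illustration, and the printed values say nothing about the paper's results.

```python
import numpy as np


def multiple_correlation(visual, audio):
    """Multiple correlation R between one audio feature and a visual vector.

    visual: array of shape (n_frames, n_visual_dims)
    audio:  array of shape (n_frames,)
    Hypothetical helper for illustration, not taken from the paper.
    """
    visual = np.asarray(visual, dtype=float)
    audio = np.asarray(audio, dtype=float)
    # Add an intercept column so the regression is not forced through zero.
    design = np.column_stack([np.ones(len(visual)), visual])
    # Ordinary least-squares fit of the audio feature from the visual vector.
    coeffs, *_ = np.linalg.lstsq(design, audio, rcond=None)
    predicted = design @ coeffs
    ss_res = np.sum((audio - predicted) ** 2)
    ss_tot = np.sum((audio - audio.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot
    # Clamp tiny negative values caused by floating-point rounding.
    return float(np.sqrt(max(r_squared, 0.0)))


# Synthetic stand-in data (assumption): 500 frames, 3 visual dimensions.
rng = np.random.default_rng(0)
visual = rng.normal(size=(500, 3))
# One audio feature that depends linearly on the visual vector, plus noise.
audio_dependent = visual @ np.array([1.0, -0.5, 0.25]) + 0.1 * rng.normal(size=500)
# One audio feature that is unrelated to the visual vector.
audio_independent = rng.normal(size=500)

r_dependent = multiple_correlation(visual, audio_dependent)
r_independent = multiple_correlation(visual, audio_independent)
print(f"R (dependent feature):   {r_dependent:.3f}")
print(f"R (independent feature): {r_independent:.3f}")
```

An R close to 1 means the visual vector explains most of the variance in the audio feature, while an R close to 0 means it explains very little. Repeating the measurement for each audio feature gives one way to compare how predictable different features are from the visual stream.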

Item Type: Conference or Workshop Item (Paper)
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Interactive Graphics and Audio
Faculty of Science > Research Groups > Smart Emerging Technologies
Faculty of Science > Research Groups > Data Science and AI
Depositing User: Vishal Gautam
Date Deposited: 20 May 2011 12:18
Last Modified: 10 Dec 2024 01:14
URI: https://ueaeprints.uea.ac.uk/id/eprint/22604
