Milner, Ben and Almajai, Ibrahim (2007) Noisy audio speech enhancement using Wiener filters derived from visual speech. In: Auditory-Visual Speech Processing 2007 (AVSP2007), 2007-08-31 - 2007-09-03, Kasteel Groenendaal.
Full text not available from this repository.

Abstract
The aim of this paper is to use visual speech information to create Wiener filters for audio speech enhancement. Wiener filters require estimates of both clean speech statistics and noisy speech statistics. The noisy speech statistics are obtained from the noisy input audio, whereas obtaining clean speech statistics is more difficult and is a major problem in the creation of Wiener filters for speech enhancement. In this work the clean speech statistics are estimated from frames of visual speech that are extracted in synchrony with the audio. The estimation procedure begins by modelling the joint density of clean audio and visual speech features using a Gaussian mixture model (GMM). Using the GMM and an input visual speech vector, a maximum a posteriori (MAP) estimate of the audio feature is made. The effectiveness of speech enhancement using the visually-derived Wiener filter has been compared to a conventional audio-based Wiener filter implementation using a perceptual evaluation of speech quality (PESQ) analysis. PESQ scores in train noise at different signal-to-noise ratios (SNRs) show that the visually-derived Wiener filter significantly outperforms the audio-based Wiener filter at lower SNRs.
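The GMM-based estimation step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the dimensions, component count, and parameter values are made up, and the MAP estimate is approximated here by taking the conditional mean of the most probable mixture component given the visual vector.

```python
import numpy as np

rng = np.random.default_rng(0)
da, dv = 3, 2                      # assumed audio / visual feature dimensions

# Hypothetical joint GMM over stacked vectors z = [a; v]
# (in the paper these parameters are trained on clean audio-visual data)
weights = np.array([0.4, 0.6])
means = rng.normal(size=(2, da + dv))
covs = []
for _ in range(2):
    A = rng.normal(size=(da + dv, da + dv))
    covs.append(A @ A.T + np.eye(da + dv))   # symmetric positive-definite
covs = np.array(covs)

def estimate_audio(v):
    """Estimate the audio feature from a visual feature via the joint GMM:
    pick the component with the highest posterior given v, then take its
    conditional mean E[a | v] (a MAP-style approximation)."""
    log_post = np.empty(len(weights))
    cond_means = np.empty((len(weights), da))
    for k in range(len(weights)):
        mu_a, mu_v = means[k, :da], means[k, da:]
        S_av = covs[k][:da, da:]
        S_vv = covs[k][da:, da:]
        diff = v - mu_v
        S_vv_inv = np.linalg.inv(S_vv)
        # Log of weighted marginal likelihood of v under component k
        log_post[k] = (np.log(weights[k])
                       - 0.5 * (diff @ S_vv_inv @ diff
                                + np.log(np.linalg.det(S_vv))
                                + dv * np.log(2 * np.pi)))
        # Conditional mean of a given v for component k
        cond_means[k] = mu_a + S_av @ S_vv_inv @ diff
    return cond_means[np.argmax(log_post)]

a_hat = estimate_audio(rng.normal(size=dv))

# The estimated clean-speech statistics then set the Wiener gain per
# frequency bin, W(f) = S_ss(f) / (S_ss(f) + S_nn(f)), applied to the
# noisy spectrum; the spectra below are placeholders.
S_ss = np.abs(rng.normal(size=8)) + 1e-3   # estimated clean power spectrum
S_nn = np.abs(rng.normal(size=8)) + 1e-3   # noise power spectrum estimate
gain = S_ss / (S_ss + S_nn)
```

Under this scheme the noisy audio supplies `S_nn` while the video-derived estimate supplies `S_ss`, which is why the method can keep working at low SNRs where audio-only clean-speech estimates degrade.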
| Item Type: | Conference or Workshop Item (Other) |
| --- | --- |
| Faculty \ School: | Faculty of Science > School of Computing Sciences |
| UEA Research Groups: | Faculty of Science > Research Groups > Smart Emerging Technologies; Faculty of Science > Research Groups > Interactive Graphics and Audio |
| Related URLs: | |
| Depositing User: | EPrints Services |
| Date Deposited: | 01 Oct 2010 13:41 |
| Last Modified: | 20 Jun 2023 14:32 |
| URI: | https://ueaeprints.uea.ac.uk/id/eprint/3091 |
| DOI: | |