Formant Frequency Prediction from MFCC Vectors in Noisy Environments

Darch, Jonathan, Milner, Ben P. and Vaseghi, Saeed V. (2005) Formant Frequency Prediction from MFCC Vectors in Noisy Environments. In: 9th European Conference on Speech Communication and Technology, 2005-09-04 - 2005-09-08.

Full text not available from this repository.

Abstract

This paper proposes a method of predicting the formant frequencies of a frame of speech from its mel-frequency cepstral coefficient (MFCC) representation. Prediction is achieved by creating a Gaussian mixture model (GMM) that models the joint density of formant frequencies and MFCCs. Given this GMM and an input MFCC vector, a maximum a posteriori (MAP) prediction of the formant frequencies is generated. Formant prediction accuracy is evaluated on both a constrained-vocabulary connected-digits database and a 5000-word large-vocabulary database. Experiments first examine how the accuracy of formant frequency prediction changes as the number of clusters in the GMM is varied; the best formant frequency prediction error obtained is 3.72%. Second, the effect of noise on formant prediction accuracy is examined. Accuracy falls as the signal-to-noise ratio decreases, but using a GMM matched to the noise conditions significantly improves formant prediction accuracy.
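The prediction step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact estimator, so the code below is an assumption rather than the authors' implementation: it takes an already-trained joint GMM over stacked vectors [MFCC; formants], computes each cluster's posterior probability given the MFCC vector, and combines the per-cluster conditional means of the formants. The function name `predict_formants`, its parameters, and the `hard` switch (use only the most probable cluster instead of a posterior-weighted sum) are hypothetical names introduced for illustration.

```python
import numpy as np


def log_gaussian(x, mean, cov):
    """Log density of a multivariate Gaussian evaluated at x."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (x.shape[0] * np.log(2.0 * np.pi) + logdet + maha)


def predict_formants(x, weights, means, covs, n_mfcc, hard=False):
    """Predict formants from an MFCC vector x using a joint GMM.

    Assumed layout (not specified in the abstract): each component mean is
    [mu_x, mu_f] and each covariance is [[S_xx, S_xf], [S_fx, S_ff]], with
    the first n_mfcc dimensions holding the MFCCs.
    """
    n_clusters = len(weights)
    log_post = np.empty(n_clusters)
    cond_means = []
    for k in range(n_clusters):
        mu_x, mu_f = means[k][:n_mfcc], means[k][n_mfcc:]
        s_xx = covs[k][:n_mfcc, :n_mfcc]
        s_fx = covs[k][n_mfcc:, :n_mfcc]
        # Unnormalised log posterior of cluster k given the MFCC vector.
        log_post[k] = np.log(weights[k]) + log_gaussian(x, mu_x, s_xx)
        # Conditional mean of the formants given x under cluster k.
        cond_means.append(mu_f + s_fx @ np.linalg.solve(s_xx, x - mu_x))
    cond_means = np.array(cond_means)

    # Normalise in a numerically stable way.
    post = np.exp(log_post - log_post.max())
    post /= post.sum()

    if hard:
        # Use only the most probable cluster.
        return cond_means[np.argmax(post)]
    # Posterior-weighted combination over all clusters.
    return post @ cond_means
```

In this sketch, a noise-matched model in the sense of the abstract would simply be a different set of `weights`, `means`, and `covs`, trained on MFCCs extracted from speech at the matching noise condition; the prediction code itself would not change.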

Item Type: Conference or Workshop Item (Paper)
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Interactive Graphics and Audio
Faculty of Science > Research Groups > Smart Emerging Technologies
Faculty of Science > Research Groups > Data Science and AI
Depositing User: Vishal Gautam
Date Deposited: 14 Jul 2011 07:28
Last Modified: 10 Dec 2024 01:15
URI: https://ueaeprints.uea.ac.uk/id/eprint/22471
