Darch, Jonathan, Milner, Ben P. and Vaseghi, Saeed (2006) MAP prediction of formant frequencies and voicing class from MFCC vectors in noise. Speech Communication, 48 (11). pp. 1556-1572. ISSN 0167-6393
Full text not available from this repository.
Abstract
Novel methods are presented for predicting formant frequencies and voicing class from mel-frequency cepstral coefficients (MFCCs). It is shown how Gaussian mixture models (GMMs) can be used to model the relationship between formant frequencies and MFCCs. Using such models and an input MFCC vector, a maximum a posteriori (MAP) prediction of formant frequencies can be made. The specific relationship each speech sound has between MFCCs and formant frequencies is exploited by using state-specific GMMs within a framework of a set of hidden Markov models (HMMs). Formant prediction accuracy and voicing prediction of speaker-independent male speech are evaluated on both a constrained vocabulary connected digits database and a large vocabulary database. Experimental results show that for HMM–GMM prediction on the connected digits database, voicing class prediction error is less than 3.5%. Less than 1.8% of frames have formant frequency percentage errors greater than 20% and the mean percentage error of the remaining frames is less than 3.7%. Further experiments show prediction accuracy under noisy conditions. For example, at a signal-to-noise ratio (SNR) of 0 dB, voicing class prediction error increases to 9.4%, less than 4.3% of frames have formant frequency percentage errors over 20% and the formant frequency percentage error for the remaining frames is less than 5.7%.
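The abstract does not give the exact form of the estimator, so the following is only a minimal sketch of one common way to predict formants from a GMM. It assumes a single joint GMM over the stacked vector z = [MFCC, formants], selects the component with the highest posterior given the MFCC vector, and returns that component's conditional mean of the formants. The function names, the toy parameters, and the choice of a hard component decision are all illustrative assumptions, not the authors' implementation. The HMM state-specific models and the voicing classifier are left out.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate Gaussian evaluated at x."""
    d = x.shape[0]
    diff = x - mean
    # Solve a linear system rather than inverting the covariance.
    sol = np.linalg.solve(cov, diff)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + diff @ sol)

def predict_formants(x, weights, means, covs, n_mfcc):
    """Predict formants from MFCC vector x with a joint GMM over [x, f].

    Hypothetical helper: picks the most probable component given x and
    returns that component's conditional mean E[f | x, k].
    """
    log_post = np.empty(len(weights))
    cond_means = []
    for k, (w, mu, sigma) in enumerate(zip(weights, means, covs)):
        mu_x, mu_f = mu[:n_mfcc], mu[n_mfcc:]
        s_xx = sigma[:n_mfcc, :n_mfcc]   # MFCC covariance block
        s_fx = sigma[n_mfcc:, :n_mfcc]   # formant-MFCC cross-covariance
        # Unnormalised log posterior of component k given x.
        log_post[k] = np.log(w) + gaussian_logpdf(x, mu_x, s_xx)
        # Conditional mean of the formants given x under component k.
        cond_means.append(mu_f + s_fx @ np.linalg.solve(s_xx, x - mu_x))
    best = int(np.argmax(log_post))
    return cond_means[best], best

# Toy model (made-up numbers): 2 MFCC dimensions, 1 formant, 2 components.
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0, 500.0],
                  [5.0, 5.0, 1500.0]])
cov = np.array([[1.0, 0.0, 10.0],
                [0.0, 1.0, 0.0],
                [10.0, 0.0, 400.0]])
covs = np.array([cov, cov])

formants, component = predict_formants(np.array([1.0, 0.0]),
                                       weights, means, covs, n_mfcc=2)
print(component, formants)
```

Replacing the hard component choice with a posterior-weighted sum of the conditional means would give a minimum mean-square-error style estimate instead; which variant matches the paper cannot be determined from the abstract alone.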
Item Type: | Article |
---|---|
Faculty \ School: | Faculty of Science > School of Computing Sciences |
UEA Research Groups: | Faculty of Science > Research Groups > Interactive Graphics and Audio; Faculty of Science > Research Groups > Smart Emerging Technologies |
Depositing User: | Vishal Gautam |
Date Deposited: | 23 May 2011 07:34 |
Last Modified: | 22 Apr 2023 01:23 |
URI: | https://ueaeprints.uea.ac.uk/id/eprint/22051 |
DOI: | 10.1016/j.specom.2006.06.001 |