Darch, J. and Milner, B.P. (2007) A Comparison of Estimated and MAP-Predicted Formants and Fundamental Frequencies with Speech Reconstruction Application. In: 8th Annual Conference of the International Speech Communication Association (Interspeech 2007), 2007-08-27 - 2007-08-31.
Full text not available from this repository.
Abstract
This work compares the accuracy of fundamental frequency and formant frequency estimation methods and maximum a posteriori (MAP) prediction from MFCC vectors with hand-corrected references. Five fundamental frequency estimation methods are compared to fundamental frequency prediction from MFCC vectors in both clean and noisy speech. Similarly, three formant frequency estimation and prediction methods are compared. An analysis of estimation and prediction accuracy shows that prediction from MFCCs provides the most accurate voicing classification across clean and noisy speech. On clean speech, fundamental frequency estimation outperforms prediction from MFCCs, but as noise increases the performance of prediction is significantly more robust than estimation. Formant frequency prediction is found to be more accurate than estimation in both clean and noisy speech. A subjective analysis of the estimation and prediction methods is also made by reconstructing speech from the acoustic features.
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Faculty \ School: | Faculty of Science > School of Computing Sciences |
UEA Research Groups: | Faculty of Science > Research Groups > Interactive Graphics and Audio; Faculty of Science > Research Groups > Smart Emerging Technologies; Faculty of Science > Research Groups > Data Science and AI |
Depositing User: | Vishal Gautam |
Date Deposited: | 05 Apr 2011 07:28 |
Last Modified: | 10 Dec 2024 01:14 |
URI: | https://ueaeprints.uea.ac.uk/id/eprint/22470 |