Speech reconstruction from mel-frequency cepstral coefficients using a source-filter model

Milner, Ben P.; Shao, Xu

Milner, Ben P. and Shao, Xu (2002) Speech reconstruction from mel-frequency cepstral coefficients using a source-filter model. In: 7th International Conference on Spoken Language Processing (ICSLP-2002), 2002-09-16 - 2002-09-20.

Full text not available from this repository.

Abstract

This work presents a method of reconstructing a speech signal from a stream of MFCC vectors using a source-filter model of speech production. The MFCC vectors are used to provide an estimate of the vocal tract filter. This is achieved by inverting the MFCC vector back to a smoothed estimate of the magnitude spectrum. The Wiener-Khintchine theorem converts this spectrum into autocorrelation coefficients, from which linear predictive analysis yields an estimate of the vocal tract filter coefficients. The excitation signal is produced from a series of pitch pulses or white noise, depending on whether the speech is voiced or unvoiced; the pitch estimate this requires forms an extra element of the feature vector. Listening tests reveal that the reconstructed speech is intelligible and of similar quality to a system based on LPC analysis of the original speech. Spectrograms of the MFCC-derived speech and the real speech are included and confirm the similarity.
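The filter and excitation stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the MFCC inversion has already produced a smoothed magnitude spectrum for one frame, sampled at bins 0 to N/2 of an N-point DFT, and it models pitch pulses as a plain unit impulse train. All function names, the LPC order, and the frame length are hypothetical choices made for illustration.

```python
import math
import random


def autocorrelation_from_magnitude(mag, order):
    """Wiener-Khintchine: autocorrelation = inverse DFT of the power spectrum.

    `mag` holds magnitudes at bins 0..N/2 of an N-point DFT (assumed layout).
    """
    half = len(mag) - 1
    n_fft = 2 * half
    power = [m * m for m in mag]
    r = []
    for lag in range(order + 1):
        # The spectrum is real and even, so the inverse DFT is a cosine sum.
        acc = power[0] + power[half] * math.cos(math.pi * lag)
        for k in range(1, half):
            acc += 2.0 * power[k] * math.cos(2.0 * math.pi * k * lag / n_fft)
        r.append(acc / n_fft)
    return r


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns ([1, a1, ..., ap], prediction error) for A(z) = 1 + sum a_j z^-j.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def excitation(n_samples, pitch_period, voiced, rng):
    """Impulse train for voiced frames, Gaussian white noise for unvoiced."""
    if voiced:
        return [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def all_pole_filter(a, gain, x):
    """Apply the vocal tract filter gain / A(z) to the excitation x."""
    y = []
    for n, xn in enumerate(x):
        acc = gain * xn
        for j in range(1, len(a)):
            if n - j >= 0:
                acc -= a[j] * y[n - j]
        y.append(acc)
    return y


def synthesize_frame(mag, order, n_samples, pitch_period, voiced, rng):
    """Chain the stages for a single frame (no filter state across frames)."""
    r = autocorrelation_from_magnitude(mag, order)
    a, err = levinson_durbin(r, order)
    x = excitation(n_samples, pitch_period, voiced, rng)
    return all_pole_filter(a, math.sqrt(err), x)


# Illustrative call: a flat spectrum with 129 bins (N = 256), order-10 LPC,
# and an 80-sample pitch period. All of these values are assumptions.
frame = synthesize_frame([1.0] * 129, 10, 160, 80, True, random.Random(0))
```

The sketch treats each frame in isolation. A practical synthesizer would also need to carry filter state and pitch-pulse phase across frame boundaries, which is omitted here for brevity.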

Item Type:	Conference or Workshop Item (Paper)
Faculty \ School:	Faculty of Science > School of Computing Sciences
UEA Research Groups:	Faculty of Science > Research Groups > Visual Computing and Signal Processing; Faculty of Science > Research Groups > Smart Emerging Technologies (former - to 2025); Faculty of Science > Research Groups > Data Science and AI; Faculty of Science > Research Groups > Cyber Intelligence and Networks
Depositing User:	Vishal Gautam
Date Deposited:	28 Jul 2011 10:11
Last Modified:	28 Feb 2025 16:32
URI:	https://ueaeprints.uea.ac.uk/id/eprint/22229
