A Frequency Domain Approach to ARX-LF Voiced Speech Parameterization and Synthesis
Abstract
The ARX-LF model interprets voiced speech as an LF derivative glottal pulse exciting an all-pole vocal tract filter, with an additional exogenous residual signal. It fully parameterizes the voice and has been shown to be useful for voice modification. Time domain methods for determining the ARX-LF parameters from speech are very sensitive to the time placement of the analysis frame and are not robust to phase distortion from, e.g., recording equipment; for this reason, a magnitude-only spectral approach to ARX-LF parameterization was recently developed. This paper describes extensions to this frequency domain approach that yield continuous, robust ARX-LF parameters for voiced speech segments. A listening test with 50 participants compared synthetic speech produced by this method with speech from a time domain ARX-LF parameterization approach under real and simulated recording conditions; the frequency domain approach was generally preferred.
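As a rough illustration of the source-filter structure described in the abstract, the sketch below synthesizes a signal from the difference equation s(n) + a_1 s(n-1) + ... + a_p s(n-p) = b0 u(n) + e(n), where u is the glottal derivative excitation and e is the residual. This is a minimal sketch under stated assumptions, not the paper's method: the function names, the sign convention, the sample rate, and the filter values are all hypothetical, and the excitation is a simple polynomial pulse standing in for a true LF pulse (whose parameters require solving implicit equations).

```python
import numpy as np


def toy_glottal_derivative(period, open_quotient=0.6):
    """One period of a crude glottal flow derivative (a stand-in, NOT the LF model).

    The flow during the open phase is t**2 - t**3 for t in [0, 1), so its
    derivative is 2*t - 3*t**2; the closed phase is zero.
    """
    n_open = int(round(open_quotient * period))
    t = np.arange(n_open) / n_open
    pulse = np.zeros(period)
    pulse[:n_open] = 2.0 * t - 3.0 * t**2
    return pulse


def arx_synthesize(a, b0, u, e):
    """Run s(n) = -sum_k a[k-1] * s(n-k) + b0 * u(n) + e(n) sample by sample."""
    s = np.zeros(len(u))
    for n in range(len(u)):
        acc = b0 * u[n] + e[n]
        for k in range(1, len(a) + 1):
            if n - k >= 0:
                acc -= a[k - 1] * s[n - k]
        s[n] = acc
    return s


# Hypothetical settings: 16 kHz sampling, 100 Hz pitch, one resonance near 500 Hz.
fs = 16000
period = 160
r, theta = 0.95, 2.0 * np.pi * 500.0 / fs
a = [-2.0 * r * np.cos(theta), r**2]          # stable two-pole resonator

u = np.tile(toy_glottal_derivative(period), 5)  # five pitch periods of excitation
rng = np.random.default_rng(0)
e = 0.001 * rng.standard_normal(len(u))         # small residual term
s = arx_synthesize(a, 1.0, u, e)
```

In an analysis setting the direction is reversed: the filter coefficients, the pulse parameters, and the residual are estimated from recorded speech, which is the problem the abstract addresses.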
Related papers
Towards flexible speech coding for speech synthesis: an LF + modulated noise vocoder
This paper presents an ARX-LF-based model of speech that is amenable to low-bit-rate quantization and speech modifications directly at the parametric domain. The new model successfully addresses the non-deterministic part of voiced speech by modulating noise with the glottal flow, while unvoiced speech and transients are synthesized by modulating noise with a signal-derived time envelope. The p...
Automatic voice-source parameterization of natural speech
We present here our work in automatic parameterization of natural speech by means of a pitch synchronous source-filter decomposition algorithm. The derivative glottal source is modelled using the Liljencrants-Fant (LF) model. The model parameters are obtained simultaneously with the coefficients of an all-pole filter representing the vocal tract response by means of a quadratic programming algo...
An improved speech analysis-synthesis algorithm based on the autoregressive with exogenous input speech production model
Ding et al. have explored a novel pitch-synchronous speech analysis-synthesis method[1] based on an auto-regressive with exogenous input (ARX) speech production model. This method makes an automatic estimation of the vocal tract (formant) and voice source parameters from a speech utterance. This method, however, has suffered deficiencies in the analysis of a high-pitch voice and the introductio...
Expressive Speech Synthesis: Evaluation of a Voice Quality Centered Coder on the Different Acoustic Dimensions
Expressive speech is intrinsically multi-dimensional. Each acoustic dimension has specific weights depending on the nature of the expressed affects. The quantity of expressive information carried by each dimension separately (using Praat algorithms), as well as the processing implied to carry it (global value vs. contour) has been perceptively measured for a set of natural mono-syllabic utteran...
The LF-model revisited. Transformations and frequency domain analysis
The main theme of our presentation is the parameterization of the human voice source. The LF-model provides an effective representation of the inverse filtered speech waveform, but the four parameters are not easily handled in analysis and synthesis of connected speech. A data reduction scheme has recently been proposed (Fant et al., 1994) as a supplement to complete specifications. This syste...
Publication date: 2011