Improved formant frequency measurements of short segments
Author
Abstract
We describe an algorithm that automatically finds the smoothest formant trajectories for short segments of speech. For each segment, the method selects the smoothest track from a number of alternatives. The smoothness criterion models formant tracks with polynomial functions and uses both the χ² badness-of-fit and the variances of the polynomial coefficients. A major advantage over other methods is that the procedure is completely automatic and reproducible, because the new criterion quantifies the smoothness of formant tracks. Applied to some speech corpora, the new method yields smaller spreading ellipses, especially for the high back vowels of male speakers.
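The abstract does not give the exact form of the smoothness criterion, so the following is only a minimal sketch of the general idea: fit a low-order polynomial to each candidate formant track, compute a χ² badness-of-fit and the variances of the fitted coefficients, and keep the candidate with the lowest combined score. The function names, the polynomial degree, the assumed measurement error `sigma`, and the way the two terms are added together are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def smoothness_score(times, track, degree=2, sigma=50.0):
    """Return a smoothness score for one formant track (lower is smoother).

    ASSUMPTIONS (not from the abstract): the polynomial degree, the
    measurement error `sigma` in Hz, and the simple sum of the two terms.
    """
    t = np.asarray(times, dtype=float)
    y = np.asarray(track, dtype=float)
    dof = len(y) - (degree + 1)
    if dof <= 0:
        raise ValueError("need more frames than polynomial coefficients")

    # Normalise time to [-1, 1] so the least-squares problem is well conditioned.
    t_norm = 2.0 * (t - t.min()) / (t.max() - t.min()) - 1.0
    design = np.vander(t_norm, degree + 1)

    # Ordinary least-squares polynomial fit.
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coeffs

    # Chi-squared badness-of-fit, assuming the same error sigma on every frame.
    chi2 = float(np.sum((residuals / sigma) ** 2))

    # Coefficient covariance, scaled by the residual variance of the fit.
    resid_var = float(np.sum(residuals ** 2)) / dof
    cov = resid_var * np.linalg.inv(design.T @ design)
    coeff_var = np.diag(cov)

    # Hypothetical combination: reduced chi-squared plus normalised variances.
    return chi2 / dof + float(np.sum(coeff_var)) / sigma ** 2


def pick_smoothest(times, candidates, degree=2, sigma=50.0):
    """Return the index of the smoothest candidate track and all scores."""
    scores = [smoothness_score(times, c, degree, sigma) for c in candidates]
    return int(np.argmin(scores)), scores
```

In use, `candidates` might hold the same formant measured with several different analysis settings; `pick_smoothest(times, candidates)` then returns the index of the track that a low-order polynomial describes best, together with the score of every candidate.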
Similar resources
Statistical Variation Analysis of Formant and Pitch Frequencies in Anger and Happiness Emotional Sentences in Farsi Language
Setting up an emotion recognition or emotional speech recognition system depends directly on how emotion changes the speech features. In this research, the influence of emotion on speech was evaluated for anger and happiness, and the results were compared with neutral speech. For this purpose, the pitch frequency and the first three formant frequencies were used. The experimental results showed that there are lo...
Improved phone recognition on TIMIT using formant frequency data and confidence measures
This paper presents a novel approach to integration of formant frequency and conventional MFCC data in phone recognition experiments on TIMIT. Naive use of formant data introduces classification errors if formant frequency estimates are poor, resulting in a net drop in performance. However, by exploiting a measure of confidence in the formant frequency estimates, formant data can contribute to c...
Reducing one-to-many problem in Voice Conversion by equalizing the formant locations using dynamic frequency warping
In this study, we investigate a solution to reduce the effect of the one-to-many problem in voice conversion (VC). The one-to-many problem in VC happens when two very similar speech segments from the source speaker have corresponding speech segments from the target speaker that are not similar to each other. As a result, the mapper function usually oversmooths the generated features in order to be similar to both targe...
Formant frequency prediction from MFCC vectors in noisy environments
This paper proposes a method of predicting the formant frequencies of a frame of speech from its mel-frequency cepstral coefficient (MFCC) representation. Prediction is achieved through the creation of a Gaussian mixture model (GMM) which models the joint density of formant frequencies and MFCCs. Using this GMM and an input MFCC vector, a maximum a posteriori (MAP) prediction of the formant fre...
Wavelet ridge track interpretation in terms of formants
This paper proposes two new approaches for formant tracking using Fourier and wavelet ridges. The speech signal is decomposed into Time-Frequency representations issued from windowed Fourier transform and wavelet transform. Formant tracking is achieved by exploring ridges from time-frequency representation and imposing continuity constraints on formant trajectories. These approaches are validat...
Publication date: 2015