Evaluation of formant-like features on an automatic vowel classification task.

Authors

  • Febe de Wet
  • Katrin Weber
  • Louis Boves
  • Bert Cranen
  • Samy Bengio
  • Hervé Bourlard
Abstract

Numerous attempts have been made to find low-dimensional, formant-related representations of speech signals that are suitable for automatic speech recognition. However, it is often not known how these features behave in comparison with true formants. The purpose of this study was to compare two sets of automatically extracted formant-like features, i.e., robust formants and HMM2 features, to hand-labeled formants. The robust formant features were derived by means of the split Levinson algorithm while the HMM2 features correspond to the frequency segmentation of speech signals obtained by two-dimensional hidden Markov models. Mel-frequency cepstral coefficients (MFCCs) were also included in the investigation as an example of state-of-the-art automatic speech recognition features. The feature sets were compared in terms of their performance on a vowel classification task. The speech data and hand-labeled formants that were used in this study are a subset of the American English vowels database presented in Hillenbrand et al. [J. Acoust. Soc. Am. 97, 3099-3111 (1995)]. Classification performance was measured on the original, clean data and in noisy acoustic conditions. When using clean data, the classification performance of the formant-like features compared very well to the performance of the hand-labeled formants in a gender-dependent experiment, but was inferior to the hand-labeled formants in a gender-independent experiment. The results that were obtained in noisy acoustic conditions indicated that the formant-like features used in this study are not inherently noise robust. For clean and noisy data as well as for the gender-dependent and gender-independent experiments the MFCCs achieved the same or superior results as the formant features, but at the price of a much higher feature dimensionality.
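The abstract does not say how the noisy acoustic conditions were produced. As a minimal sketch of one common approach, assuming additive noise scaled to a target signal-to-noise ratio (SNR), the mixing step could look like the following. The function names, the 500 Hz test tone, the white Gaussian noise, and the 10 dB target are all illustrative assumptions, not details taken from the study.

```python
import math
import random


def mean_power(samples):
    """Average power (mean of squared samples) of a signal."""
    return sum(x * x for x in samples) / len(samples)


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so that the mixture has the requested SNR in dB.

    Hypothetical helper: assumes equal-length sequences with non-zero power.
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    p_clean = mean_power(clean)
    p_noise = mean_power(noise)
    if p_clean == 0 or p_noise == 0:
        raise ValueError("signals must have non-zero power")
    # Target noise power is P_clean / 10^(SNR/10); the gain acts on amplitude,
    # hence the square root.
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB recovered from a clean signal and its noisy version."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10 * math.log10(mean_power(clean) / mean_power(residual))


# Toy stand-ins, not data from the study: a 500 Hz tone sampled at 16 kHz
# and white Gaussian noise from a seeded generator.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 500 * t / 16000) for t in range(1600)]
noise = [rng.gauss(0.0, 1.0) for _ in range(1600)]
noisy = mix_at_snr(clean, noise, snr_db=10)
```

In an evaluation of this kind, each feature set (formant-like features or MFCCs) would then be extracted from `noisy` instead of `clean`, and the classifier's accuracy compared across SNR levels.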

Similar resources

Evaluation of formant-like features for automatic speech recognition

This study investigates possibilities to find a low-dimensional, formant-related physical representation of speech signals, which is suitable for automatic speech recognition. This aim is motivated by the fact that formants are known to be discriminant features for speech recognition. Combinations of automatically extracted formant-like features and state-of-the-art, noise-robust features have ...

Formant Frequencies under Cognitive Load: Effects and Classification

Cognitive load measurement systems measure the mental demand experienced by a human while performing a cognitive task, which is useful in monitoring and enhancing task performance. Various speech-based systems have been proposed for cognitive load classification, but the effect of cognitive load on the speech production system is still not well understood. In this work, we study formant frequenci...

Evaluation of formant-like features for ASR

This paper investigates possibilities to automatically find a low-dimensional, formant-related physical representation of the speech signal, which is suitable for automatic speech recognition (ASR). This aim is motivated by the fact that formants have been shown to be discriminant features for ASR. Combinations of automatically extracted formant-like features and ‘conventional’, noise-robust, st...

Acoustic Analysis of Persian EFL Learners' Pronunciation of English Vowels

This paper reports the results of an experimental study on non-native production of English vowels. Two groups of Persian EFL learners varying in language proficiency were tested on their ability to produce the nine plain vowels of American English. Vowel production accuracy was assessed by means of acoustic measurements. Ladefoged and Maddieson’s (1996) F1 F2 measurements for American English v...

Estimating vowel formant discrimination thresholds using a single-interval classification task.

Previous research estimating vowel formant discrimination thresholds in words and sentences has often employed a modified two-alternative-forced-choice (2AFC) task with adaptive tracking. Although this approach has produced stable data, the length and number of experimental sessions, as well as the unnaturalness of the task, limit generalizations of results to ordinary speech communication. In ...

Journal title:
  • The Journal of the Acoustical Society of America

Volume 116, Issue 3

Pages  -

Publication date 2004