phoneme classification

Evolutionary Splines for Cepstral Filterbank Optimization in Phoneme Classification

Journal: EURASIP J. Adv. Sig. Proc. 2011

Leandro Daniel Vignolo, Hugo Leonardo Rufiner, Diego H. Milone, John C. Goddard

Mel-frequency cepstral coefficients have long been the most widely used type of speech representation. They were introduced to incorporate biologically inspired characteristics into artificial speech recognizers. Recently, the introduction of new alternatives to the classic mel-scaled filterbank has led to improvements in the performance of phoneme recognition in adverse conditions. In this wor...

Discriminative phoneme sequence extraction for non-native speaker's origin classification

2007

Ghazi Bouselmi, Dominique Fohr, Irina Illina, Jean Paul Haton

In this paper we present an automated method for the classification of the origin of non-native speakers. The origin of non-native speakers could be identified by a human listener based on the detection of typical pronunciations for each nationality. Thus we suppose the existence of several phoneme sequences that might allow the classification of the origin of non-native speakers. Our new metho...

Genetic Optimization of Cepstrum Filterbank for Phoneme Classification

2009

Leandro Daniel Vignolo, Hugo Leonardo Rufiner, Diego H. Milone, John C. Goddard

Some of the most commonly used speech representations, such as mel-frequency cepstral coefficients, incorporate biologically inspired characteristics into artificial systems. Recent advances have modified the shape and distribution of the traditional perceptually scaled filterbank commonly used for feature extraction. Some alternatives to the classic mel-scaled filterbank have...

Wavelet based feature extraction for phoneme recognition

1996

Christopher John Long, Sekharajit Datta

In an effort to provide a more efficient representation of the acoustical speech signal in the pre-classification stage of a speech recognition system, we consider the application of the Best-Basis Algorithm of Coifman and Wickerhauser. This combines the advantages of using a smooth, compactly-supported wavelet basis with an adaptive time-scale analysis dependent on the problem at hand. We star...

Study of attractor variation in the reconstructed phase space of speech signals

2003

Jinjin Ye, Michael T. Johnson, Richard J. Povinelli

This paper presents a study of the attractor variation in the reconstructed phase spaces of isolated phonemes. The approach is based on recent work in time-domain signal classification using dynamical signal models, whereby a statistical distribution model is obtained from the phase space and used for maximum likelihood classification. Two sets of experiments are presented in this paper. The fir...

Auditory Cortical Representations of Speech Signals for Phoneme Classification

2007

Hugo Leonardo Rufiner, César E. Martínez, Diego H. Milone, John C. Goddard

The use of biologically inspired feature extraction methods has improved the performance of artificial systems that try to emulate some aspect of human communication. Recent techniques, such as independent component analysis and sparse representations, have made it possible to undertake speech signal analysis using features similar to the ones found experimentally at the primary auditory corte...

Phoneme Classification Using Naive Bayes Classifier in Reconstructed Phase Space

2002

Jinjin Ye, Richard J. Povinelli, Michael T. Johnson

A novel method for classifying speech phonemes is presented. Unlike traditional cepstral-based methods, this approach uses histograms of reconstructed phase spaces. A Naïve Bayes classifier uses the probability mass estimates for classification. The approach is verified using isolated fricative, vowel, and nasal phonemes from the TIMIT corpus. The results show that a reconstructed phase space a...

Diagnostics of speech recognition using classification phoneme diagnostic trees

2006

Milos Cernak, Christian Wellekens

More than three decades of speech recognition research have resulted in a very sophisticated statistical framework. However, less attention has been devoted to diagnostics of speech recognition; most previous research reports results in terms of ever-lower WER in various intrinsic or environmental conditions. This paper presents a diagnostics of the decoding process of ASR systems. The purpose of...

Keyword Spotting in A-capella Singing

2014

Anna M. Kruspe

Keyword spotting (or spoken term detection) is an interesting task in Music Information Retrieval that can be applied to a number of problems. Its purposes include topical search and improvements for genre classification. Keyword spotting is a well-researched task on pure speech, but state-of-the-art approaches cannot be easily transferred to singing because phoneme durations have much higher v...

Decision Combination in Speech Metadata Extraction

2003

Xiaofan Lin

Speech metadata extraction can both improve speech recognition and enable novel Interactive Voice Response applications. Unlike the previous research, which concentrates on the frame-level signal processing and pattern classification, this paper systematically studies the behavior of decision combination at the utterance level. We analyze the asymptotic characteristics, and the factors affectin...
