Search results for: phoneme classification

Number of results: 496610

Journal: EURASIP J. Adv. Sig. Proc. 2011
Leandro Daniel Vignolo Hugo Leonardo Rufiner Diego H. Milone John C. Goddard

Mel-frequency cepstral coefficients have long been the most widely used type of speech representation. They were introduced to incorporate biologically inspired characteristics into artificial speech recognizers. Recently, the introduction of new alternatives to the classic mel-scaled filterbank has led to improvements in the performance of phoneme recognition in adverse conditions. In this wor...

2007
Ghazi Bouselmi Dominique Fohr Irina Illina Jean Paul Haton

In this paper we present an automated method for the classification of the origin of non-native speakers. The origin of non-native speakers could be identified by a human listener based on the detection of typical pronunciations for each nationality. Thus we suppose the existence of several phoneme sequences that might allow the classification of the origin of non-native speakers. Our new metho...

2009
Leandro Daniel Vignolo Hugo Leonardo Rufiner Diego H. Milone John C. Goddard

Some of the most commonly used speech representations, such as mel-frequency cepstral coefficients, incorporate biologically inspired characteristics into artificial systems. Recent advances have modified the shape and distribution of the traditional perceptually scaled filterbank commonly used for feature extraction. Some alternatives to the classic mel-scaled filterbank have...

1996
Christopher John Long Sekharajit Datta

In an effort to provide a more efficient representation of the acoustical speech signal in the pre-classification stage of a speech recognition system, we consider the application of the Best-Basis Algorithm of Coifman and Wickerhauser. This combines the advantages of using a smooth, compactly-supported wavelet basis with an adaptive time-scale analysis dependent on the problem at hand. We star...

2003
Jinjin Ye Michael T. Johnson Richard J. Povinelli

This paper presents a study of the attractor variation in the reconstructed phase spaces of isolated phonemes. The approach is based on recent work in time-domain signal classification using dynamical signal models, whereby a statistical distribution model is obtained from the phase space and used for maximum likelihood classification. Two sets of experiments are presented in this paper. The fir...

2007
Hugo Leonardo Rufiner César E. Martínez Diego H. Milone John C. Goddard

The use of biologically inspired feature extraction methods has improved the performance of artificial systems that try to emulate some aspect of human communication. Recent techniques, such as independent component analysis and sparse representations, have made it possible to undertake speech signal analysis using features similar to the ones found experimentally at the primary auditory corte...

2002
Jinjin Ye Richard J. Povinelli Michael T. Johnson

A novel method for classifying speech phonemes is presented. Unlike traditional cepstral based methods, this approach uses histograms of reconstructed phase spaces. A Naïve Bayes classifier uses the probability mass estimates for classification. The approach is verified using isolated fricative, vowel, and nasal phonemes from the TIMIT corpus. The results show that a reconstructed phase space a...

2006
Milos Cernak Christian Wellekens

More than three decades of speech recognition research have resulted in a very sophisticated statistical framework. However, less attention has been devoted to the diagnostics of speech recognition; most previous research reports results in terms of ever-lower WER in various intrinsic or environmental conditions. This paper presents diagnostics of the decoding process of ASR systems. The purpose of...

2014
Anna M. Kruspe

Keyword spotting (or spoken term detection) is an interesting task in Music Information Retrieval that can be applied to a number of problems. Its purposes include topical search and improvements for genre classification. Keyword spotting is a well-researched task on pure speech, but state-of-the-art approaches cannot be easily transferred to singing because phoneme durations have much higher v...

2003
Xiaofan Lin

Speech metadata extraction can both improve speech recognition and enable novel Interactive Voice Response applications. Unlike the previous research, which concentrates on the frame-level signal processing and pattern classification, this paper systematically studies the behavior of decision combination at the utterance level. We analyze the asymptotic characteristics, and the factors affectin...
