Search results for: phoneme recognition

Number of results: 254307

2003
Shoichi Matsunaga, Atsunori Ogawa, Yoshikazu Yamaguchi, Akihiro Imamura

This paper proposes a supervised speaker adaptation method that is effective for both non-native (i.e. Japanese) and native English speakers’ pronunciation of English speech. This method uses English and Japanese phoneme acoustic models and a pronunciation lexicon in which each word has both English and Japanese phoneme transcriptions. The same utterances are used for adaptation of both acousti...

2008
Gábor Gosztolya, László Tóth

Automatic speech recognition (ASR) is an area where the task is to assign the correct phoneme or word sequence to an utterance. The idea behind the ASR segment-based approach is to treat one phoneme as a whole unit in every respect, in contrast with the frame-based approach where it is divided into equal-sized, smaller chunks. Doing this has many advantages, but also gives rise to some new probl...

1989
Jari Kangas, Teuvo Kohonen

Discrimination between the voiceless stop consonants /k, p, t/ is a subproblem in phoneme-based speech recognition systems. Lack of energy during the pronunciation and the fast transient effects at the end of the phoneme make the recognition difficult. A method of so-called Phonotopic Maps [2] was studied in order to develop simple and effective solutions for discrimination. In the following stu...

2000
Mark J. Embrechts, Fabio A. Arciniegas

This paper presents two different artificial neural network approaches for phoneme recognition for text-to-speech applications: Staged Backpropagation Neural Networks and Self-Organizing Maps. Several current commercial approaches rely on an exhaustive dictionary approach for text-to-phoneme conversion. Applying neural networks for phoneme mapping for text-to-speech conversion creates a fast dis...

Journal: EURASIP J. Adv. Sig. Proc. 2011
Leandro Daniel Vignolo, Hugo Leonardo Rufiner, Diego H. Milone, John C. Goddard

Mel-frequency cepstral coefficients have long been the most widely used type of speech representation. They were introduced to incorporate biologically inspired characteristics into artificial speech recognizers. Recently, the introduction of new alternatives to the classic mel-scaled filterbank has led to improvements in the performance of phoneme recognition in adverse conditions. In this wor...
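
To make the "classic mel-scaled filterbank" mentioned above concrete, here is a minimal sketch of how filter center frequencies are commonly placed at equal spacing on the mel scale. It assumes one widely used Hz-to-mel formula, 2595 · log10(1 + f/700); the abstract does not say which variant the authors use, and the function names are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    # One common mel-scale formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_center_frequencies(f_min, f_max, n_filters):
    """Center frequencies (Hz) of n_filters bands equally spaced in mel."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


if __name__ == "__main__":
    # Ten illustrative bands between 0 Hz and 8 kHz.
    for f in mel_center_frequencies(0.0, 8000.0, 10):
        print(round(f, 1))
```

Because the spacing is uniform in mel rather than in Hz, the resulting bands are narrow at low frequencies and progressively wider at high frequencies.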

2010
Garimella S. V. S. Sivaram, Sriram Ganapathy, Hynek Hermansky

This paper introduces the sparse auto-associative neural network (SAANN) in which the internal hidden layer output is forced to be sparse. This is achieved by adding a sparse regularization term to the original reconstruction error cost function, and updating the parameters of the network to minimize the overall cost. We show applicability of this network to phoneme recognition by extracting sp...
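
The cost described above (reconstruction error plus a sparsity term on the hidden layer) can be sketched as follows. This is an illustration only: the abstract does not state which sparsity penalty is used, so an L1 penalty on the hidden activations is assumed here, and the function name and the weight `lam` are hypothetical.

```python
def sparse_autoassociative_cost(x, x_hat, hidden, lam=0.1):
    """Squared reconstruction error plus an (assumed) L1 sparsity penalty.

    x      -- input feature vector
    x_hat  -- the network's reconstruction of x
    hidden -- activations of the internal hidden layer
    lam    -- hypothetical weight of the sparsity term
    """
    reconstruction_error = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    sparsity_penalty = sum(abs(h) for h in hidden)
    return reconstruction_error + lam * sparsity_penalty


if __name__ == "__main__":
    # Same reconstruction, two hidden codes: the sparser one costs less.
    x, x_hat = [1.0, 2.0], [1.0, 1.0]
    print(sparse_autoassociative_cost(x, x_hat, [0.5, -0.5]))
    print(sparse_autoassociative_cost(x, x_hat, [1.0, 0.0, 0.0]) <=
          sparse_autoassociative_cost(x, x_hat, [0.5, 0.5, 0.5]))
```

Minimizing this combined cost with respect to the network parameters trades reconstruction accuracy against how many hidden units are active.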

1999
Ahmed M. Abdelatty Ali, Jan Van der Spiegel, Paul Mueller, G. Haentjens, J. Berman

An acoustic-phonetic feature- and knowledge-based system for the automatic segmentation, broad categorization and fine phoneme recognition of continuous speech is described. The system uses auditory-based front-end processing and incorporates new knowledge-based algorithms to automatically segment the speech into phoneme-like segments that are further categorized into 4 main categories: sonor...

2011
Sabato Marco Siniscalchi, Torbjørn Svendsen, Chin-Hui Lee

A bottom-up, stepwise, knowledge integration framework is proposed to realize detection-based, large vocabulary continuous speech recognition (LVCSR) with a weighted finite state machine (WFSM). The WFSM framework offers a flexible architecture for different types of knowledge network compositions, each of which can be built and optimized independently. Speech attribute detectors are used as an ...

2015
Christopher Liberatore, Sandesh Aryal, Zelun Wang, Seth Polsley, Ricardo Gutierrez-Osuna

We present SABR (Sparse, Anchor-Based Representation), an analysis technique to decompose the speech signal into speaker-dependent and speaker-independent components. Given a collection of utterances for a particular speaker, SABR uses the centroid for each phoneme as an acoustic “anchor,” then applies Lasso regularization to represent each speech frame as a sparse non-negative combination of t...
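
The idea of writing a frame as a sparse non-negative combination under Lasso regularization can be sketched with a small projected-gradient solver. This is not the authors' implementation: it assumes the combination is taken over per-phoneme anchor vectors, and the solver, the function name, and the values of `lam` and `iters` are all illustrative choices.

```python
def nonneg_lasso_weights(frame, anchors, lam=0.5, iters=500):
    """Approximately minimize 0.5*||frame - sum_k w[k]*anchors[k]||^2
    + lam*sum(w) subject to w >= 0, by projected gradient descent."""
    k, d = len(anchors), len(frame)
    # The squared Frobenius norm upper-bounds the gradient's Lipschitz
    # constant, so 1/lipschitz is a safe step size.
    lipschitz = sum(v * v for a in anchors for v in a) or 1.0
    step = 1.0 / lipschitz
    w = [0.0] * k
    for _ in range(iters):
        # Residual of the current reconstruction.
        recon = [sum(w[j] * anchors[j][i] for j in range(k)) for i in range(d)]
        resid = [r - f for r, f in zip(recon, frame)]
        # Gradient step, then clip at zero to keep the weights non-negative.
        for j in range(k):
            grad = sum(anchors[j][i] * resid[i] for i in range(d))
            w[j] = max(0.0, w[j] - step * (grad + lam))
    return w


if __name__ == "__main__":
    # Three toy "anchors" (stand-ins for phoneme centroids) and one frame.
    anchors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(nonneg_lasso_weights([2.0, 0.2, 0.0], anchors))
```

In the toy example the frame's small second component is shrunk to exactly zero by the penalty, so only the first anchor receives a non-zero weight.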
