phoneme recognition

Relative Contributions of Spectral and Temporal Cues to Korean Phoneme Recognition

Journal: PLOS ONE 2015

Speaker adaptation for non-native speakers using bilingual English lexicon and acoustic models

2003

Shoichi Matsunaga Atsunori Ogawa Yoshikazu Yamaguchi Akihiro Imamura

This paper proposes a supervised speaker adaptation method that is effective for both non-native (i.e. Japanese) and native English speakers’ pronunciation of English speech. This method uses English and Japanese phoneme acoustic models and a pronunciation lexicon in which each word has both English and Japanese phoneme transcriptions. The same utterances are used for adaptation of both acousti...

Detection of Phoneme Boundaries Using Spiking Neurons

2008

Gábor Gosztolya László Tóth

Automatic speech recognition (ASR) is an area where the task is to assign the correct phoneme or word sequence to an utterance. The idea behind the ASR segment-based approach is to treat one phoneme as a whole unit in every respect, in contrast with the frame-based approach, where it is divided into equal-sized, smaller chunks. Doing this has many advantages, but also gives rise to some new probl...

Transient map method in stop consonant discrimination

1989

Jari Kangas Teuvo Kohonen

Discrimination between the voiceless stop consonants /k, p, t/ is a subproblem in phoneme-based speech recognition systems. Lack of energy during the pronunciation and the fast transient effects at the end of the phoneme make the recognition difficult. A method of so-called Phonotopic Maps [2] was studied in order to develop simple and effective solutions for discrimination. In the following stu...

Neural networks for text-to-speech phoneme recognition

2000

Mark J. Embrechts Fabio A. Arciniegas

This paper presents two different artificial neural network approaches for phoneme recognition for text-to-speech applications: Staged Backpropagation Neural Networks and Self-Organizing Maps. Several current commercial approaches rely on an exhaustive dictionary approach for text-to-phoneme conversion. Applying neural networks for phoneme mapping for text-to-speech conversion creates a fast dis...

Evolutionary Splines for Cepstral Filterbank Optimization in Phoneme Classification

Journal: EURASIP J. Adv. Sig. Proc. 2011

Leandro Daniel Vignolo Hugo Leonardo Rufiner Diego H. Milone John C. Goddard

Mel-frequency cepstral coefficients have long been the most widely used type of speech representation. They were introduced to incorporate biologically inspired characteristics into artificial speech recognizers. Recently, the introduction of new alternatives to the classic mel-scaled filterbank has led to improvements in the performance of phoneme recognition in adverse conditions. In this wor...

Sparse auto-associative neural networks: theory and application to speech recognition

2010

Garimella S. V. S. Sivaram Sriram Ganapathy Hynek Hermansky

This paper introduces the sparse auto-associative neural network (SAANN) in which the internal hidden layer output is forced to be sparse. This is achieved by adding a sparse regularization term to the original reconstruction error cost function, and updating the parameters of the network to minimize the overall cost. We show applicability of this network to phoneme recognition by extracting sp...
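
The cost described in this abstract (reconstruction error plus a sparsity term on the hidden layer) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which sparsity penalty is used, so an L1 penalty on the hidden activations is assumed here, and the function name `saann_cost`, the tanh hidden layer, and the `sparsity_weight` parameter are all hypothetical.

```python
import numpy as np

def saann_cost(x, w_enc, b_enc, w_dec, b_dec, sparsity_weight=0.1):
    """Return (total, reconstruction, sparsity) for a one-hidden-layer auto-associator.

    x: (n_frames, dim) input features.
    The L1 penalty and the tanh nonlinearity are assumptions, not taken from the paper.
    """
    h = np.tanh(x @ w_enc + b_enc)      # hidden-layer output that should be sparse
    x_hat = h @ w_dec + b_dec           # reconstruction of the input
    reconstruction = np.mean(np.sum((x - x_hat) ** 2, axis=1))
    sparsity = np.mean(np.sum(np.abs(h), axis=1))  # assumed L1 sparsity term
    total = reconstruction + sparsity_weight * sparsity
    return total, reconstruction, sparsity
```

Training would then update the weights to minimize `total`; a larger `sparsity_weight` trades reconstruction accuracy for sparser hidden outputs.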

An acoustic-phonetic feature-based system for automatic phoneme recognition in continuous speech

1999

Ahmed M. Abdelatty Ali Jan Van der Spiegel Paul Mueller G. Haentjens J. Berman

An acoustic-phonetic feature- and knowledge-based system for the automatic segmentation, broad categorization and fine phoneme recognition of continuous speech is described. The system uses auditory-based front-end processing and incorporates new knowledge-based algorithms to automatically segment the speech into phoneme-like segments that are further categorized into 4 main categories: sonor...

A Bottom-Up Stepwise Knowledge-Integration Approach to Large Vocabulary Continuous Speech Recognition Using Weighted Finite State Machines

2011

Sabato Marco Siniscalchi Torbjørn Svendsen Chin-Hui Lee

A bottom-up, stepwise, knowledge integration framework is proposed to realize detection-based, large vocabulary continuous speech recognition (LVCSR) with a weighted finite state machine (WFSM). The WFSM framework offers a flexible architecture for different types of knowledge network compositions, each of which can be built and optimized independently. Speech attribute detectors are used as an ...

SABR: sparse, anchor-based representation of the speech signal

2015

Christopher Liberatore Sandesh Aryal Zelun Wang Seth Polsley Ricardo Gutierrez-Osuna

We present SABR (Sparse, Anchor-Based Representation), an analysis technique to decompose the speech signal into speaker-dependent and speaker-independent components. Given a collection of utterances for a particular speaker, SABR uses the centroid for each phoneme as an acoustic “anchor,” then applies Lasso regularization to represent each speech frame as a sparse non-negative combination of t...
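
The two steps named in this abstract (per-phoneme centroids as anchors, then a sparse non-negative combination of anchors for each frame) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the function names, the `lam` penalty weight, and the projected proximal-gradient solver are assumptions chosen to keep the example dependency-free.

```python
import numpy as np

def phoneme_anchors(frames, labels):
    """Compute one centroid ("anchor") per phoneme label.

    frames: (n_frames, dim) feature array; labels: one phoneme label per frame.
    """
    phonemes = sorted(set(labels))
    labels = np.asarray(labels)
    anchors = np.stack([frames[labels == p].mean(axis=0) for p in phonemes])
    return phonemes, anchors

def nonneg_lasso(frame, anchors, lam=0.01, n_iter=500):
    """Minimize 0.5 * ||frame - anchors.T @ w||^2 + lam * sum(w) subject to w >= 0.

    Solved with projected proximal-gradient steps (an assumed solver choice).
    """
    A = anchors.T                               # (dim, n_anchors)
    step = 1.0 / (np.linalg.norm(A, 2) ** 2)    # 1 / Lipschitz constant of the gradient
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ w - frame)
        # Gradient step, L1 shrinkage by lam, then projection onto w >= 0
        w = np.maximum(0.0, w - step * (grad + lam))
    return w
```

In this sketch the weight vector `w` plays the role of the anchor weights for a frame, and the anchors carry the speaker-specific information; how the truncated abstract goes on to use them is not assumed here.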
