phoneme

Optimization of Phoneme-based VQ Codebook in a DHMM System

1994

Yaxin Zhang Roberto Togneri Chris deSilva Mike Alder

A phoneme-based Gaussian mixture VQ codebook can improve the conventional DHMM system performance significantly. In this paper, an optimization method for the phoneme-based VQ codebook is proposed. The experimental results show that the optimized phoneme-based VQ codebook leads to both an improvement in system performance and a reduction in system complexity.


The use of wavelet transforms in phoneme recognition

1996

Beng T. Tan Minyue Fu Andrew Spray Phillip Dermody

This study investigates the usefulness of wavelet transforms in phoneme recognition. Both discrete wavelet transforms (DWT) and sampled continuous wavelet transforms (SCWT) are tested. The wavelet transform is used as a part of the front-end processor which extracts feature vectors for a speaker-independent HMM-based phoneme recognizer. The results are evaluated on a portion of TIMIT corpus cons...


Enhancing Phoneme Recognizer Performance with a Simple Rule-based Language Model

2002

Pertti Väyrynen Johannes Peltola Tapio Seppänen

The phoneme classification inaccuracy at the acoustic phonetic level is a major weakness in most speech recognition systems. However, the inaccuracy will violate phonotactic constraints at the acoustic phonetic level. A better performance is expected if a language model is adopted in a recognition system for post-processing phoneme estimates and making corrections with a set of explicit rules o...


Phoneme recognition using visual features on speech spectrograms

1987

Shigeru Katagiri Manami Yokota

In order to apply speech spectrogram reading heuristics to an automatic speech recognition system, a more accurate expression of the heuristics must be developed. In particular, the transformation between acoustic feature measurements and phoneme candidates must be developed in a quantitative manner. In this paper, a visual acoustic-feature label and a phoneme identification approach using this ...


Learning lexicons from spoken utterances based on statistical model selection

2009

Ryo Taguchi Naoto Iwahashi Takashi Nose Kotaro Funakoshi Mikio Nakano

This paper proposes a method for the unsupervised learning of lexicons from pairs of a spoken utterance and an object as its meaning without any a priori linguistic knowledge other than a phoneme acoustic model. In order to obtain a lexicon, a statistical model of the joint probability of a spoken utterance and an object is learned based on the minimum description length principle. This model c...


Phoneme Dedicated ANN Improves Segmental Duration Model

2008

João Paulo Teixeira Diamantino Freitas

The Phoneme Dedicated Artificial Neural Network (PDANN) segmental duration model consists of a set of ANNs trained specifically for each phoneme segment in order to avoid miscellaneous influence of different types of phoneme segments. Therefore, each ANN is dedicated to predict the duration of a specific phoneme segment. Objective and subjective measurements of the performance of the PDANN mode...


Improving Non-Native ASR Through Stochastic Multilingual Phoneme Space Transformations

2011

David Imseng Hervé Bourlard John Dines Philip N. Garner Mathew Magimai-Doss

We propose a stochastic phoneme space transformation technique that allows the conversion of conditional source phoneme posterior probabilities (conditioned on the acoustics) into target phoneme posterior probabilities. The source and target phonemes can be in any language and phoneme format such as the International Phonetic Alphabet. The novel technique makes use of a Kullback-Leibler diverge...


Towards Lower Error Rates in Phoneme Recognition

2004

Petr Schwarz Pavel Matejka Jan Cernocký

We investigate techniques for acoustic modeling in automatic recognition of context-independent phoneme strings from the TIMIT database. The baseline phoneme recognizer is based on TempoRAl Patterns (TRAP). This recognizer is simplified to shorten processing times and reduce computational requirements. More states per phoneme and bi-gram language models are incorporated into the system and eval...


Perceptual Development of a New Phoneme Contrast by Adult and 12-year-old Listeners

2004

Willemijn Heeren

How does the perception of a new phoneme contrast develop? Are differences found across age groups? In answering these questions, we use two alternative hypotheses: i) Acquired Distinctiveness: before learning, differences between and within phoneme categories are relatively hard to discriminate. Through training, the phoneme boundary is learned. ii) Acquired Similarity: before learning, differ...


A method of generating English pronunciation dictionary for Japanese English recognition systems

2000

Tadashi Suzuki Jun Ishii Kunio Nakajima

In this paper, we propose a method for generating a pronunciation dictionary—extracting typical pronunciations for each word from speech data uttered by Japanese speakers—as one approach to speech recognition targeting English speech uttered by Japanese speakers whose mother tongue is not English. This method includes three processes: a process in which English phoneme HMMs (Hidden Markov Model...
