phoneme recognition

Arabic phonemes transcription using data driven approach

Journal: Int. Arab J. Inf. Technol. 2015

Khalid M. O. Nahar, Husni Al-Muhtaseb, Wasfi G. Al-Khatib, Moustafa Elshafei, Mansour Al-Ghamdi

The efficiency and correctness of continuous Arabic Speech Recognition Systems (ARS) hinge on the accuracy of the language phoneme set. The main goal of this research is to recognize and transcribe Arabic phonemes using a data-driven approach. We used the Hidden Markov Model Toolkit (HTK) to develop a phoneme recognizer, carrying out several experiments with different parameters, such as varying numb...

Using phoneme recognition and text-dependent speaker verification to improve speaker segmentation for Chinese speech

2010

Gang Wang, Xiaojun Wu, Thomas Fang Zheng

Speaker segmentation is widely used in many tasks such as multi-speaker detection and speaker tracking. Segmentation performance depends to a large extent on the performance of speaker verification (SV) between two short utterances, so improving SV performance for short utterances would greatly benefit segmentation. In this paper, a method based on phoneme rec...

Completely Unsupervised Phoneme Recognition by Adversarially Learning Mapping Relationships from Audio Embeddings

2018

Da-Rong Liu, Kuan-Yu Chen, Hung-Yi Lee, Lin-shan Lee

Unsupervised discovery of acoustic tokens from audio corpora without annotation and learning vector representations for these tokens have been widely studied. Although these techniques have been shown to be successful in some applications such as query-by-example Spoken Term Detection (STD), the lack of mapping relationships between these discovered tokens and real phonemes has limited the down-strea...

Selected Papers of the IEEE International Conference on Computer and Information Technology

2010

Manzur Murshed, Manoranjan Paul, Shuqun Zhang, Mohammad Ataul Karim, Bernd J. Kröger, Dongmei Wang, Xiaozhong Yang, Chia-Chen Chang, Gang Xie, Tianrui Cao, Chengdong Yan, Zhifeng Wu

This paper presents a distinctive phonetic features (DPFs) based phoneme recognition method by incorporating syllable language models (LMs). The method comprises three stages. The first stage extracts three DPF vectors of 15 dimensions each from local features (LFs) of an input speech signal using three multilayer neural networks (MLNs). The second stage incorporates an Inhibition/Enhancement (...

Recurrent Neural Network-Based Phoneme Sequence Estimation Using Multiple ASR Systems' Outputs for Spoken Term Detection

2016

Naoki Sawada, Hiromitsu Nishizaki

This paper describes a novel correct phoneme sequence estimation method that uses a recurrent neural network (RNN)-based framework for spoken term detection (STD). In an automatic speech recognition (ASR)-based STD framework, ASR performance (word or subword error rate) affects STD performance. Therefore, it is important to reduce ASR errors to obtain good STD results. In this study, we use an ...
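
The subword (phoneme) error rate mentioned in this abstract is conventionally computed as the edit distance between a reference and a hypothesized phoneme sequence, divided by the reference length. The following is a minimal sketch of that standard metric only; it is not the estimation method of the paper, and the function names and the example phoneme labels are illustrative assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of phoneme labels."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1       # phoneme in ref missing from hyp
            insertion = curr[j - 1] + 1  # extra phoneme in hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref, hyp):
    """Edit distance normalized by the reference length."""
    if not ref:
        raise ValueError("reference sequence must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: one substitution and one insertion against 3 reference phonemes.
reference = ["k", "ae", "t"]
hypothesis = ["k", "ah", "t", "s"]
rate = phoneme_error_rate(reference, hypothesis)
```

Because insertions are counted, the rate can exceed 1.0 when the hypothesis is much longer than the reference.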

A study of noise robustness for speaker independent speech recognition method using phoneme similarity vector

1994

Masakatsu Hoshimi, Maki Yamada, Katsuyuki Niyada, Shozo Makino

As an input method for rapidly spreading small portable information devices, development of speaker independent speech recognition technology which can be embedded on a single DSP is now urgently requested. We have reported a speech recognition method using phoneme similarity vector as a feature vector, which is quite robust for reduction of precision of the feature parameter. We’ve also develo...

Finding phonemes: improving machine lip-reading

2015

Helen L. Bear, Richard Harvey, Yuxuan Lan

In machine lip-reading there is continued debate and research around the correct classes to be used for recognition. In this paper we use a structured approach for devising speaker-dependent viseme classes, which enables the creation of a set of phoneme-to-viseme maps where each has a different quantity of visemes ranging from two to 45. Viseme classes are based upon the mapping of articulated ...
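
A phoneme-to-viseme map of the kind described here is a many-to-one mapping from phoneme labels to visual classes. The sketch below shows only that data structure and how a phoneme sequence is relabeled with it; the class names and groupings are hypothetical placeholders, not the speaker-dependent maps derived in the paper.

```python
# Hypothetical many-to-one map: several phonemes share one viseme class.
# These groupings are illustrative assumptions, not taken from the paper.
PHONEME_TO_VISEME = {
    "p": "V1", "b": "V1", "m": "V1",  # assumed bilabial group
    "f": "V2", "v": "V2",             # assumed labiodental group
    "aa": "V3", "ah": "V3",           # assumed open-vowel group
}


def to_visemes(phonemes, mapping=PHONEME_TO_VISEME, unknown="V_UNK"):
    """Relabel a phoneme sequence with viseme classes.

    Phonemes absent from the map are assigned the `unknown` label.
    """
    return [mapping.get(p, unknown) for p in phonemes]


def viseme_count(mapping=PHONEME_TO_VISEME):
    """Number of distinct viseme classes in a map."""
    return len(set(mapping.values()))


visemes = to_visemes(["b", "aa", "m", "z"])
```

A map with fewer classes merges more phonemes into each viseme, which is the quantity the abstract varies from two to 45.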

Predicting Pronunciations with Syllabification and Stress with Recurrent Neural Networks

2016

Daan van Esch, Mason Chua, Kanishka Rao

Word pronunciations, consisting of phoneme sequences and the associated syllabification and stress patterns, are vital for both speech recognition and text-to-speech (TTS) systems. For speech recognition, phoneme sequences for words may be learned from audio data. We train recurrent neural network (RNN) based models to predict the syllabification and stress pattern for such pronunciations making...

Second language speech recognition using multiple-pass decoding with lexicon represented by multiple reduced phoneme sets

2015

Xiaoyun Wang, Seiichi Yamamoto

Considering that the pronunciation of second language speech is usually influenced by the mother tongue, we previously proposed using a reduced phoneme set for second language when the mother tongue of speakers is known. However, the proficiency of second language speakers varies widely, as does the influence of mother tongue on their pronunciation. Consequently, the optimal phoneme set is depe...

Effects of Syllable Language Model on Distinctive Phonetic Features (DPFs) based Phoneme Recognition Performance

Journal: Journal of Multimedia 2010

Mohammad Nurul Huda, Manoj Banik, Muhammad Ghulam, Mashud Kabir, Bernd J. Kröger

This paper presents a distinctive phonetic features (DPFs) based phoneme recognition method by incorporating syllable language models (LMs). The method comprises three stages. The first stage extracts three DPF vectors of 15 dimensions each from local features (LFs) of an input speech signal using three multilayer neural networks (MLNs). The second stage incorporates an Inhibition/Enhancement (...
