Search results for: phoneme classification

Number of results: 496610

1998
Axel Glaeser

We present a Modular Neural Network (MNN) for phoneme recognition within the framework of a hybrid system (neural networks and HMMs) for speaker-independent single word recognition. This approach takes computational effort into account as an additional criterion for assessing system performance. The main idea of the proposed MNN is the distribution of the comp...

2002
Tommi Lahti

In this paper, a novel Out-of-Vocabulary (OOV) word detection method relying on phoneme-level acoustic measures and Support Vector Machines (SVM) is proposed. Word-level OOV scores are computed from the phoneme-level in-vocabulary (IV) and OOV information provided by an HMM-based speech recognizer. The OOV word decision is based on the confidence feature vector which is processed by an SVM class...

2011
Shang-wen Li Yow-Bang Wang Liang-Che Sun Lin-Shan Lee

We propose an improved Tandem system for tonal language speech recognition. Three different types of features, cepstral, spectro-temporal and pitch features, are integrated for modeling tone and phoneme variation simultaneously. Tonal phonemes (or tonemes) are used for MLP posterior estimation, and tonal acoustic units for HMM recognition. In our experiments conducted on Mandarin broadcast news...

2014
Gábor Gosztolya József Dombi

When combining classifiers, we aggregate the output of different machine learning methods, and base our decision on the aggregated probability values instead of the individual ones. In the phoneme classification task of speech recognition, small excerpts of speech need to be identified as one of the pre-defined phonemes; but the probability value assigned to each possible phoneme also holds valu...

2007
Sittichai Jiampojamarn Grzegorz Kondrak Tarek Sherif

Letter-to-phoneme conversion generally requires aligned training data of letters and phonemes. Typically, the alignments are limited to one-to-one alignments. We present a novel technique of training with many-to-many alignments. A letter chunking bigram prediction manages double letters and double phonemes automatically as opposed to preprocessing with fixed lists. We also apply an HMM method ...

2006
Iosif Mporas Todor Ganchev Panagiotis Zervas Nikos Fakotakis

In the present work we study the applicability of Support Vector Machines (SVMs) to the phoneme recognition task. Specifically, the Least Squares version of the algorithm (LS-SVM) is employed in recognition of the Greek phonemes in the framework of a telephone-driven voice-enabled information service. The N-best candidate phonemes are identified and subsequently fed to the speech and language re...

2010
Jibran Yousafzai Zoran Cvetković Peter Sollich

This work is concerned with improving the robustness of phoneme classification to additive noise with hybrid features using support vector machines (SVMs). In particular, the cepstral features are combined with local energy features of acoustic waveform segments to form a hybrid representation. The local energy features are taken into account separately in the SVM kernel, and a simple subtracti...

2017
Chunxi Liu Jan Trmal Matthew Wiesner Craig Harman Sanjeev Khudanpur

Modern topic identification (topic ID) systems for speech use automatic speech recognition (ASR) to produce speech transcripts, and perform supervised classification on such ASR outputs. However, under resource-limited conditions, the manually transcribed speech required to develop standard ASR systems can be severely limited or unavailable. In this paper, we investigate alternative unsupervise...

2013
Keigo Kubo Sakriani Sakti Graham Neubig Tomoki Toda Satoshi Nakamura

The current state-of-the-art approach in grapheme-to-phoneme (g2p) conversion is structured learning based on the Margin Infused Relaxed Algorithm (MIRA), which is an online discriminative training method for multiclass classification. However, it is known that the aggressive weight update method of MIRA is prone to overfitting, even if the current example is an outlier or noisy. Adaptive Regul...

2011
Shinsuke Mori Graham Neubig

In this paper, we propose a pointwise approach to the Japanese TTS front-end. In this approach, phoneme sequence estimation of sentences is decomposed into two tasks: word segmentation of the input sentence and phoneme estimation of each word. Then these two tasks are solved by pointwise classifiers without referring to the neighboring classification results. In contrast to existing sequence-ba...
