Search results for: reporting error

Number of results: 342690

2008
Antoine Laurent Téva Merlin Sylvain Meignier Yannick Estève Paul Deléglise

Large vocabulary automatic speech recognition (ASR) technologies perform well in known, controlled contexts. However, recognition of proper nouns is commonly considered a difficult task. Accurate phonetic transcription of a proper noun is difficult to obtain, although it can be one of the most important resources for a recognition system. In this article, we propose methods of automatic phone...

Journal: Speech Communication 1999
Judith M. Kessens Mirjam Wester Helmer Strik

This article describes how the performance of a Dutch continuous speech recognizer was improved by modeling pronunciation variation. We propose a general procedure for modeling pronunciation variation. In short, it consists of adding pronunciation variants to the lexicon, retraining phone models and using language models to which the pronunciation variants have been added. First, within-word pr...

2016
Ewald van der Westhuizen Thomas Niesler

We consider the phenomenon of postlexical deletion in fast spontaneously spoken isiZulu speech and its implication for automatic speech recognition (ASR). Analysis of hand-crafted transcripts of fast spontaneous speech recorded from broadcast media indicates that postlexical deletion, especially of vowels, is common in isiZulu. We show that ASR performance can be increased by inclusion of pronu...

2014
Patrick Cardinal Ahmed M. Ali Najim Dehak Yu Zhang Tuka Al Hanai Yifan Zhang James R. Glass Stephan Vogel

This paper describes a detailed comparison of several state-of-the-art speech recognition techniques applied to a limited Arabic broadcast news dataset. The different approaches were all trained on 50 hours of transcribed audio from the Al-Jazeera news channel. The best results were obtained using i-vector-based speaker adaptation in a training scenario using the Minimum Phone Error (MPE) criteri...

2006
Luis Buera Eduardo Lleida Juan Arturo Nolazco-Flores Antonio Miguel Alfonso Ortega

In a previous work, Multi-Environment Model based LInear Normalization (MEMLIN) was presented and shown to compensate effectively for environment mismatch. MEMLIN is an empirical feature vector normalization which models clean and noisy spaces by Gaussian Mixture Models (GMMs). In this algorithm, the probability of the clean model Gaussian, given the noisy model one and the noisy featur...

2007
Takanobu Oba Takaaki Hori Atsushi Nakamura

This paper focuses on an error-corrective method through reranking of hypotheses in speech recognition. Some recent work investigated corrective models that can be used to rescore hypotheses so that a hypothesis with a smaller error rate has a higher score. Discriminative training such as the perceptron algorithm can be used to estimate such corrective models. In discriminative training, how to cho...

2006
Cosmin Munteanu Gerald Penn Ronald Baecker Elaine Toms David James

The increased availability of broadband connections has recently led to an increase in the use of Internet broadcasting (webcasting). Most webcasts are archived and accessed numerous times retrospectively. One of the hurdles users face when browsing and skimming through archives is the lack of text transcripts of the audio channel of the webcast archive. In this paper, we proposed a procedure f...

2012
Ramya Rasipuram Mathew Magimai-Doss

In a recent work, we proposed an acoustic data-driven grapheme-to-phoneme (G2P) conversion approach, where the probabilistic relationship between graphemes and phonemes learned through acoustic data is used along with the orthographic transcription of words to infer the phoneme sequence. In this paper, we extend our studies to the under-resourced lexicon development problem. More precisely, given a...

2003
Yonggang Deng Milind Mahajan Alex Acero

We address the problem of estimating the word error rate (WER) of an automatic speech recognition (ASR) system without using acoustic test data. This is an important problem faced by the designers of new applications which use ASR. A quick estimate of WER early in the design cycle can be used to guide the decisions involving dialog strategy and grammar design. Our approach involves estim...

2016
Markus Kitza Albert Zeyer Ralf Schlüter Jahn Heymann Reinhold Häb-Umbach

In this paper we present a system for robust online far-field multi-channel speech recognition with minimal assumptions on microphone configuration and target location. We employ an online-enabled Generalized Eigenvalue (GEV) beamformer and a Long Short-Term Memory (LSTM) network to robustly calculate the signal statistics necessary for the beamforming operation in the front-end. After multiple ...
