A multi-pass error detection and correction framework for Mandarin LVCSR

Authors

Zhengyu Zhou

Helen M. Meng

Wai Kit Lo

Abstract

We previously proposed a multi-pass framework for Large Vocabulary Continuous Speech Recognition (LVCSR). The objective of this framework is to apply sophisticated linguistic models to recognition while balancing complexity and efficiency. The framework comprises three passes: initial recognition, error detection, and error correction. This paper presents and evaluates a prototype of the framework for Mandarin dictation. In this prototype, the first pass recognizes speech with a well-trained state-of-the-art recognizer incorporating an efficient language model; the second pass detects recognition errors with a new three-step error detection procedure; and the third pass corrects the errors detected in lightly erroneous utterances with a novel error correction approach. The error correction algorithm first creates a candidate list for each detected error and then re-ranks the candidates with a combined model of mutual information and trigram. Mandarin dictation experiments show a 4% relative reduction in character error rate (CER) over the initial recognition performance on the lightly erroneous utterances detected.
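The re-ranking step described in the abstract can be illustrated with a small sketch. The abstract does not say how the mutual information and trigram scores are combined, so the linear interpolation below is an assumption, as are all function names, the `weight` and `floor` parameters, and the toy scores. It is not the authors' implementation.

```python
# Hypothetical sketch: re-rank correction candidates for one detected error
# with a weighted sum of a trigram score and a mutual-information score.
# The interpolation scheme and every number below are illustrative assumptions.

def trigram_score(left, cand, right, trigram_lp, floor=-10.0):
    """Sum the log-probabilities of the three trigrams that contain `cand`.

    `left` and `right` are the two tokens before and after the error position.
    Trigrams missing from the table receive the `floor` log-probability.
    """
    seq = list(left) + [cand] + list(right)
    return sum(
        trigram_lp.get((seq[i - 2], seq[i - 1], seq[i]), floor)
        for i in range(2, len(seq))
    )

def mi_score(cand, context, pmi, floor=0.0):
    """Average pointwise mutual information between `cand` and the context."""
    if not context:
        return 0.0
    return sum(pmi.get((cand, w), floor) for w in context) / len(context)

def rerank(candidates, left, right, context, trigram_lp, pmi, weight=0.5):
    """Return the candidates sorted from best to worst combined score."""
    scored = [
        (
            weight * trigram_score(left, c, right, trigram_lp)
            + (1.0 - weight) * mi_score(c, context, pmi),
            c,
        )
        for c in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored]

# Toy example: the recognizer output "再" at the error position; "在" is a
# homophone candidate. All scores are made up for illustration.
left, right = ("我", "们"), ("学", "校")
context = list(left) + list(right)
trigram_lp = {
    ("我", "们", "在"): -1.0,
    ("们", "在", "学"): -1.5,
    ("在", "学", "校"): -0.5,
    ("我", "们", "再"): -3.0,
}
pmi = {("在", "学"): 1.2, ("在", "校"): 1.0, ("再", "们"): 0.1}

ranked = rerank(["再", "在"], left, right, context, trigram_lp, pmi)
print(ranked)
```

In this sketch the trigram term captures local fluency around the error position, while the mutual-information term rewards candidates that tend to co-occur with the other characters of the utterance, which is one plausible reading of why the two models are combined.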

Similar resources

A New Framework for Mandarin Lvcsr Based on One-pass Decoder

This paper describes a new framework for Mandarin LVCSR based on one-pass decoding and decision-tree-based class-triphone acoustic modeling. Compared with a multi-pass decoder, it should be more knowledgeable and efficient, since all knowledge sources are used at the same time, provided the decoder is well organized and optimized. We give details on the organization of our one-pass decoder and how to handle th...

Update progress of Sinohear: advanced Mandarin LVCSR system at NLPR

NLPR has long worked on Mandarin speech recognition. This paper reports our recent progress in this field, which has several significant novel characteristics: 1) very large speech databases are used to learn a more robust acoustic model; 2) the acoustic model has evolved from non-tonal class-triphone to tonal class-triphone based on a tone-embedded decision tree, namely unified tone & triphone mod...

Integrating Multi-level Linguistic Knowledge with a Unified Framework for Mandarin Speech Recognition

To improve Mandarin large vocabulary continuous speech recognition (LVCSR), an approach based on a unified framework is introduced to exploit multi-level linguistic knowledge. In this framework, each knowledge source is represented by a Weighted Finite State Transducer (WFST), and the transducers are then combined to obtain a so-called analyzer for integrating multi-level knowledge sources. Due to the unifo...

An Empirical Study of Word Error Minimization Approaches for Mandarin Large Vocabulary Continuous Speech Recognition

This paper presents an empirical study of word error minimization approaches for Mandarin large vocabulary continuous speech recognition (LVCSR). First, the minimum phone error (MPE) criterion, which is one of the most popular discriminative training criteria, is extensively investigated for both acoustic model training and adaptation in a Mandarin LVCSR system. Second, the word error minimizat...

Online speaker adaptation and tracking for real-time speech recognition

This paper describes a low-latency online speaker adaptation framework. The main objective is to apply fast speaker adaptation to a real-time (RT) large vocabulary continuous speech recognition (LVCSR) engine. In this framework, speaker adaptation is performed on speaker turns generated by online speaker change detection and speaker clustering. To maximize long-term system performance, the adap...

Publication date: 2006
