Data Driven Approaches to Phonetic Transcription with Integration of Automatic Speech Recognition and Grapheme-to-Phoneme for Spoken Buddhist Sutra

نویسندگان

Min-Siong Liang

Ren-Yuan Lyu

Yuang-Chin Chiang

چکیده

We propose a new approach for performing phonetic transcription of text that utilizes automatic speech recognition (ASR) to help traditional grapheme-to-phoneme (G2P) techniques. This approach was applied to transcribe Chinese text into Taiwanese phonetic symbols. By augmenting the text with speech and using automatic speech recognition with a sausage searching net constructed from multiple pronunciations of text, we are able to reduce the error rate of phonetic transcription. Using a pronunciation lexicon with multiple pronunciations for each item, a transcription error rate of 12.74% was achieved. Further improvement can be achieved by adapting the pronunciation lexicon with pronunciation variation (PV) rules derived manually from corrected transcription in a speech corpus. The PV rules can be categorized into two kinds: knowledge-based and data-driven rules. By incorporating the PV rules, an error rate of 10.56% could be achieved. Although this technique was developed for Taiwanese speech, it could easily be adapted to other Chinese spoken languages or dialects.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The multiple pronunciations in Taiwanese and the automatic transcription of Buddhist sutra with augmented read speech

Collection of Taiwanese text corpus with phonetic transcription suffers from the problems of multiple pronunciation, or pronunciation variation. By further augmenting the text with read speech, and using automatic speech recognition with a sausage searching net constructed from the multiple pronunciations of the text corresponding to its speech utterance, we are able to reduce the effort for ph...

متن کامل

Multigram-based grapheme-to-phoneme conversion for LVCSR

Many important speech recognition tasks feature an open, constantly changing vocabulary. (E.g. broadcast news transcription, spoken document retrieval, . . . ) Recognition of (new) words requires acoustic baseforms for them to be known. Commonly words are transcribed manually, which poses a major burden on vocabulary adaptation and interdomain portability. In this work we investigate the possib...

متن کامل

Allophone-based acoustic modeling for Persian phoneme recognition

Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...

متن کامل

Design and Implementation of the Slovenian Phonetic and Morphology Lexicons for the Use in Spoken Language Applications

Phonetic and Morphology Lexicons that can be used in Spoken Language Applications are costly and time-consuming to build. This paper reports on a project aiming at the semi-automatic development of large phonetic (SIflex) and morphology (SImlex) lexicons for Slovenian language. The main goal of the project is to build the phonetic and morphology lexicon for Slovenian language that will be used ...

متن کامل

Generating proper name pro for automatic speech

Generating correct pronunciation of proper names remains one of the most difficult tasks in text-to-phoneme transcription. Although phonetic rules can be efficient in processing proper names of one language, foreign family names cannot be always correctly generated without additional pronunciation rules. The present study addresses the problem of pronunciation variants for French and foreign fa...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2008

Data Driven Approaches to Phonetic Transcription with Integration of Automatic Speech Recognition and Grapheme-to-Phoneme for Spoken Buddhist Sutra

نویسندگان

چکیده

منابع مشابه

The multiple pronunciations in Taiwanese and the automatic transcription of Buddhist sutra with augmented read speech

Multigram-based grapheme-to-phoneme conversion for LVCSR

Allophone-based acoustic modeling for Persian phoneme recognition

Design and Implementation of the Slovenian Phonetic and Morphology Lexicons for the Use in Spoken Language Applications

Generating proper name pro for automatic speech

عنوان ژورنال:

اشتراک گذاری