The Paradigm for Creating Multi-lingual Text-To-Speech Voice Databases

Authors

Min Chu

Yong Zhao

Yining Chen

Lijuan Wang

Frank K. Soong

Abstract

The voice database is one of the most important parts of a TTS system. However, creating a high-quality new TTS voice is not an easy task, even for a professional team. The whole process is rather complicated and involves many details that must be handled carefully. In fact, many stages require human intervention such as manual checking or labeling. In multi-lingual situations, it is even more challenging to find qualified people to do this work. That is why most state-of-the-art TTS systems provide only a few voices. In this paper, we outline a uniform paradigm for creating multi-lingual TTS voice databases. It focuses on technologies that either improve the scalability of data collection or reduce human intervention such as manual checking or labeling. With this paradigm, we decrease the complexity and workload of the task.

Similar resources

Proceedings of Meetings on Acoustics

India possesses a large variety of languages and dialects spoken in different parts of the country. These languages possess some unique linguistic, phonological and phonetic properties different from European languages. Research is being done in several Indian languages such as Hindi, Bangla, etc. to study the articulatory, acoustic, phonetic and prosodic nature for the purpose of creating s...

Development of multi-voice and multi-language Text-to-Speech (TTS) and Speech-to-Text (STT) conversion system (languages: Belorussian, Polish, Russian)

This proposal was submitted to the INTAS Thematic Call on Information Technology 2004. Participants from 4 countries (Belarus, Poland, Russia and Germany) contribute to the activity. The overall coordination, monitoring and control of the project will be implemented by Project Coordinator: Prof. Dr. Ruediger Hoffmann (Team 1, IAS TUD, Germany) and Decision Board: Prof. Dr. Ruediger Hoffmann (T...

Intra-Lingual and Cross-Lingual Prosody Modelling

Statistical Parametric Speech Synthesis (SPSS) offers flexibility and computational advantage compared to other methods for Text-to-Speech Synthesis. While the speech output is intelligible, statistically trained voices are less natural due to the amount of signal processing and statistical averaging that goes into building the models. Much of the blame for the lack of naturalness falls on the ...

Multi-lingual and Multi-modal Speech Processing and Applications

Over the last decade, voice technologies for telephony and embedded solutions became much more mature, resulting in applications providing mobile access to digital information from anywhere. Both a growing demand for voice-driven applications in many languages and the need for improved usability and user experience now drive the exploration of multi-lingual speech processing techniques for reco...

Stages and method of preparing syllable and diphone speech databases for a Persian text-to-speech system

Speech databases are part of concatenative text-to-speech synthesis systems. The phonetic quality of the databases plays a significant role in the naturalness of the synthesized speech. This paper introduces two speech databases for Persian, one syllable-based and one diphone-based, and examines how they were developed, their specifications, and their advantages relative to each other. ...

Publication date: 2006
