A new Japanese TTS system based on speech-prosody database and speech modification

Authors

  • Mitsuaki Isogai
  • Kimihito Tanaka
  • Satoshi Takano
  • Hideyuki Mizuno
  • Masanobu Abe
  • Shin'ya Nakajima
Abstract

This paper describes a new Japanese text-to-speech (TTS) system that can produce highly natural and intelligible synthetic speech. The system's performance derives from three new approaches: (1) a new prosody control algorithm that uses prosody data extracted from a natural speech database, together with a duration control algorithm based on statistical estimation; (2) a new type of synthesis unit that consists of a consonant and the vowel chain that follows it, which suppresses unnatural sounds and acoustic discontinuities at concatenation points by providing synthesis units of various lengths and with various F0 contours; and (3) a new speech modification algorithm with harmonics reconstruction. Listening tests are carried out to evaluate the new modules and the overall performance of the system. The results confirm that the new modules work together effectively and that the system can produce high-quality synthesized speech.

Similar resources

Designing Target Cost Function Based on Prosody of Speech Database

This research aims to construct a high-quality Japanese text-to-speech (TTS) system that is highly flexible in its treatment of prosody. Many TTS systems implement prosody control, but they are fundamentally designed to output speech with a standard pitch and speech rate. In this study, we employ a unit selection-concatenation method and also introduce an analysis-synthesi...

Perceptual Evaluation of Quality Deterioration Owing to Prosody Modification

Our research goal is to construct a Japanese text-to-speech (TTS) system that can output various kinds of prosody. Because such synthetic speech is useful in practice, many TTS systems implement global prosodic control, but they are fundamentally designed to output speech with a standard pitch and speech rate. We discuss a synthesis method for high-quality speech with extrem...

Designing Japanese Speech Database Cov for Hybrid Speech Sy

To build a text-to-speech (TTS) system that can generate high-quality speech over a wide prosodic range, we constructed a speech database. As the speech synthesizer, we use a hybrid system that consists of a unit selection module and prosody modification by STRAIGHT (a vocoder-type high-quality analysis-synthesis method). Our aim is to reduce the amount of prosody mod...

Designing speech database with prosodic variety for expressive TTS system

To build a speech synthesis system that can generate high-quality speech over a wide prosodic range and realize fine prosody control, we propose a new method for constructing a speech database. As the synthesis method, we select a hybrid system that consists of two parts: speech unit selection and prosody modification by STRAIGHT (vocoder type high quality analysis-synthesis...

Prosodic control in Chinese TTS system

In this paper, the prosodic control strategy is discussed in the context of overall Chinese TTS system design. A four-level (syllable, prosodic word, prosodic phrase and sentence) pitch modification model and a multiplicative duration model are proposed. Although the prototype of the models was formed in 1994, subsequent results of related research based on large speech databases are also represente...

Publication date: 2000