Learning the parameters of quantitative prosody models

Authors

  • Oliver Jokisch
  • Hansjörg Mixdorff
  • Hans Kruschke
  • Ulrich Kordon
Abstract

The article introduces a novel hybrid data-driven and rule-based approach to prosody control in a TTS system, which combines the advantages of well-balanced quantitative models with the flexible training of derived model parameters. Taking the training of Fujisaki intonation parameters for German (MFGI) as an example, the article describes the hybrid data-driven and rule-based architecture HYDRA, the speech database, the extraction of the model parameters, and the neural network (NN) training of these parameters. Preliminary results using the hybrid intonation model are presented. A hybrid neural-network and rule-based quantitative model can be easily parameterized and adapted, e.g. for multilingual applications, but it has a higher complexity and requires the automatic extraction of the model parameters from a speech database.
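For readers unfamiliar with the Fujisaki model named in the abstract, the following is a minimal sketch of its commonly cited formulation: the logarithm of F0 is a base value plus the summed responses to phrase commands and accent commands. The abstract itself gives no formulas, so the function names, the constants alpha, beta, and gamma, and all command values below are illustrative assumptions. The MFGI rules, the HYDRA architecture, and the neural network training are not reproduced here.

```python
import math


def phrase_response(t, alpha=2.0):
    """Impulse response of the phrase component: alpha^2 * t * exp(-alpha * t) for t >= 0."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=20.0, gamma=0.9):
    """Step response of the accent component, capped at the ceiling gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_f0(t, fb, phrases, accents, alpha=2.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at time t (seconds).

    fb      -- base frequency in Hz
    phrases -- list of (onset_time, magnitude) phrase commands
    accents -- list of (onset_time, offset_time, amplitude) accent commands
    """
    log_f0 = math.log(fb)
    for t0, ap in phrases:
        log_f0 += ap * phrase_response(t - t0, alpha)
    for t1, t2, aa in accents:
        log_f0 += aa * (accent_response(t - t1, beta, gamma)
                        - accent_response(t - t2, beta, gamma))
    return math.exp(log_f0)


# Hypothetical command values, chosen only to produce a plausible contour.
phrases = [(0.0, 0.5)]
accents = [(0.3, 0.6, 0.4)]
contour = [fujisaki_f0(i * 0.01, 80.0, phrases, accents) for i in range(150)]
```

In a trainable setup of the kind the abstract describes, the command timings and amplitudes passed to `fujisaki_f0` would be the quantities extracted from the speech database and then predicted by the trained network, rather than fixed by hand as in this sketch.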

Similar resources

Evaluating prosody of Mandarin speech for language learning

This paper proposes an approach to automatically evaluate the prosody of Chinese Mandarin speech for language learning. In this approach, we grade the appropriateness of prosody of speech units according to a model speech corpus from a teacher’s voice. To this end, we build two models, which are the prosody model and the scoring model. The prosody model that is built from the teacher’s speech p...

A Simplified Method of Learning Underlying Articulatory Pitch Target

Previous research has shown that parameters of the quantitative Target Approximation model (qTA) proposed by Prom-on and Xu can be directly extracted from natural speech with high accuracy through analysis-by-synthesis implemented in PENTAtrainers. While this may raise the possibility that PENTAtrainers actually simulate natural acquisition of prosody production, it is questionable that the hum...

Artificial Neural Network Based Prosody Models for Finnish Text-to-Speech Synthesis

This thesis presents a series of experiments conducted on Finnish prosody for text-to-speech synthesis using artificial neural networks. The study serves the purpose of mapping and extracting out the relevant factors that have an effect on prosody in general – be they phonetic or linguistic in nature. The interplay between the relevant factors and the behavior of the prosodic parameters range f...

Goethe for prosody

In this paper, we describe the way in which a recording of Goethe’s “Die Leiden des jungen Werther” published on a multimedia CDROM [7] was made accessible for prosody research. The recording is interesting for prosody research because of its prosodic richness as it displays a large variety of registers and speaking styles. Application areas are: development of prosody models for German TTS, un...

Model of Organization Learning in Islamic Azad University

This study aims to present a model of learning organization in Islamic Azad University. It is practical in terms of purpose and quantitative in terms of implementation. At the first step of the research, after analyzing the information, using inductive content analysis, 15 components were identified and were categorized into 5 dimensions of learning levels, systematic thinking, shared vis...

Publication date: 2000