Learning the parameters of quantitative prosody models
Abstract
The article introduces a novel hybrid data-driven and rule-based approach to prosody control in a TTS system, combining the advantages of well-balanced quantitative models with the flexible training of derived model parameters. Taking the training of Fujisaki intonation parameters for German (MFGI) as an example, the article describes the hybrid data-driven and rule-based architecture HYDRA, the speech database, the extraction of the model parameters, and the neural network (NN) training of these parameters. Preliminary results using the hybrid intonation model are presented. A hybrid neural network and rule-based quantitative model can be easily parameterized and adapted, e.g. for multilingual applications, but it has a higher complexity and requires the automatic extraction of the model parameters from a speech database.
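For readers unfamiliar with the Fujisaki parameters mentioned in the abstract, the following is a minimal sketch of the standard Fujisaki formulation, in which the logarithm of F0 is a base value plus phrase components (responses to impulse commands) and accent components (responses to step commands). It is not the article's MFGI or HYDRA implementation: the function names, the constants alpha, beta and gamma, and all command values are illustrative assumptions.

```python
import math


def phrase_response(t, alpha=2.0):
    """Phrase control response Gp(t) = alpha^2 * t * exp(-alpha * t) for t >= 0."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=20.0, gamma=0.9):
    """Accent control response Ga(t) = min(1 - (1 + beta*t) * exp(-beta*t), gamma) for t >= 0."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_f0(t, fb, phrases, accents, alpha=2.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at time t (seconds).

    fb      -- base frequency in Hz
    phrases -- list of (onset_time, amplitude) phrase commands
    accents -- list of (onset_time, offset_time, amplitude) accent commands
    """
    log_f0 = math.log(fb)
    for onset, ap in phrases:
        log_f0 += ap * phrase_response(t - onset, alpha)
    for t1, t2, aa in accents:
        log_f0 += aa * (
            accent_response(t - t1, beta, gamma)
            - accent_response(t - t2, beta, gamma)
        )
    return math.exp(log_f0)


# Hypothetical commands, chosen only for illustration (not taken from the article).
phrases = [(0.0, 0.5)]
accents = [(0.3, 0.6, 0.4)]
# F0 contour sampled every 10 ms over 1.5 s.
contour = [fujisaki_f0(i * 0.01, 80.0, phrases, accents) for i in range(150)]
```

In a hybrid setup of the kind the abstract describes, the command times and amplitudes would be the quantities extracted from a speech database and then predicted by a trained network, while the response functions above stay fixed.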
Related resources
Evaluating prosody of Mandarin speech for language learning
This paper proposes an approach to automatically evaluate the prosody of Chinese Mandarin speech for language learning. In this approach, we grade the appropriateness of prosody of speech units according to a model speech corpus from a teacher’s voice. To this end, we build two models, which are the prosody model and the scoring model. The prosody model that is built from the teacher’s speech p...
A Simplified Method of Learning Underlying Articulatory Pitch Target
Previous research has shown that parameters of the quantitative Target Approximation model (qTA) proposed by Prom-on and Xu can be directly extracted from natural speech with high accuracy through analysis-by-synthesis implemented in PENTAtrainers. While this may raise the possibility that PENTAtrainers actually simulate natural acquisition of prosody production, it is questionable that the hum...
Artificial Neural Network Based Prosody Models for Finnish Text-to-Speech Synthesis
This thesis presents a series of experiments conducted on Finnish prosody for text-to-speech synthesis using artificial neural networks. The study serves the purpose of mapping and extracting out the relevant factors that have an effect on prosody in general – be they phonetic or linguistic in nature. The interplay between the relevant factors and the behavior of the prosodic parameters range f...
Goethe for prosody
In this paper, we describe the way in which a recording of Goethe’s “Die Leiden des jungen Werther” published on a multimedia CDROM [7] was made accessible for prosody research. The recording is interesting for prosody research because of its prosodic richness as it displays a large variety of registers and speaking styles. Application areas are: development of prosody models for German TTS, un...
Model of Organization Learning in Islamic Azad University
This study aims to present a model of the learning organization in Islamic Azad University. It is applied in terms of purpose and quantitative in terms of implementation. In the first step of the research, after analyzing the information using inductive content analysis, 15 components were identified and categorized into 5 dimensions of learning levels, systematic thinking, shared vis...
Publication date: 2000