Synthesis by generation and concatenation of multiform segments

Authors

Vincent Pollet

Andrew P. Breen

Abstract

Machine-generated speech can be produced in different ways; however, two basic methods for synthesizing speech are in widespread use. One method generates speech from models, while the other concatenates pre-stored speech segments. This paper presents a speech synthesis technique in which these two basic methods are combined in a statistical framework. Synthetic speech is constructed by generation and concatenation of so-called “multiform segments”. Multiform segments are different speech signal representations: synthesis models, templates, and synthesis models augmented with template information. An evaluation of the multiform segment synthesis technique shows improvements over traditional concatenative methods of synthesis.

Similar resources

Utilization of an HMM-based feature generation module in 5 ms segment concatenative speech synthesis

– Spectrum at each segment boundary for calculation of concatenation cost
(2) Synthesis stage
– Text-to-Feature
  • Generate features from input text (linguistic/prosodic information)
– Feature-to-Speech
  • Find the N-best candidates in each frame (preselection) according to the segment's target cost
  • Find the best path from the N-best candidates based on concatenation cost
  • Concatenate the segments...
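The Feature-to-Speech steps in this excerpt (preselection by target cost, then a best-path search by concatenation cost) can be sketched as a small dynamic-programming routine. This is only an illustration under stated assumptions, not the cited paper's implementation: the names `select_units`, `target_cost`, `concat_cost` and `n_best` are hypothetical, both cost functions are supplied by the caller, and every frame is assumed to have at least one candidate. Following a literal reading of the excerpt, the path score sums concatenation costs only, and the target cost is used solely for pruning.

```python
from typing import Callable, List, Sequence, TypeVar

U = TypeVar("U")  # a candidate segment (any caller-defined type)


def select_units(
    candidates_per_frame: Sequence[Sequence[U]],
    target_cost: Callable[[int, U], float],   # hypothetical: cost of unit u at frame t
    concat_cost: Callable[[U, U], float],     # hypothetical: cost of joining prev -> next
    n_best: int = 3,
) -> List[U]:
    """Pick one candidate per frame: N-best preselection, then Viterbi search."""
    if not candidates_per_frame:
        return []

    # Preselection: keep the n_best candidates with the lowest target cost per frame.
    pruned = [
        sorted(cands, key=lambda u, t=t: target_cost(t, u))[:n_best]
        for t, cands in enumerate(candidates_per_frame)
    ]

    # Viterbi search: best[i] is the lowest accumulated concatenation cost
    # of any path ending in candidate i of the current frame.
    best = [0.0] * len(pruned[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(pruned)):
        new_best, ptrs = [], []
        for u in pruned[t]:
            costs = [best[i] + concat_cost(p, u) for i, p in enumerate(pruned[t - 1])]
            i_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[i_min])
            ptrs.append(i_min)
        best = new_best
        backpointers.append(ptrs)

    # Backtrack from the cheapest final candidate to recover the path.
    i = min(range(len(best)), key=best.__getitem__)
    path = [i]
    for ptrs in reversed(backpointers):
        i = ptrs[i]
        path.append(i)
    path.reverse()
    return [pruned[t][i] for t, i in enumerate(path)]
```

With scalar stand-ins for segments, `target_cost` could be the distance to a per-frame target value and `concat_cost` the absolute difference between neighbouring units. A real system would compare spectral features at the segment boundaries instead.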

Synthesis Units for Conversational Speech - Using Phrasal Segments -

This paper describes the use of phrase-sized segments for the concatenative synthesis of conversational speech and discusses the differences in selection criteria that become necessary when the source corpus contains several years of conversational speech samples. It claims that natural-sounding conversational speech can be reproduced by use of such phrase-sized chunks for concatenation, and tha...

Psychoacoustic Segment Scoring for Multi-Form Speech Synthesis

In multi-form segment synthesis, output speech is constructed by splicing waveform segments with statistically modeled and regenerated parametric speech segments. The fraction of model-derived segments is called the model-template ratio. The motivation of this work is to further increase the flexibility of multi-form synthesis while maintaining high speech quality for high model-template ratios. An approach ...

Spectral smoothing for concatenative speech synthesis

This paper addresses the topic of performing effective concatenative speech synthesis with a limited database by proposing methods to smooth the transitions between speech segments. The objective is to produce natural-sounding speech via segment concatenation when formants and other spectral features do not align properly. We propose several methods for adjusting the spectra between waveform segm...

Simple designing methods of corpus-based visual speech synthesis

This paper describes simple designing methods of corpus-based visual speech synthesis. Our approach needs only a synchronous real image and speech database. Visual speech is synthesized by concatenating real image segments and speech segments selected from the database. In order to automatically perform all processes, e.g. feature extraction, segment selection and segment concatenation, we simp...

Publication date: 2008
