Automatic segmentation for Czech concatenative speech synthesis using statistical approach with boundary-specific correction

Authors

  • Jindrich Matousek
  • Daniel Tihelka
  • Josef Psutka
Abstract

This paper addresses automatic segmentation for Czech concatenative speech synthesis. The baseline system applies a statistical approach to speech segmentation based on hidden Markov models (HMMs). Several improvements to this system are then proposed to obtain more accurate segmentation. These enhancements mainly concern HMM initialization strategies (flat-start initialization, and bootstrapping with hand-labeled or speaker-independent HMMs). Because HTK, the Hidden Markov Model Toolkit, was used in this work, a correction of the output boundary placements is proposed to reflect the speech parameterization mechanism. The automatic methods are compared objectively with manual segmentation to identify the best one. The best results were obtained with boundary-specific statistical correction of the segmentation produced by bootstrapping with hand-labeled HMMs (96% segmentation accuracy within a 20 ms tolerance region).
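The two quantities named in the abstract, a boundary-specific statistical correction and segmentation accuracy within a tolerance region, can be illustrated with a small Python sketch. The abstract does not give the exact procedure, so this is one plausible reading and not the authors' implementation: the correction is assumed to be the mean signed error per boundary type, estimated on hand-labeled data and subtracted from the automatic boundaries. The function names, the phone-class labels, and the toy timings (in milliseconds) are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def estimate_bias(train):
    """Mean signed error (auto - manual) per boundary type.

    train: iterable of (left_class, right_class, auto_ms, manual_ms).
    Assumption: a boundary type is the pair of neighbouring phone classes.
    """
    errors = defaultdict(list)
    for left, right, auto_ms, manual_ms in train:
        errors[(left, right)].append(auto_ms - manual_ms)
    return {btype: mean(errs) for btype, errs in errors.items()}


def correct(boundaries, bias):
    """Shift each automatic boundary by the bias of its type (0 if unseen)."""
    return [t - bias.get((left, right), 0) for left, right, t in boundaries]


def accuracy(auto_ms, manual_ms, tolerance_ms=20):
    """Fraction of boundaries lying within the tolerance of the manual ones."""
    hits = sum(abs(a - m) <= tolerance_ms for a, m in zip(auto_ms, manual_ms))
    return hits / len(manual_ms)


# Toy, made-up data: automatic vs. hand-labeled boundary times in ms.
train = [
    ("vowel", "plosive", 130, 100),
    ("vowel", "plosive", 250, 220),
    ("nasal", "vowel", 305, 300),
]
test_auto = [("vowel", "plosive", 530), ("nasal", "vowel", 710)]
test_manual = [500, 700]

bias = estimate_bias(train)
raw_times = [t for _, _, t in test_auto]
corrected_times = correct(test_auto, bias)

print(accuracy(raw_times, test_manual))        # accuracy before correction
print(accuracy(corrected_times, test_manual))  # accuracy after correction
```

In this toy case the vowel-plosive boundaries are placed consistently late, so removing the per-type mean error brings the test boundary inside the 20 ms region. A real system would estimate the statistics on a much larger hand-labeled set and might use a finer boundary classification.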


Similar resources

Experiments with Automatic Segmentation for Czech Speech Synthesis

This paper deals with automatic segmentation for Czech concatenative speech synthesis. A statistical approach to speech segmentation using hidden Markov models (HMMs) is applied in the baseline system [1]. Several experiments concerning various issues in building the segmentation system, such as speech parameterization or HMM initialization, are described here. An ob...

Automatic speech segmentation with multiple statistical models

In this paper, we propose a novel approach to improve the performance of automatic speech segmentation techniques for concatenative text-to-speech synthesis. A number of automatic segmentation machines (ASMs) are simultaneously applied and the final boundary time marks are drawn from the multiple segmentation results. To identify the best time mark among those provided by the multiple ASMs, we ...

Automatic Segmentation Combining and Spectral Boundary

Currently, AT&T Labs’ Natural Voices multilingual TTS system produces high-quality synthetic speech with a large-scale speech corpus [1]. In the development of such systems, automatic segmentation constitutes a major component technology. The prevalent approach to automatic segmentation in speech synthesis is based on hidden Markov models (HMMs). Even though an HMM-based approach is the most automat...

Towards phone segmentation for concatenative speech synthesis

We present a new approach to the problem of phone segmentation when preparing databases for concatenative text-to-speech synthesis. First, we describe the problem and review the state of the art. Then we review some existing segmentation techniques and present our approach, which uses a regression tree to perform boundary-specific correction of the HMM segmentation. ...

Refined speech segmentation for concatenative speech synthesis

High-accuracy phonetic segmentation is critical for achieving good quality in concatenative text-to-speech synthesis. Due to the shortcomings of current automated techniques based on HMM alignment or Dynamic Time Warping (DTW), manual verification and labeling are often required. In this paper we present a novel technique for automatic placement of phoneme boundaries in a speech waveform ...

Publication date: 2003