Automatic Segmentation and Labeling of Speech Corpus Based on HMM with Adaptation
Abstract
In this article we propose applying acoustic model adaptation to the automatic segmentation and labeling of a speech corpus. Because segmentation based only on a speaker-independent model is not precise enough, the speaker-independent model should be transformed into a speaker-dependent one. Training a speaker-dependent model from scratch requires a large amount of training data and a lot of time, whereas adaptation can adjust the model parameters to match the current speaker quickly, using only a small amount of training data, and yields comparatively precise segmentation results. To make the segmentation more precise still, we also combine it with boundary adjustment based on acoustic and phonetic features and adopt an iterative procedure.
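The align, adapt, realign loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say which adaptation method is used, so MAP adaptation of Gaussian means is assumed here, each phone is reduced to a single one-dimensional Gaussian state, and the boundary adjustment from acoustic and phonetic features is left out. The names `forced_align`, `map_adapt`, `segment_with_adaptation` and the parameter `tau` are hypothetical.

```python
import math

def forced_align(frames, phones, means, var=1.0):
    """Viterbi forced alignment of frames to a fixed left-to-right phone
    sequence (one Gaussian state per phone; needs len(frames) >= len(phones)).
    Returns (start frame of each phone, phone index for each frame)."""
    T, N = len(frames), len(phones)
    NEG = float("-inf")

    def loglik(t, j):
        d = frames[t] - means[phones[j]]
        return -0.5 * (d * d / var + math.log(2 * math.pi * var))

    score = [[NEG] * N for _ in range(T)]
    back = [[0] * N for _ in range(T)]
    score[0][0] = loglik(0, 0)          # alignment must start in the first phone
    for t in range(1, T):
        for j in range(N):
            stay = score[t - 1][j]
            move = score[t - 1][j - 1] if j > 0 else NEG
            best, back[t][j] = (move, j - 1) if move > stay else (stay, j)
            if best > NEG:
                score[t][j] = best + loglik(t, j)
    path = [N - 1]                      # alignment must end in the last phone
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return [path.index(j) for j in range(N)], path

def map_adapt(frames, path, phones, si_means, tau=2.0):
    """Assumed MAP update of each mean: (tau * prior + sum of frames) / (tau + n)."""
    sums, counts = {}, {}
    for x, j in zip(frames, path):
        p = phones[j]
        sums[p] = sums.get(p, 0.0) + x
        counts[p] = counts.get(p, 0) + 1
    return {p: (tau * m + sums.get(p, 0.0)) / (tau + counts.get(p, 0))
            for p, m in si_means.items()}

def segment_with_adaptation(frames, phones, si_means, iterations=3, tau=2.0):
    """Iterate: align with current means, adapt means from the alignment."""
    means = dict(si_means)
    for _ in range(iterations):
        _, path = forced_align(frames, phones, means)
        means = map_adapt(frames, path, phones, si_means, tau)
    starts, _ = forced_align(frames, phones, means)
    return starts, means

# Toy data (made up): a speaker whose features sit above the
# speaker-independent (SI) means for both phones.
frames = [0.6, 0.8, 1.2, 1.3, 3.0, 3.1, 2.9, 3.0]
phones = ["a", "b"]
si_means = {"a": 0.0, "b": 2.0}

si_starts, _ = forced_align(frames, phones, si_means)
adapted_starts, adapted_means = segment_with_adaptation(frames, phones, si_means)
print("SI boundaries:", si_starts)
print("Adapted boundaries:", adapted_starts)
```

On this toy data the speaker-independent pass places the phone boundary at frame 2, and after adaptation it moves to frame 4, where the feature values actually jump. A real system would use multi-state HMMs over spectral feature vectors and a full adaptation scheme rather than this single-Gaussian simplification.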
Similar resources
Refined speech segmentation for concatenative speech synthesis
High accuracy phonetic segmentation is critical for achieving good quality in concatenative text to speech synthesis. Due to the shortcomings of current automated techniques based on HMM-based alignment or Dynamic Time Warping (DTW), manual verification and labeling are often required. In this paper we present a novel technique for automatic placement of phoneme boundaries in a speech waveform ...
Towards A Phoneme Labeled Mandarin Chinese Speech Corpus
Phoneme-level transcription of speech corpora is crucial to fundamental speech research and to detection-based automatic speech recognition, which is attracting increasing interest. Currently, there is no existing phoneme-labeled Mandarin Chinese speech corpus. This paper presents our recent work towards development of such a corpus. Our goal is to label five hours of speech data selected from a Mandarin Chine...
Generation of Unit Databases for the Upc Text to Speech System
This paper describes a method for the generation of unit databases for concatenative text-to-speech systems. The method comprises the automatic segmentation and pitch synchronous labeling of the units and a selection procedure to extract the best instance per unit from a generic speech corpus. The segmentation is performed by an automatic HMM alignment. The introduction of the demiphone improve...
Some Aspects of ASR Transcription Based Unsupervised Speaker Adaptation for HMM Speech Synthesis
Statistical parametric synthesis offers numerous techniques to create new voices. Speaker adaptation is one of the most exciting ones. However, it still requires high quality audio data with low signal-to-noise ratio and precise labeling. This paper presents an automatic speech recognition based unsupervised adaptation method for Hidden Markov Model (HMM) speech synthesis and its quality evalu...
Publication date: 2000