The Paradigm for Creating Multi-lingual Text-To-Speech Voice Databases
Authors
Abstract
The voice database is one of the most important parts of a TTS system. However, creating a new high-quality TTS voice is not an easy task, even for a professional team. The whole process is rather complicated and involves many details that must be handled carefully. In many stages, human intervention such as manual checking or labeling is necessary. In multi-lingual situations, it is even more challenging to find qualified people to do this work. That is why most state-of-the-art TTS systems can provide only a few voices. In this paper, we outline a uniform paradigm for creating multi-lingual TTS voice databases. It focuses on technologies that can either improve the scalability of data collection or reduce the need for human intervention such as manual checking or labeling. With this paradigm, we decrease the complexity and workload of the task.
Similar resources
Proceedings of Meetings on Acoustics
India possesses a large variety of languages and dialects spoken in different parts of the country. These languages possess some unique linguistic, phonological and phonetic properties different from European languages. Research is being done in several Indian languages such as Hindi, Bangla, etc. to study their articulatory, acoustic, phonetic and prosodic nature for the purpose of creating s...
Development of multi-voice and multi-language Text-to-Speech (TTS) and Speech-to-Text (STT) conversion system (languages: Belorussian, Polish, Russian)
This proposal was submitted to the INTAS Thematic Call on Information Technology 2004. Participants from 4 countries (Belarus, Poland, Russia and Germany) contribute to the activity. The overall coordination, monitoring and control of the project will be implemented by the Project Coordinator: Prof. Dr. Ruediger Hoffmann (Team 1, IAS TUD, Germany) and the Decision Board: Prof. Dr. Ruediger Hoffmann (T...
Intra-Lingual and Cross-Lingual Prosody Modelling
Statistical Parametric Speech Synthesis (SPSS) offers flexibility and computational advantage compared to other methods for Text-to-Speech Synthesis. While the speech output is intelligible, statistically trained voices are less natural due to the amount of signal processing and statistical averaging that goes into building the models. Much of the blame for the lack of naturalness falls on the ...
Multi-lingual and Multi-modal Speech Processing and Applications
Over the last decade, voice technologies for telephony and embedded solutions became much more mature, resulting in applications providing mobile access to digital information from anywhere. Both a growing demand for voice-driven applications in many languages and the need for improved usability and user experience now drive the exploration of multi-lingual speech processing techniques for reco...
Stages and Methods of Preparing Syllable and Diphone Speech Databases for a Persian Text-to-Speech System
Speech databases are part of concatenative text-to-speech synthesis systems. The phonetic quality of the databases plays a significant role in the naturalness of the synthesized speech. This paper introduces two speech databases for Persian, one syllable-based and one diphone-based, and examines how they were developed, their specifications, and their advantages relative to each other. ...
Journal title:
Volume, issue:
Pages: -
Publication date: 2006