Time and space-efficient architecture for a corpus-based text-to-speech synthesis system
نویسندگان
چکیده
This paper proposes a time and space efficient architecture for a text-to-speech synthesis system (TTS). The proposed architecture can be efficiently used in those applications with unlimited domain, requiring multilingual or polyglot functionality. The integration of a queuing mechanism, heterogeneous graphs and finite-state machines gives a powerful, reliable and easily maintainable architecture for the TTS system. Flexible and language-independent frameworks efficiently integrate all those algorithms used within the scope of the TTS system. Heterogeneous relation graphs are used for linguistic information representation and feature construction. Finite-state machines are used for time and space efficient representation of language resources, for time and space efficient lookup processes, and the separation of language-dependent resources from a language-independent TTS engine. Its queuing mechanism consists of several dequeue data structures and is responsible for the activation of all those TTS engine modules having to process the input text. In the proposed architecture, all modules use the same data structure for gathering linguistic information about input text. All input and output formats are compatible, the structure is modular and interchangeable, it is easily maintainable and object oriented. The proposed architecture was successfully used when implementing the Slovenian PLATTOS corpus-based TTS system, as presented in this paper.
منابع مشابه
Corpus based coreference resolution for Farsi text
"Coreference resolution" or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreference when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...
متن کاملپیکره اعلام: یک پیکره استاندارد واحدهای اسمی برای زبان فارسی
Named entity recognition (NER) is a natural language processing (NLP) problem that is mainly used for text summarization, data mining, data retrieval, question and answering, machine translation, and document classification systems. A NER system is tasked with determining the border of each named entity, recognizing its type and classifying it into predefined categories. The categories of named...
متن کاملMulti-domain text-to-speech synthesis by automatic text classification
This paper describes a multi-domain text-to-speech (MD-TTS) synthesis strategy for generating speech among different domains and so increasing the flexibility of high quality TTS systems. To that effect, the MD-TTS introduces a flexible TTS architecture that includes an automatic domain classification module, which allows MD-TTS systems to be implemented by different synthesis strategies and sp...
متن کاملSlovenian Text-to-Speech Synthesis for Speech User Interfaces
The paper presents the design concept of a unitselection text-to-speech synthesis system for the Slovenian language. Due to its modular and upgradable architecture, the system can be used in a variety of speech user interface applications, ranging from server carrier-grade voice portal applications, desktop user interfaces to specialized embedded devices. Since memory and processing power requi...
متن کاملOverview of SHRC - Ginkgo speech synthesis system for Blizzard Challenge 2013
This paper introduces the SHRC-Ginkgo speech synthesis system for Blizzard Challenge 2013. A unit selection based approach is adopted to develop our speech synthesis system using audiobook speech corpus. Aiming at roughly labeled corpora with several hundred hours of speech, our system adopts lightlysupervised acoustic model training of speech recognition to select clean speech data with accura...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Speech Communication
دوره 49 شماره
صفحات -
تاریخ انتشار 2007