Experimental Evaluation of Tree-Based Algorithms for Intonational Breaks Representation
Abstract
The prosodic specification of an utterance to be spoken by a Text-to-Speech synthesis system can be described in terms of break indices, pitch accents and boundary tones. In particular, the identification of break indices determines the intonational phrase breaks, which affect all subsequent prosody-related procedures. In the present paper we use tree-structured predictors to address the task of break placement from shallow textual features: specifically CART, which is commonly used in similar tasks, and C4.5, which we introduce here. We used two 500-utterance prosodic corpora provided by two Greek universities to compare the two machine learning approaches and to assess their robustness for Greek break modeling. The evaluation of the resulting models showed that both approaches compare favorably with similar work published for other languages, while the accuracy of C4.5 was 1% to 2.7% higher than that of CART.
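As a rough illustration of how the two tree learners named in the abstract differ, the sketch below scores candidate splits with the Gini impurity decrease (the criterion usually associated with CART) and with the gain ratio (the criterion used by C4.5). The toy word junctures, the part-of-speech tags and all function names are hypothetical and are not taken from the paper's corpora or feature set. CART normally builds binary splits, so the multiway partition used here for both criteria is a simplification.

```python
from collections import Counter
from math import log2

# Hypothetical toy data, NOT from the paper's corpora:
# (POS of the word before the juncture, POS of the word after it, label)
JUNCTURES = [
    ("NOUN", "PUNCT", "break"),
    ("NOUN", "VERB", "no-break"),
    ("VERB", "DET", "no-break"),
    ("NOUN", "CONJ", "break"),
    ("DET", "NOUN", "no-break"),
    ("VERB", "PUNCT", "break"),
    ("ADJ", "NOUN", "no-break"),
    ("NOUN", "PUNCT", "break"),
]


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def partition(rows, feature_index):
    """Group the class labels by the value of one feature."""
    groups = {}
    for row in rows:
        groups.setdefault(row[feature_index], []).append(row[-1])
    return groups


def gini_decrease(rows, feature_index):
    """Impurity decrease for a multiway split (simplified CART-style score)."""
    labels = [row[-1] for row in rows]
    groups = partition(rows, feature_index)
    weighted = sum(len(g) / len(rows) * gini(g) for g in groups.values())
    return gini(labels) - weighted


def gain_ratio(rows, feature_index):
    """Information gain divided by split information (C4.5-style score)."""
    labels = [row[-1] for row in rows]
    groups = partition(rows, feature_index)
    n = len(rows)
    weighted = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - weighted
    split_info = -sum(len(g) / n * log2(len(g) / n) for g in groups.values())
    return info_gain / split_info if split_info > 0 else 0.0


if __name__ == "__main__":
    for index, name in enumerate(["POS before", "POS after"]):
        print(name, gini_decrease(JUNCTURES, index), gain_ratio(JUNCTURES, index))
```

On this toy set both criteria prefer the part of speech after the juncture, because it separates breaks from non-breaks perfectly. The gain ratio additionally penalizes features with many distinct values, which is one way the two learners can end up with different trees on real data.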
Similar resources
Bayesian induction of intonational phrase breaks
For the present paper, a Bayesian probabilistic framework for the task of automatic acquisition of intonational phrase breaks was established. By considering two different conditional independence assumptions, the naïve Bayes and Bayesian networks approaches were regarded and evaluated against the CART algorithm, which has been previously used with success. A finite length window of minimal mor...
Intonational phrase break prediction using decision tree and n-gram model
In the current study, we propose and evaluate a new method for automatic intonational phrase break prediction based on sequences of parts-of-speech and word junctures. The proposed method uses decision trees to estimate the probability of a word juncture type (break or non-break) given a finite length window of part-of-speech values, and uses an n-gram to model the word juncture sequence. Train...
Learning prosodic features using a tree representation
We describe experiments designed to learn associations between two types of intonational features, pitch accent and phrasing, from a tree-based corpus annotated with various intonational and syntactic features, for a concept-to-speech system. We show that using novel tree-based features improves the quality of boundary prediction over using only the linear order-based features normally used in t...
A New Algorithm for Optimization of Fuzzy Decision Tree in Data Mining
Decision-tree algorithms provide one of the most popular methodologies for symbolic knowledge acquisition. The resulting knowledge, a symbolic decision tree along with a simple inference mechanism, has been praised for comprehensibility. The most comprehensible decision trees have been designed for perfect symbolic data. Classical crisp decision trees (DT) are widely applied to classification t...
A prosodic phrasing model for a Korean text-to-speech synthesis system
This paper presents a prosodic phrasing model for Korean to be used in a text-to-speech synthesis (TTS) system. Read text corpora were morpho-syntactically parsed and prosodically labeled following the Penn Korean Treebank [Han et al., 2002] and K-ToBI prosodic labeling conventions [Sun-Ah, 2000] respectively. Decision trees were trained with morpho-syntactic and textual distance features to pre...
Publication date: 2005