A Soft Decision Tree
Author
Abstract
Searching for a binary partition of attribute domains is an important task in data mining, particularly in decision tree methods. The most important advantages of decision tree methods are the compactness and clarity of the presented knowledge and the high accuracy of classification. In the case of large data tables, however, existing decision tree induction methods often prove inefficient in both computation and description. A further disadvantage of standard decision tree methods is their instability: small deviations in the data can cause a complete change of the decision tree. We present novel "soft discretization" methods that use "soft cuts" instead of traditional "crisp" (sharp) cuts. This new concept allows us to generate more compact and stable decision trees with high classification accuracy. We also present an efficient method for generating soft cuts from large databases.
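To illustrate the idea of a soft cut, the sketch below contrasts it with a crisp cut. It is a minimal illustration, not the paper's algorithm: the function name `soft_cut_membership`, the interval half-width `eps`, and the linear ramp inside the interval are all assumptions made for the example; a crisp cut corresponds to `eps = 0`.

```python
def soft_cut_membership(value, c, eps):
    """Probability of routing an object to the right branch under a
    soft cut (c, eps): crisp outside the interval [c - eps, c + eps],
    a linear ramp from 0 to 1 inside it (an assumed form)."""
    if value <= c - eps:
        return 0.0          # clearly left of the cut
    if value >= c + eps:
        return 1.0          # clearly right of the cut
    # inside the uncertainty interval: interpolate linearly
    return (value - (c - eps)) / (2 * eps)
```

Objects far from the cut are classified exactly as with a crisp cut, so a small perturbation of the data only affects the few objects inside the interval, which is what makes the resulting trees more stable.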
Similar papers
Lower Bounds on Quantum Query Complexity for Read-Once Decision Trees with Parity Nodes
We introduce a complexity measure for decision trees called the soft rank, which measures how well-balanced a given tree is. The soft rank is a somewhat relaxed variant of the rank. Among all decision trees of depth d, the complete binary decision tree (the most balanced tree) has maximum soft rank d, the decision list (the most unbalanced tree) has minimum soft rank √d, and any other trees have...
Bagging Soft Decision Trees
The decision tree is one of the earliest predictive models in machine learning. In the soft decision tree, based on the hierarchical mixture of experts model, internal binary nodes take soft decisions and choose both children with probabilities given by a sigmoid gating function. Hence for an input, all the paths to all the leaves are traversed and all those leaves contribute to the final decis...
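The prediction rule described above, where every leaf contributes with the probability of its path, can be sketched as follows. This is an illustrative implementation under assumptions: the `Node` class, the linear gating function `w·x + b` inside the sigmoid, and the recursive `predict` are not taken from the cited paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class Node:
    def __init__(self, w=None, b=0.0, left=None, right=None, value=None):
        self.w, self.b = w, b            # gating parameters (internal nodes)
        self.left, self.right = left, right
        self.value = value               # prediction stored at a leaf

def predict(node, x):
    """Soft prediction: traverse ALL paths; each leaf's value is
    weighted by the product of gating probabilities along its path."""
    if node.value is not None:           # leaf
        return node.value
    # probability of taking the right child, given by the sigmoid gate
    p_right = sigmoid(sum(wi * xi for wi, xi in zip(node.w, x)) + node.b)
    return (1.0 - p_right) * predict(node.left, x) + p_right * predict(node.right, x)
```

With a neutral gate (`w = [0], b = 0`) both children receive weight 0.5, so the root's prediction is the average of its leaves, which shows how all leaves contribute to the final decision.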
Soft context clustering for F0 modeling in HMM-based speech synthesis
This paper proposes the use of a new binary decision tree, which we call a soft decision tree, to improve generalization performance compared to the conventional ‘hard’ decision tree method that is used to cluster context-dependent model parameters in statistical parametric speech synthesis. We apply the method to improve the modeling of fundamental frequency, which is an important factor in sy...
On Exploring Soft Discretization of Continuous Attributes
Searching for a binary partition of attribute domains is an important task in data mining. It is present in both decision tree construction and discretization. The most important advantages of decision tree methods are compactness and clarity of knowledge representation as well as high accuracy of classification. Decision tree algorithms also have some drawbacks. In cases of large data tables...
Bias-variance tradeoff of soft decision trees
This paper focuses on the study of the error composition of a fuzzy decision tree induction method recently proposed by the authors, called soft decision trees. This error may be expressed as a sum of three types of error: residual error, bias, and variance. The paper studies empirically the tradeoff between bias and variance in a soft decision tree method and compares it with the tradeoff of classi...
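The three-way error decomposition mentioned above can be written out explicitly. A standard form (a sketch assuming squared-error loss, with target function f, noise variance σ², and \hat f_D the tree learned from training sample D) is:

```latex
\mathbb{E}_{D,\varepsilon}\!\left[(y-\hat f_D(x))^2\right]
= \underbrace{\sigma^2}_{\text{residual error}}
+ \underbrace{\bigl(f(x)-\mathbb{E}_D[\hat f_D(x)]\bigr)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}_D\!\left[\bigl(\hat f_D(x)-\mathbb{E}_D[\hat f_D(x)]\bigr)^2\right]}_{\text{variance}}
```

The residual term is irreducible noise; the bias and variance terms are the two quantities whose tradeoff the paper studies empirically.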