Search results for: word clustering
Number of results: 205729
This study accounts for Korean /n/-epenthesis from a usage-based perspective, by describing the reduced productivity of epenthesis as an analogical change in progress. We found that epenthesis probability rises as whole-word frequency increases, supporting the hypothesis that analogical change begins in low-frequency words (Bybee 2002). We interpret the findings as support for the idea that freq...
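One simple way to probe the reported frequency effect is a logistic regression of epenthesis outcomes on log whole-word frequency; the sketch below uses invented token data and is not the paper's statistical model.

```python
# A minimal sketch (not from the paper) of testing whether epenthesis
# probability rises with whole-word frequency, via logistic regression.
# The toy data are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

# toy data: log whole-word frequency and whether /n/-epenthesis was observed
log_freq = np.array([[0.5], [1.2], [2.0], [2.8], [3.5], [4.1], [4.9], [5.6]])
epenthesis = np.array([0, 0, 0, 1, 0, 1, 1, 1])  # 1 = epenthesized token

model = LogisticRegression().fit(log_freq, epenthesis)
# A positive coefficient is consistent with the reported frequency effect.
print("coefficient on log frequency:", model.coef_[0][0])
print("P(epenthesis) at log freq 1 vs 5:",
      model.predict_proba([[1.0], [5.0]])[:, 1])
```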
In this paper, the use of clustering algorithms for decision-level data fusion is proposed. The results of automatic isolated word recognition, derived from speech spectrograph and Linear Predictive Coding (LPC) analysis, are combined using fuzzy clustering algorithms, especially fuzzy k-means and fuzzy vector quantization. Experimental results show that the ...
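Below is a minimal sketch of the fuzzy k-means piece of such a decision-level fusion: per-word score vectors from two hypothetical front ends are concatenated and soft-clustered, and the dominant membership gives the fused decision. The fusion layout, data, and parameters are assumptions, not the paper's setup.

```python
# Fuzzy k-means (fuzzy c-means) sketch for decision-level fusion of two
# recognizers (e.g., spectrograph- and LPC-based). Data are hypothetical.
import numpy as np

def fuzzy_kmeans(X, k, m=2.0, n_iter=100, seed=0):
    """Return (centers, memberships) for data X with fuzzifier m."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(k), size=len(X))      # soft memberships
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-9
        U = 1.0 / (d ** (2 / (m - 1)))              # standard FCM update
        U /= U.sum(axis=1, keepdims=True)
    return centers, U

# hypothetical per-word score vectors from two front ends, concatenated
rng = np.random.default_rng(1)
scores_spectro = rng.random((20, 5))   # e.g., 5 candidate-word scores
scores_lpc = rng.random((20, 5))
fused = np.hstack([scores_spectro, scores_lpc])

centers, memberships = fuzzy_kmeans(fused, k=5)
decisions = memberships.argmax(axis=1)   # hard decision after soft fusion
print(decisions)
```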
We applied different clustering algorithms to the task of clustering multi-word terms in order to reflect a human-built ontology. Clustering was done without the usual document co-occurrence information. Our clustering algorithm, CPCL (Classification by Preferential Clustered Link), is based on general lexico-syntactic relations which do not require prior domain knowledge or the existence of a...
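As a rough illustration of clustering multi-word terms from lexico-syntactic structure rather than document co-occurrence, the sketch below groups invented terms by the words they share, using agglomerative clustering; it is not the CPCL algorithm itself.

```python
# Cluster multi-word terms by shared heads/modifiers (word overlap) instead
# of document co-occurrence. Terms and threshold are made-up examples.
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

terms = ["blood cell", "red blood cell", "white blood cell",
         "cell membrane", "plasma membrane", "membrane protein"]

# crude lexico-syntactic features: which words a term contains
vocab = sorted({w for t in terms for w in t.split()})
vectors = [[1.0 if w in t.split() else 0.0 for w in vocab] for t in terms]

# cluster terms by word overlap (Jaccard distance), average linkage
Z = linkage(pdist(vectors, metric="jaccard"), method="average")
labels = fcluster(Z, t=0.7, criterion="distance")
for term, lab in zip(terms, labels):
    print(lab, term)
```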
Context-dependent units are broadly used in Continuous Speech Recognition (CSR) systems, with decision trees being a suitable clustering technique for obtaining this kind of unit. This work aimed to extend decision-tree-based clustering to model inter-word context dependencies in Spanish CSR tasks. We first used a set of previously defined context-dependent units to model word boundaries. A deci...
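The sketch below shows the core step of decision-tree-based clustering of context-dependent units: candidate yes/no phonetic questions are scored by the entropy reduction they give over the contexts' labels, and the best question defines a split. The questions, labels, and data are toy assumptions, not the paper's Spanish CSR configuration.

```python
# Greedy question selection for decision-tree clustering of contexts:
# pick the yes/no phonetic question with the largest entropy reduction.
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

# (left-context phone, unit label) pairs; labels stand for acoustic classes
data = [("b", "A"), ("p", "A"), ("d", "A"), ("a", "B"),
        ("e", "B"), ("o", "B"), ("m", "C"), ("n", "C")]

questions = {
    "left is a vowel?": {"a", "e", "i", "o", "u"},
    "left is a nasal?": {"m", "n"},
    "left is a stop?": {"b", "p", "d", "t", "g", "k"},
}

base = entropy([label for _, label in data])
for name, phone_set in questions.items():
    yes = [label for ctx, label in data if ctx in phone_set]
    no = [label for ctx, label in data if ctx not in phone_set]
    gain = base - (len(yes) * entropy(yes) + len(no) * entropy(no)) / len(data)
    print(f"{name:20s} gain = {gain:.3f}")
# The best question defines the first split; recursing on each side yields
# the tree whose leaves are the tied context-dependent units.
```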
In this paper we introduce the vector space representation of the N-gram language model, where vectors of K dimensions are given to both words and contexts, i.e., (N-1)-word sequences, so that the scalar product of a 'word vector' and a 'context vector' gives the corresponding N-gram probability. The vector space representation is obtained from singular value decomposition (SVD) of the co-occurr...
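The snippet below is a toy illustration of the factorization this abstract describes: take a (here bigram) conditional probability matrix, apply a truncated SVD, and read off K-dimensional context and word vectors whose dot products approximate the original probabilities. The counts are invented, and the exact matrix and normalization used in the paper are an assumption.

```python
# SVD-based vector-space view of an N-gram model (bigram toy example):
# the dot product of a context vector and a word vector approximates
# P(word | context).
import numpy as np

words = ["the", "cat", "dog", "sat"]
# bigram counts C[i, j] = count(words[i] followed by words[j]) -- made up
C = np.array([[0, 5, 4, 0],
              [1, 0, 0, 6],
              [2, 0, 0, 5],
              [3, 0, 0, 0]], dtype=float)
P = C / C.sum(axis=1, keepdims=True)      # rows: P(next word | context word)

K = 2                                      # reduced dimensionality
U, s, Vt = np.linalg.svd(P, full_matrices=False)
context_vecs = U[:, :K] * s[:K]            # K-dimensional context vectors
word_vecs = Vt[:K, :].T                    # K-dimensional word vectors

approx = context_vecs @ word_vecs.T        # dot products approximate P
print(np.round(approx, 2))
print(np.round(P, 2))
```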
High dimensionality of text can be a deterrent in applying complex learners such as Support Vector Machines to the task of text classification. Feature clustering is a powerful alternative to feature selection for reducing the dimensionality of text data. In this paper we propose a new information-theoretic divisive algorithm for feature/word clustering and apply it to text classification. Exist...
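A simplified, hedged version of divisive information-theoretic word clustering is sketched below: each word is represented by its class-conditional distribution, and words are regrouped k-means-style using KL divergence to prior-weighted cluster distributions. The exact objective and update order of the paper's algorithm may differ.

```python
# Divisive information-theoretic word clustering (simplified sketch):
# cluster words by the KL divergence between P(c|w) and cluster distributions.
import numpy as np

def kl(p, q, eps=1e-12):
    return np.sum(p * np.log((p + eps) / (q + eps)))

def divisive_word_clustering(P_c_given_w, priors, k, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    n_words = P_c_given_w.shape[0]
    assign = rng.integers(0, k, size=n_words)            # random initial split
    for _ in range(n_iter):
        centroids = []
        for j in range(k):
            idx = np.where(assign == j)[0]
            if len(idx) == 0:                             # keep clusters non-empty
                idx = rng.integers(0, n_words, size=1)
            w = priors[idx] / priors[idx].sum()
            centroids.append(w @ P_c_given_w[idx])        # prior-weighted mean
        assign = np.array([
            np.argmin([kl(P_c_given_w[i], c) for c in centroids])
            for i in range(n_words)
        ])
    return assign

# toy data: 6 words, 3 classes; rows are P(class | word)
P = np.array([[.8, .1, .1], [.7, .2, .1], [.1, .8, .1],
              [.2, .7, .1], [.1, .1, .8], [.2, .2, .6]])
priors = np.ones(6) / 6
print(divisive_word_clustering(P, priors, k=3))
```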
This paper presents an unsupervised method for choosing the correct translation of a word in context. It learns disambiguation information from non-parallel bilingual corpora (preferably in the same domain), free from tagging. Our method combines two existing unsupervised disambiguation algorithms: a word sense disambiguation algorithm based on distributional clustering and a translation disambigu...
Syntactically annotated data such as treebanks are used for training statistical parsers. One of the main aspects in developing statistical parsers is their sensitivity to the training data. Since data sparsity is the biggest challenge in data-oriented analyses, parsers perform poorly if they are trained with a small set of data, or when the genre of the training and the test data are ...
This paper systematically compares unsupervised word sense discrimination techniques that cluster instances of a target word that occur in raw text using both vector and similarity spaces. The context of each instance is represented as a vector in a high dimensional feature space. Discrimination is achieved by clustering these context vectors directly in vector space and also by finding pairwis...
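The following sketch contrasts, on toy data, the two routes this abstract mentions: clustering context vectors directly in vector space with k-means, and clustering in similarity space by agglomerating a pairwise cosine-distance matrix. Feature choice and cluster count are assumptions.

```python
# Unsupervised word sense discrimination: vector space vs. similarity space.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_distances
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# toy contexts of the ambiguous target word "bank"
contexts = [
    "deposit money at the bank branch",
    "the bank approved the loan application",
    "fishing on the grassy river bank",
    "the bank of the river flooded in spring",
]
X = CountVectorizer().fit_transform(contexts).toarray()

# route 1: cluster the context vectors directly in vector space
vs_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# route 2: cluster in similarity space via pairwise cosine distances
D = cosine_distances(X)
Z = linkage(squareform(D, checks=False), method="average")
ss_labels = fcluster(Z, t=2, criterion="maxclust")

print("vector space:    ", vs_labels)
print("similarity space:", ss_labels)
```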
We use a clustering signature, based on a recently introduced generalization of the clustering coefficient to directed networks, to analyze 16 directed real-world networks of five different types: social networks, genetic transcription networks, word adjacency networks, food webs, and electric circuits. We show that these five classes of networks are cleanly separated in the space of clustering...
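As one plausible reading of such a clustering signature (an assumption, since the abstract does not spell out the definition), the sketch below computes the four directed triangle patterns of Fagiolo's generalization of the clustering coefficient (cycle, middleman, in, out) and averages them into a per-network signature vector.

```python
# Directed clustering signature sketch: per-node cycle, middleman, in, and
# out clustering coefficients, averaged over the network.
import numpy as np

def directed_clustering_signature(A):
    """A: binary adjacency matrix, no self-loops. Returns 4 network averages."""
    A = np.asarray(A, dtype=float)
    d_in, d_out = A.sum(axis=0), A.sum(axis=1)
    d_bi = (A * A.T).sum(axis=1)                      # reciprocated links
    eps = 1e-12                                       # avoid division by zero
    cyc = np.diag(A @ A @ A) / (d_in * d_out - d_bi + eps)
    mid = np.diag(A @ A.T @ A) / (d_in * d_out - d_bi + eps)
    cin = np.diag(A.T @ A @ A) / (d_in * (d_in - 1) + eps)
    cout = np.diag(A @ A @ A.T) / (d_out * (d_out - 1) + eps)
    return np.array([cyc.mean(), mid.mean(), cin.mean(), cout.mean()])

# toy directed network: a 3-cycle plus one extra node feeding into it
A = np.array([[0, 1, 0, 1],
              [0, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 0]])
print(directed_clustering_signature(A))
```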