Search results for: word clustering

Number of results: 205729

2015
Lorenzo Albano Domenico Beneventano Sonia Bergamaschi

In [12] a novel approach to Web search result clustering based on Word Sense Induction, i.e. the automatic discovery of word senses from raw text, was presented; key to the proposed approach is the idea of, first, automatically inducing senses for the target query and, second, clustering the search results based on their semantic similarity to the induced word senses. In [1] we proposed an innov...

2012
Grzegorz Chrupała

We propose an unsupervised approach to POS tagging where first we associate each word type with a probability distribution over word classes using Latent Dirichlet Allocation. Then we create a hierarchical clustering of the word types: we use an agglomerative clustering algorithm where the distance between clusters is defined as the Jensen-Shannon divergence between the probability distributions...
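
The clustering step described in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract is truncated, so the way a merged cluster's distribution is formed here (a size-weighted mean of its members) is an assumption, as are the function names `jsd` and `agglomerate`. The LDA step that would produce the per-word class distributions is not shown; the input is assumed to be a plain mapping from word type to a probability vector.

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence between two discrete distributions (base 2, in [0, 1])."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def agglomerate(word_dists, n_clusters):
    """Greedy bottom-up clustering of word types.

    word_dists: dict mapping word -> probability distribution over word classes.
    Repeatedly merges the two clusters whose distributions have the smallest JSD.
    ASSUMPTION: a merged cluster is represented by the size-weighted mean
    of its members' distributions.
    """
    clusters = [([w], list(p)) for w, p in word_dists.items()]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = jsd(clusters[i][1], clusters[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        words_i, p_i = clusters[i]
        words_j, p_j = clusters[j]
        n_i, n_j = len(words_i), len(words_j)
        merged = [(n_i * a + n_j * b) / (n_i + n_j) for a, b in zip(p_i, p_j)]
        clusters[i] = (words_i + words_j, merged)
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(words) for words, _ in clusters]


if __name__ == "__main__":
    # Hypothetical toy distributions over two word classes.
    toy = {"dog": [0.9, 0.1], "cat": [0.8, 0.2], "run": [0.1, 0.9], "eat": [0.2, 0.8]}
    print(agglomerate(toy, 2))
```

Recording the sequence of merges instead of stopping at `n_clusters` would give the full hierarchy; the pairwise search is quadratic per merge, which is fine for a toy vocabulary but not for a realistic one.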

2017
João Sedoc Jean Gallier Dean P. Foster Lyle H. Ungar

Vector space representations of words capture many aspects of word similarity, but such methods tend to produce vector spaces in which antonyms (as well as synonyms) are close to each other. For spectral clustering using such word embeddings, words are points in a vector space where synonyms are linked with positive weights, while antonyms are linked with negative weights. We present a new sign...

2010
Minh Quang Nhat Pham Minh Le Nguyen Akira Shimazu

Our research addresses the task of updating legal documents when new information emerges. In this paper, we apply a hierarchical ranking model to the task of updating legal documents. Word clustering features are incorporated into the ranking models to exploit semantic relations between words. Experimental results on legal data built from the United States Code show that the hierarchical ranking...

A probabilistic topic model assumes that documents are generated through a process involving topics and then tries to reverse this process: given the documents, it extracts the topics. A topic is usually assumed to be a distribution over words. LDA is one of the first and most popular topic models introduced so far. In the document generation process assumed by LDA, each document is a distribution o...

Journal: IEEE Trans. Acoustics, Speech, and Signal Processing 1985
Jay G. Wilpon Lawrence R. Rabiner

Studies of isolated word recognition systems have shown that a set of carefully chosen templates can be used to bring the performance of speaker-independent systems up to that of systems trained to the individual speaker. The earliest work in this area used a sophisticated set of pattern recognition algorithms in a human-interactive mode to create the set of templates (multiple patterns) for ea...

2010
Yangqiu Song Shimei Pan Shixia Liu Furu Wei Michelle X. Zhou Weihong Qian

In this paper, we present a constrained co-clustering approach for clustering textual documents. Our approach combines the benefits of information-theoretic co-clustering and constrained clustering. We use a two-sided hidden Markov random field (HMRF) to model both the document and word constraints. We also develop an alternating expectation maximization (EM) algorithm to optimize the constrain...

Journal: Medical Journal of Islamic Republic of Iran
Mona Ebrahimipour, Department of Speech Therapy, School of Rehabilitation, Iran University of Medical Sciences, Tehran, Iran. Mohammad Reza Motamed, Department of Neurology, Iran University of Medical Sciences, Tehran, Iran. Hassan Ashayeri, Department of Basic Sciences in Rehabilitation, School of Rehabilitation, Iran University of Medical Sciences, Tehran, Iran. Yahya Modarresi, Department of Linguistics, Human Sciences and Cultural Education Institute, Tehran, Iran. Mohammad Kamali, Department of Basic Sciences in Rehabilitation, Iran University of Medical Sciences, School of Rehabilitation Sciences, Tehran, Iran.

Background: Finding the right word is a necessity in communication, and its evaluation has always been a challenging clinical issue, suggesting the need for valid and reliable measurements. The Homophone Meaning Generation Test (HMGT) can measure the ability to switch between verbal concepts, which is required in word retrieval. The purpose of this study was to adapt and validate the Persian ve...

2006
Laurent Cicurel Stephan Bloehdorn Philipp Cimiano

In this paper, we propose an approach for constructing clusters of related terms that may be used for deriving formal conceptual structures in a later stage. In contrast to previous approaches in this direction, we explicitly take into account the fact that words can have different, possibly even unrelated, meanings. To account for such ambiguities in word meaning, we consider two alternative s...
