نتایج جستجو برای: word clustering

تعداد نتایج: 205729  

2003
Erik Velldal

This thesis describes a clustering approach to automatically inferring soft semantic classes and characterizing senses of a set of Norwegian nouns. The words are represented by way of their distribution in text, identified as local contexts in the form of lexical-syntactic relations. Through a shallow processing step the context features are extracted for lemmatized word forms in syntactically ...

2012
M. Eduardo Ares Álvaro Barreiro

In recent years there has emerged the field of Constrained Clustering, which proposes clustering algorithms which are able to accommodate domain information to obtain a better final grouping. This information is usually provided as pairwise constraints, whose acquisition from humans can be costly. In this paper we propose a novel method based on word n-grams to automatically extract positive co...

Journal: :TAL 2009
Salma Jamoussi

RÉSUMÉ. L’idée que nous défendons dans cet article est qu’il est possible d’obtenir des concepts sémantiques significatifs par des méthodes de classification automatique. Pour ce faire, nous commençons par proposer des mesures permettant de quantifier les relations sémantiques entre mots. Ensuite, nous utilisons les méthodes de classification non supervisée pour construire les concepts d’une ma...

2010
Dani Yogatama

This thesis is about multilingual document clustering through estimating semantic relatedness between multilingual texts. Specifically we focus on the task of clustering multilingual documents with very limited or no supervisory information. We present two approaches to address the problem : a comparable-corpora based approach and a web-searches based approach. Our first approach derives pairwi...

2005
Bing Zhao Eric P. Xing Alexander H. Waibel

In this paper, a variant of a spectral clustering algorithm is proposed for bilingual word clustering. The proposed algorithm generates the two sets of clusters for both languages efficiently with high semantic correlation within monolingual clusters, and high translation quality across the clusters between two languages. Each cluster level translation is considered as a bilingual concept, whic...

Journal: :IJPRAI 2015
Guoyu Tang Yunqing Xia Erik Cambria Peng Jin Thomas Fang Zheng

Cross-lingual document clustering is the task of automatically organizing a large collection of multi-lingual documents into a few clusters, depending on their content or topic. It is well known that language barrier and translation ambiguity are two challenging issues for cross-lingual document representation. To this end, we propose to represent cross-lingual documents through statistical wor...

2016
Xuanyi Liao Guang Cheng

This paper intend to present an approach to analyse the change of word meaning based on word embedding, which is a more general method to quantize words than before. Through analysing the similar words and clustering in different period, semantic change could be detected. We analysed the trend of semantic change through density clustering method called DBSCAN. Statics and data visualization is ...

2002
JAY G. WILPON

It is demonstrated that clustering can be a powerful tool for selecting reference templates for speaker-independent word recognition. We describe a set of clustering techniques specifically designed for this purpose. These interactive procedures identify coarse structure, fine structure, overlap of, and outliers from clusters. The techniques have been applied t a large speech data base consisti...

2014
Jiguang Liang Xiaofei Zhou Yue Hu Li Guo Shuo Bai

Word clustering which generalizes specific features cluster words in the same syntactic or semantic categories into a group. It is an effective approach to reduce feature dimensionality and feature sparseness which are clearly useful for many NLP applications. This paper proposes an unsupervised label propagation algorithm (Un-LP) for word clustering which uses multi-exemplars to represent a cl...

2010
Md. Akmal Haidar Douglas D. O'Shaughnessy

A new approach for computing weights of topic models in language model (LM) adaptation is introduced. We formed topic clusters by a hard-clustering method assigning one topic to one document based on the maximum number of words chosen from a topic for that document in Latent Dirichlet Allocation (LDA) analysis. The new weighting idea is that the unigram count of the topic generated by hard-clus...

نمودار تعداد نتایج جستجو در هر سال

با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید