Search results for: word clustering

Number of results: 205729

1998
Shinsuke Mori Masafumi Nishimura Nobuyasu Itoh

In this paper we describe a word clustering method for class-based n-gram model. The measurement for clustering is the entropy on a corpus different from the corpus for n-gram model estimation. The search method is based on the greedy algorithm. We applied this method to a Japanese EDR corpus and English Penn Treebank corpus. The perplexities of word-based n-gram model on EDR corpus and Penn Tre...

2017
Yixin Cao Jiaxin Shi Juan-Zi Li Zhiyuan Liu Chengjiang Li

To enhance the expression ability of distributional word representation learning model, many researchers tend to induce word senses through clustering, and learn multiple embedding vectors for each word, namely multi-prototype word embedding model. However, most related work ignores the relatedness among word senses which actually plays an important role. In this paper, we propose a novel appro...

2017
Linyuan Tang Kyo Kageura

Previous work on the epistemology of fact-checking indicated the dilemma between the needs of binary answers for the public and ambiguity of political discussion. Determining concepts represented by terms in political discourse can be considered as a Word-Sense Disambiguation (WSD) task. The analysis of political discourse, however, requires identifying precise concepts of terms from relatively...

2010
Lisha Wang Yanzhao Dou Xiaoling Sun Hongfei Lin

This paper details our experiments carried out at Word Sense Induction task. For the foreign language (especially English), there have been many studies of word sense induction (WSI), and the approaches and the techniques are more and more mature. However, the study of Chinese WSI is just getting started, and there has not been a better way to solve the problems encountered. WSI can be divided ...

2014
Kartik Goyal Eduard H. Hovy

Word sense induction is an unsupervised task to find and characterize different senses of polysemous words. This work investigates two unsupervised approaches that focus on using distributional word statistics to cluster the contextual information of the target words using two different algorithms involving latent Dirichlet allocation and spectral clustering. Using a large corpus for achieving ...

Journal: IEICE Transactions 2011
Gibran Fuentes Pineda Hisashi Koga Toshinori Watanabe

We present a scalable approach to automatically discovering particular objects (as opposed to object categories) from a set of images. The basic idea is to search for local image features that consistently appear in the same images under the assumption that such co-occurring features underlie the same object. We first represent each image in the set as a set of visual words (vector quantized lo...

2011
Johanna Geiß

This thesis investigates the applicability of Latent Semantic Analysis (LSA) to sentence clustering for Multi-Document Summarization (MDS). In contrast to more shallow approaches like measuring similarity of sentences by word overlap in a traditional vector space model, LSA takes word usage patterns into account. So far LSA has been successfully applied to different Information Retrieval (IR) t...

2010
Saeedeh Momtazi Sanjeev Khudanpur Dietrich Klakow

Sentence retrieval is a very important part of question answering systems. Term clustering, in turn, is an effective approach for improving sentence retrieval performance: the more similar the terms in each cluster, the better the performance of the retrieval system. A key step in obtaining appropriate word clusters is accurate estimation of pairwise word similarities, based on their tendency t...

2014
Xinying Song Xiaodong He Jianfeng Gao Li Deng

Deep neural network (DNN) based natural language processing models rely on a word embedding matrix to transform raw words into vectors. Recently, a deep structured semantic model (DSSM) has been proposed to project raw text to a continuously-valued vector for Web Search. In this technical report, we propose learning word embedding using DSSM. We show that the DSSM trained on large body of text ...

Journal: Computational Linguistics 2013
Antonio Di Marco Roberto Navigli

Web search result clustering aims to facilitate information search on the Web. Rather than the results of a query being presented as a flat list, they are grouped on the basis of their similarity and subsequently shown to the user as a list of clusters. Each cluster is intended to represent a different meaning of the input query, thus taking into account the lexical ambiguity (i.e., polysemy) i...
