نتایج جستجو برای: text clustering

تعداد نتایج: 264479  

Journal: :Decision Support Systems 1999
Dmitri Roussinov Hsinchun Chen

In this article, we report our implementation and comparison of two text clustering techniques. One is based on Ward’s clustering and the other on Kohonen’s Self-organizing Maps. We have evaluated how closely clusters produced by a computer resemble those created by human experts. We have also measured the time that it takes for an expert to ‘‘clean up’’ the automatically produced clusters. The...

2002
Steffen Staab Andreas Hotho

Text clustering typically involves clustering in a high dimensional space, which appears difficult with regard to virtually all practical settings. In addition, given a particular clustering result it is typically very hard to come up with a good explanation of why the text clusters have been constructed the way they are. In this paper, we propose a new approach for applying background knowledg...

2011
Jiang Zhong Gaofeng Dong Ying Zhou Xue Li Longhai Liu Qiang Chen Huaxiang Zhang

In this paper, an active learning method which can effectively select pairwise constraints during clustering procedure was presented. A novel semi-supervised text clustering algorithm was proposed, which employed an effective pairwise constraints selection method. As the samples on the fuzzy boundary are far away from the cluster center in the clustering procedure, they can be easily divided in...

2008
Diana Inkpen Marc Stogaitis François DeGuire Muath Alzghool

This paper presents the first participation of the University of Ottawa group in the Photo Retrieval task at Image CLEF 2008. Our system uses the following components: Lucene for text indexing and LIRE for image indexing. We experiment with several clustering methods in order to retrieve images from diverse clusters. The clustering methods are: k-means clustering, hierarchical clustering, and o...

2006
Liping Jing Lixin Zhou Michael K. Ng Joshua Zhexue Huang

Recent work has shown that ontologies are useful to improve the performance of text clustering. In this paper, we present a new clustering scheme on the basis of ontologies-based distance measure. Before implementing clustering process, term mutual information matrix is calculated with the aid of Wordnet and some methods of learning ontologies from textual data. Combining this mutual informatio...

2017
Avory C Bryant Krzysztof J Cios

A streaming data clustering algorithm is presented building upon the density-based self-organizing stream clustering algorithm SOSTREAM. Many density-based clustering algorithms are limited by their inability to identify clusters with heterogeneous density. SOSTREAM addresses this limitation through the use of local (nearest neighbor-based) density determinations. Additionally, many stream clus...

Journal: :JIPS 2008
Taeho Jo

This research proposes a new strategy where documents are encoded into string vectors and modified version of k means algorithm to be adaptable to string vectors for text clustering. Traditionally, when k means algorithm is used for pattern classification, raw data should be encoded into numerical vectors. This encoding may be difficult, depending on a given application area of pattern classifi...

2013
Gopal Patidar Anju Singh Ph.D Divakar Singh

Clustering is a useful method that categorizes a large quantity of unordered text documents into a small number of meaningful and coherent collections, thereby providing a basis for instinctive and informative navigation and browsing mechanisms. Different type of distance functions and similarity measures have been used for clustering, such as squared, cosine similarity, Euclidean distance and ...

2012
Shimo Shen Zuqiang Meng

K-means algorithm is a relatively simple and fast gather clustering algorithm. However, the initial clustering center of the traditional k-means algorithm was generated randomly from the dataset, and the clustering result was unstable. In this paper, we propose a novel method to optimize the selection of initial centroids for k-means algorithm based on the small world network. This paper firstl...

2003
Andreas Hotho Steffen Staab Gerd Stumme

Text document clustering plays an important role in providing intuitive navigation and browsing mechanisms by organizing large amounts of information into a small number of meaningful clusters. The bag of words representation used for these clustering methods is often unsatisfactory as it ignores relationships between important terms that do not co-occur literally. In order to deal with the pro...

نمودار تعداد نتایج جستجو در هر سال

با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید