text clustering

Search results for: text clustering

Number of results: 264479

Concept Chain Based Text Clustering

2005

Shaoxu Song Jian Zhang Chunping Li

Different from familiar clustering objects, text documents have sparse data spaces. A common way of representing a document is as a bag of its component words, but the semantic relations between words are ignored. In this paper, we propose a novel document representation approach to strengthen the discriminative feature of document objects. We replace terms of documents with concepts in WordNet...
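The contrast this abstract draws between a bag of words and a concept-based representation can be sketched in a few lines. This is only an illustration, not the authors' method: the `CONCEPTS` dictionary is a hypothetical, hand-written stand-in for a WordNet lookup, and the function names are assumptions made for this example.

```python
import math
from collections import Counter

# Hypothetical term-to-concept map; a stand-in for a real WordNet lookup.
CONCEPTS = {
    "car": "vehicle",
    "automobile": "vehicle",
    "truck": "vehicle",
}

def bag_of_words(text):
    """Count raw tokens; semantic relations between words are ignored."""
    return Counter(text.lower().split())

def bag_of_concepts(text):
    """Count tokens after replacing each known term with its concept."""
    return Counter(CONCEPTS.get(tok, tok) for tok in text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

doc1 = "car repair"
doc2 = "automobile maintenance"

# The two documents share no literal word, so their word vectors are orthogonal.
word_sim = cosine(bag_of_words(doc1), bag_of_words(doc2))
# After concept replacement both contain "vehicle", so they overlap.
concept_sim = cosine(bag_of_concepts(doc1), bag_of_concepts(doc2))
```

Under these assumptions the word-level similarity is zero while the concept-level similarity is positive, which is the kind of gain in discriminative power the abstract refers to.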


A Chinese phoneme clustering theory and its application to a text independent speaker verification system

1987

Peng Guo Xixian Chen Changnian Cai

This paper presents a new idea for Chinese phoneme clustering and a text-independent speaker verification system that applies this technique. Unlike the conventional verification method, which uses averaged features, our new method includes both the dynamic and static features of speech. It also leads to a fast and efficient clustering algorithm in the training phase. The...


A General Bio-inspired Method to Improve the Short-Text Clustering Task

2010

Diego Ingaramo Marcelo Luis Errecalde Paolo Rosso

“Short-text clustering” is a very important research field due to the current tendency for people to use very short documents, e.g. blogs, text-messaging and others. In some recent works, new clustering algorithms have been proposed to deal with this difficult problem and novel bio-inspired methods have reported the best results in this area. In this work, a general bio-inspired method based on...


Entity Clustering Across Languages

2012

Spence Green Nicholas Andrews Matthew R. Gormley Mark Dredze Christopher D. Manning

Standard entity clustering systems commonly rely on mention (string) matching, syntactic features, and linguistic resources like English WordNet. When co-referent text mentions appear in different languages, these techniques cannot be easily applied. Consequently, we develop new methods for clustering text mentions across documents and languages simultaneously, producing cross-lingual entity cl...


Text clustering using frequent itemsets

Journal: Knowl.-Based Syst. 2010

Wen Zhang Taketoshi Yoshida Xijin Tang Qing Wang

Frequent itemsets originate from association rule mining. Recently, they have been applied in text mining tasks such as document categorization and clustering. In this paper, we conduct a study on text clustering using frequent itemsets. The main contribution of this paper is threefold. First, we present a review of existing methods of document clustering using frequent patterns. Second, a new m...
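As a rough illustration of the general idea of clustering documents by frequent itemsets, the sketch below treats each document as a set of terms, finds term sets that occur in at least `min_support` documents, and assigns each document to the largest frequent itemset it contains. It is not the method proposed in the paper, whose details are cut off above. The function names, the brute-force enumeration (in place of Apriori-style pruning), and the tie-breaking rule are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations

def frequent_itemsets(docs, min_support, max_size=2):
    """Map each term set found in >= min_support docs to the doc indices covering it.

    Brute-force enumeration of all term combinations up to max_size;
    fine for toy data, not for real corpora.
    """
    cover = defaultdict(set)
    for idx, terms in enumerate(docs):
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(terms), size):
                cover[frozenset(combo)].add(idx)
    return {items: ids for items, ids in cover.items() if len(ids) >= min_support}

def cluster_by_itemsets(docs, min_support=2, max_size=2):
    """Assign each doc to the largest frequent itemset it contains.

    Ties are broken by higher support, then alphabetically (an arbitrary choice).
    Docs containing no frequent itemset go to the empty-set cluster.
    """
    frequent = frequent_itemsets(docs, min_support, max_size)
    clusters = defaultdict(list)
    for idx, terms in enumerate(docs):
        candidates = [s for s in frequent if s <= terms]
        if not candidates:
            clusters[frozenset()].append(idx)
            continue
        best = max(candidates, key=lambda s: (len(s), len(frequent[s]), sorted(s)))
        clusters[best].append(idx)
    return dict(clusters)

docs = [
    {"stock", "market", "price"},
    {"stock", "market", "trade"},
    {"football", "match", "goal"},
    {"football", "match", "team"},
]
clusters = cluster_by_itemsets(docs, min_support=2)
```

A side effect of this style of clustering is that the itemset keys double as readable cluster labels, here the shared term pairs of each group.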


Ontologies Improve Text Document Clustering

2003

Andreas Hotho Steffen Staab Gerd Stumme

Text document clustering plays an important role in providing intuitive navigation and browsing mechanisms by organizing large sets of documents into a small number of meaningful clusters. The bag-of-words representation used for these clustering methods is often unsatisfactory, as it ignores relationships between important terms that do not co-occur literally. In order to deal with the problem, ...


A Survey on Optimization Approaches to Text Document Clustering

Journal: International Journal on Computational Science & Applications 2013


Self-Taught convolutional neural networks for short text clustering

Journal: Neural Networks 2017


Document Clustering and Text Summarization

2000

Joel Larocca Neto Alexandre D. Santos Celso A.A. Kaestner Alex A. Freitas

This paper describes a text mining tool that performs two tasks, namely document clustering and text summarization. These tasks have, of course, their corresponding counterpart in “conventional” data mining. However, the textual, unstructured nature of documents makes these two text mining tasks considerably more difficult than their data mining counterparts. In our system document clustering i...


Conceptual Clustering of Text Clusters

2002

Andreas Hotho Gerd Stumme

Common clustering techniques have the disadvantage that they do not provide intensional descriptions of the clusters obtained. Conceptual Clustering techniques, on the other hand, provide such descriptions, but are known to be rather slow. In this paper, we discuss a way of combining both techniques. We first cluster the documents by a variant of k-Means, using a thesaurus as background knowledge...

