classification of text documents

Search results for: classification of text documents

Number of results: 21200175

Unsupervised Text Classification Using Kohonen's Self Organizing Network

2005

Nirmalya Chowdhury, Diganta Saha

A text classification method using Kohonen’s Self Organizing Network is presented here. The proposed method can classify a set of text documents into a number of classes depending on their contents where the number of such classes is not known a priori. Text documents from various faculties of games are considered for experimentation. The method is found to provide satisfactory results for larg...

Classification of Text Documents Based on Minimum System Entropy

2003

Raghu Krishnapuram, Krishna Prasad Chitrapura, Sachindra Joshi

In this paper, we describe a new approach to classification of text documents based on the minimization of system entropy, i.e., the overall uncertainty associated with the joint distribution of words and labels in the collection. The classification algorithm assigns a class label to a new document in such a way that its insertion into the system results in the maximum decrease (or least increa...
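
The decision rule described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that "system entropy" can be approximated by the plain Shannon entropy of the joint word-label count table, and the function names `joint_entropy` and `classify_min_entropy` are hypothetical. The abstract is truncated, so the paper's exact entropy definition may differ.

```python
import math
from collections import Counter


def joint_entropy(counts):
    """Shannon entropy (in bits) of the joint distribution given by a count table."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c > 0)


def classify_min_entropy(labeled_docs, new_doc):
    """Pick the label whose assignment leaves the system with the lowest entropy.

    labeled_docs: list of (list_of_words, label) pairs (the existing collection).
    new_doc: list of words of the document to classify.
    Assumption: entropy is computed over raw (word, label) co-occurrence counts.
    """
    # Build the joint (word, label) count table for the existing collection.
    base = Counter()
    for words, label in labeled_docs:
        for word in words:
            base[(word, label)] += 1

    labels = sorted({label for _, label in labeled_docs})
    best_label, best_entropy = None, float("inf")
    for label in labels:
        # Tentatively insert the new document under this label.
        trial = base.copy()
        for word in new_doc:
            trial[(word, label)] += 1
        entropy = joint_entropy(trial)
        # Lowest resulting entropy = maximum decrease (or least increase).
        if entropy < best_entropy:
            best_label, best_entropy = label, entropy
    return best_label


# Toy collection (made-up data, for illustration only).
collection = [
    (["goal", "match", "team"], "sports"),
    (["match", "goal", "score"], "sports"),
    (["stock", "market", "bank"], "finance"),
    (["bank", "stock", "price"], "finance"),
]
predicted = classify_min_entropy(collection, ["goal", "match"])
```

Because the base entropy is the same for every candidate label, comparing the entropy after insertion is equivalent to comparing the entropy change.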

Automatic Text Summarization with Genetic Algorithm-Based Attribute Selection

2004

Carlos Nascimento Silla, Gisele L. Pappa, Alex Alves Freitas, Celso A. A. Kaestner

The task of automatic text summarization consists of generating a summary of the original text that allows the user to obtain the main pieces of information available in that text, but with a much shorter reading time. This is an increasingly important task in the current era of information overload, given the huge amount of text available in documents. In this paper the automatic text summariz...

Integrating image data into biomedical text categorization

Journal: Bioinformatics 2006

Hagit Shatkay, Nawei Chen, Dorothea Blostein

Categorization of biomedical articles is a central task for supporting various curation efforts. It can also form the basis for effective biomedical text mining. Automatic text classification in the biomedical domain is thus an active research area. Contests organized by the KDD Cup (2002) and the TREC Genomics track (since 2003) defined several annotation tasks that involved document classific...

Incorporating Latent Semantic Indexing into Spectral Graph Transducer for Text Classification

2008

Xinyu Dai, Baoming Tian, Junsheng Zhou, Jiajun Chen

Spectral Graph Transducer (SGT) is one of the superior graph-based transductive learning methods for classification. For the SGT algorithm, a good graph representation of the data to be processed is very important. In this paper, we incorporate Latent Semantic Indexing (LSI) into SGT for text classification. First, we exploit LSI to represent documents as vectors in...

Neural Net Learning Issues in Classification of Free Text Documents

2002

Venu Dasigi, Reinhold C. Mann

In intelligent analysis of large amounts of text, no single clue reliably indicates that a pattern of interest has been found. When multiple clues are used, it is not known how they should be integrated into a decision. In the context of this investigation, we have been using neural nets as parameterized mappings that allow for fusion of higher-level clues extracted from free text. By using ...

Classification of text documents supervised by domain ontologies

2013

Anna Rozeva

The research objective is to establish an approach for supporting the classification of text documents referring to a specified domain. The focus is on the preliminary topic assignment to the documents used for training the model. The method uses a domain ontology as background knowledge. The idea consists of extracting the preliminary topics for training the classifier by means of unsuperv...

Automatic Dating Of Documents And Temporal Text Classification

2006

Angelo Dalli, Yorick Wilks

The frequency of occurrence of words in natural languages exhibits a periodic and a non-periodic component when analysed as a time series. This work presents an unsupervised method of extracting periodicity information from text, enabling time series creation and filtering to be used in the creation of sophisticated language models that can discern between repetitive trends and non-repetitive w...
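
One simple way to expose a periodic component in a word-frequency time series is autocorrelation. The sketch below is illustrative only and is not the method of the paper, whose details are cut off in this abstract. It assumes the series is a list of per-day counts for a single word; the function name `dominant_period` and the toy data are hypothetical.

```python
def dominant_period(series, max_lag):
    """Return the lag (1..max_lag) with the highest autocovariance.

    series: list of numeric word counts, one per time step (e.g. per day).
    Assumption: the unnormalized autocovariance of the mean-centered series
    is a sufficient indicator of periodicity for this toy example.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]

    best_lag, best_score = None, float("-inf")
    for lag in range(1, max_lag + 1):
        # Sum of products of the series with a copy of itself shifted by `lag`.
        score = sum(centered[t] * centered[t + lag] for t in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Made-up counts for a word that spikes once every seven days.
weekly_counts = [5 if day % 7 == 0 else 1 for day in range(70)]
period = dominant_period(weekly_counts, max_lag=20)
```

A word with no repeating pattern would show no clearly dominant lag, which is one way to separate repetitive trends from non-repetitive usage.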

Towards Multi Label Text Classification through Label Propagation

2012

Shweta C. Dharmadhikari, Maya Ingle, Parag Kulkarni

Classifying text data has been an active area of research for a long time. A text document is a multifaceted object and is often inherently ambiguous. Multi-label learning deals with such ambiguous objects. Classifying such ambiguous text objects often makes the classifier's task difficult when assigning relevant classes to an input document. Traditional single label and multi class text cla...

Automatic Documents Annotation by Keyphrase Extraction in Digital Libraries using Taxonomy

2011

Iram Fatima, Asad Masood Khattak, Young-Koo Lee, Sungyoung Lee

Keyphrases are useful for a variety of purposes, including text clustering, classification, content-based retrieval, and automatic text summarization. Only a small number of documents have author-assigned keyphrases. Manual assignment of keyphrases to existing documents is a tedious task; therefore, automatic keyphrase extraction has been extensively used to organize documents. Existing automatic ...
