classification of text documents

Using complex networks for text classification: Discriminating informative and imaginative documents

Journal: EPL (Europhysics Letters), 2016

Text Classification on a Grid Environment

2010

Valeriana G. Roncero Myrian C. A. Costa Nelson F. F. Ebecken

The enormous amount of information stored in unstructured texts cannot simply be used for further processing by computers, which typically handle text as simple sequences of character strings. Text mining is the process of extracting interesting information and knowledge from unstructured text. One key difficulty with text classification learning algorithms is that they require many hand-labele...

Feature Selection Technique for Text Document Classification: An Alternative Approach

2014

S. W. Mohod

Text classification and feature selection play an important role in correctly assigning documents to particular categories, owing to the explosive growth of textual information from electronic digital documents as well as the World Wide Web. In text mining, the present challenge is to select important or relevant features from the large number of features in the data set. The aim o...

MULTILAYER ADAPTIVE FUZZY PROBABILISTIC NEURAL NETWORK IN CLASSIFICATION PROBLEMS OF TEXT DOCUMENTS

Journal: Radio Electronics, Computer Science, Control, 2014

Using micro-documents for feature selection: The case of ordinal text classification

Journal: Expert Systems with Applications, 2013

KNN based Machine Learning Approach for Text and Document Mining

2014

Vishwanath Bijalwan Vinay Kumar Pinki Kumari Jordan Pascual

Text Categorization (TC), also known as Text Classification, is the task of automatically classifying a set of text documents into different categories from a predefined set. If a document belongs to exactly one of the categories, it is a single-label classification task; otherwise, it is a multi-label classification task. TC uses several tools from Information Retrieval (IR) and Machine Learni...

Machine learning approach for text and document mining

Journal: CoRR, 2014

Vishwanath Bijalwan Pinki Kumari Jordan Pascual Vijay Bhaskar Semwal

Text Categorization (TC), also known as Text Classification, is the task of automatically classifying a set of text documents into different categories from a predefined set. If a document belongs to exactly one of the categories, it is a single-label classification task; otherwise, it is a multi-label classification task. TC uses several tools from Information Retrieval (IR) and Machine Learni...

Multi-label Text Classification of German Language Medical Documents

2007

Stephan Spat Bruno Cadonna Ivo Rakovac Christian Gütl Hubert Leitner Günther Stark Peter Beck

At nearly every patient visit, medical documents are produced and stored in a medical record, often in unstructured form as free text. The growing amount of stored documents increases the need for effective and timely retrieval of information. We developed a multi-label classification system to categorize German language free text medical documents (e.g. discharge letters, clinical fin...

Learning to Classify Text from Labeled and Unlabeled Documents

1998

Kamal Nigam Andrew McCallum Sebastian Thrun Tom M. Mitchell

In many important text classification problems, acquiring class labels for training documents is costly, while gathering large quantities of unlabeled data is cheap. This paper shows that the accuracy of text classifiers trained with a small number of labeled documents can be improved by augmenting this small training set with a large pool of unlabeled documents. We present a theoretical argume...

Scalable text classification as a tool for personalization

Journal: Comput. Syst. Sci. Eng., 2009

Ioannis Antonellis Christos Bouras Vassilis Poulopoulos

We consider scalability issues of the text classification problem in which, using (multi)-labeled training documents, we try to build classifiers that assign documents to classes, permitting classification into multiple classes. A new class of classification problems, called ‘scalable’, is introduced, with applications in web mining. Scalable classification utilizes newly classified instances in ...
