classification of text documents

Search results for: classification of text documents

Number of results: 21200175

A Convolution Neural Network-Based Representative Spatio-Temporal Documents Classification for Big Text Data

Journal: Applied Sciences, 2022

With the proliferation of mobile devices, the number of social media users and online news articles is rapidly increasing, and text information is accumulating as big data. As spatio-temporal information becomes more important, research on extracting spatiotemporal information from this data and utilizing it for event analysis is being actively conducted. However, if spatiotemporal information that does not describe the core subject of a document is extracted, it is rather difficul...


Multi Label Text Classification through Label Propagation

2012

Shweta C. Dharmadhikari Maya Ingle Parag Kulkarni

Classifying text data has been an active area of research for a long time. A text document is a multifaceted object and is often inherently ambiguous by nature. Multi-label learning deals with such ambiguous objects. Classifying such ambiguous text objects often makes the classifier's task difficult when assigning relevant classes to an input document. Traditional single label and multi class text cla...


Arabic Text Categorization Using Classification Rule Mining

2012

Mofleh Al-diabat

Text categorization is one of the well-known problems in classification data mining. It aims to map text documents into one or more predefined classes or categories based on their keyword contents. This problem has recently attracted many scholars in the data mining and machine learning communities since the numbers of online documents that hold useful information for decision makers are numerous...


The Ontologymapper Plug-in: Supporting Semantic Annotation of Text-documents by Classification

2007

Peter Scheir Philip Hofmair Michael Granitzer Stefanie N. Lindstaedt

In this contribution we present a tool for annotating documents, which are used for work-integrated learning, with concepts from an ontology. To allow for annotating directly while creating or editing an ontology, the tool was realized as a plug-in for the ontology editor Protégé. Annotating documents with semantic metadata is a laborious task; most of the time knowledge representations are crea...


Pruning Training Corpus to Speedup Text Classification

2002

Jihong Guan Shuigeng Zhou

With the rapid growth of online text information, efficient text classification has become one of the key techniques for organizing and processing text repositories. In this paper, an efficient text classification approach was proposed based on pruning training-corpus. By using the proposed approach, noisy and superfluous documents in training corpuses can be cut off drastically, which leads to...


A Comparative Study on Representation of Web Pages in Automatic Text Categorization

2006

Seyda Ertekin C. Lee Giles

With many web sites appearing every day, it has become increasingly difficult to keep the web directories up-to-date and growing. The interest in the usage of machine learning on automatic text categorization is further stimulated by this intensive growth of the World Wide Web. We believe that Web page classification is significantly different from a traditional text classification because of the ...


Czech Text Document Corpus v 2.0

Journal: CoRR, 2017

Pavel Král Ladislav Lenc

This paper introduces “Czech Text Document Corpus v 2.0”, a collection of text documents for automatic document classification in the Czech language. It is composed of 11,955 text documents provided by the Czech News Agency and is freely available for research purposes at http://home.zcu.cz/~pkral/sw/ . This corpus was created in order to facilitate a straightforward comparison of the document clas...


Clasificación de textos adaptada para Conversión de Texto en Habla Multidominio

Journal: Procesamiento del Lenguaje Natural, 2006

Francesc Alías Xavi Gonzalvo Xavier Sevillano Joan Claudi Socoró José Antonio Montero David García

This paper introduces a text classification system tuned to cope with the requirements of multi-domain text-to-speech synthesis. This method, based on a previous system which represents texts by means of a weighted graph, has been developed to improve the classification efficiency for small texts and to minimize its computational cost. To that effect, the comparison space is built from the inpu...


A Novel Approach in Feature Selection Method for Text Document Classification

2015

S. W. Mohod

In this paper, a novel approach is proposed for extracting eminent features for a classifier. Instead of the traditional feature selection techniques used for text document classification, we introduce a new model based on probability and the overall class frequency of a term. We applied this new technique to extract features from training text documents to generate a training set for machine learning. Using ...


Text and Hypertext Categorization

2009

Houda Benbrahim Max Bramer

Automatic categorization of text documents has become an important area of research in the last two decades, with features that make it significantly more difficult than the traditional classification tasks studied in machine learning. A more recent development is the need to classify hypertext documents, most notably web pages. These have features that add further complexity to the categorizat...

