Search results for: classification of text documents

Number of results: 21200175

Journal: Applied Sciences 2022

With the proliferation of mobile devices, the number of social media users and online news articles is rapidly increasing, and text information is accumulating as big data. As spatio-temporal information becomes more important, research on extracting spatiotemporal information from data and utilizing it for event analysis is being actively conducted. However, if information that does not describe the core subject of a document is extracted, it is rather difficul...

2012
Shweta C. Dharmadhikari Maya Ingle Parag Kulkarni

Classifying text data has been an active area of research for a long time. A text document is a multifaceted object and is often inherently ambiguous by nature. Multi-label learning deals with such ambiguous objects. Classifying such ambiguous text objects often makes the classifier's task difficult when assigning relevant classes to an input document. Traditional single label and multi class text cla...

2012
Mofleh Al-diabat

Text categorization is one of the known problems in classification data mining. It aims to map text documents into one or more predefined classes or categories based on their keyword contents. This problem has recently attracted many scholars in the data mining and machine learning communities since the numbers of online documents that hold useful information for decision makers are numerous...

2007
Peter Scheir Philip Hofmair Michael Granitzer Stefanie N. Lindstaedt

In this contribution we present a tool for annotating documents, which are used for work-integrated learning, with concepts from an ontology. To allow for annotating directly while creating or editing an ontology, the tool was realized as a plug-in for the ontology editor Protégé. Annotating documents with semantic metadata is a laborious task; most of the time knowledge representations are crea...

2002
Jihong Guan Shuigeng Zhou

With the rapid growth of online text information, efficient text classification has become one of the key techniques for organizing and processing text repositories. In this paper, an efficient text classification approach was proposed based on pruning training-corpus. By using the proposed approach, noisy and superfluous documents in training corpuses can be cut off drastically, which leads to...

2006
Seyda Ertekin C. Lee Giles

With many web sites appearing every day, it has become increasingly difficult to keep the web directories up-to-date and growing. The interest in the usage of machine learning on automatic text categorization is further stimulated by this intensive growth of the World Wide Web. We believe that Web page classification is significantly different from a traditional text classification because of the ...

Journal: CoRR 2017
Pavel Král Ladislav Lenc

This paper introduces “Czech Text Document Corpus v 2.0”, a collection of text documents for automatic document classification in Czech language. It is composed of 11,955 text documents provided by the Czech News Agency and is freely available for research purposes at http://home.zcu.cz/~pkral/sw/ . This corpus was created in order to facilitate a straightforward comparison of the document clas...

Journal: Procesamiento del Lenguaje Natural 2006
Francesc Alías Xavi Gonzalvo Xavier Sevillano Joan Claudi Socoró José Antonio Montero David García

This paper introduces a text classification system tuned to cope with the requirements of multi-domain text-to-speech synthesis. This method, based on a previous system which represents texts by means of a weighted graph, has been developed to improve the classification efficiency for small texts and to minimize its computational cost. To that effect, the comparison space is built from the inpu...

2015
S. W. Mohod

In this paper, a novel approach is proposed for extracting eminence features for a classifier. Instead of the traditional feature selection techniques used for text document classification, we introduce a new model based on probability and overall class frequency of a term. We applied this new technique to extract features from training text documents to generate a training set for machine learning. Using ...

2009
Houda Benbrahim Max Bramer

Automatic categorization of text documents has become an important area of research in the last two decades, with features that make it significantly more difficult than the traditional classification tasks studied in machine learning. A more recent development is the need to classify hypertext documents, most notably web pages. These have features that add further complexity to the categorizat...
