classification of text documents

Search results for: classification of text documents

Number of results: 21,200,175

Text Classification: Forming Candidate Key-Phrases from Existing Shorter Ones

2007

Nikitas N. Karanikolas, Christos Skourlas

The hard problem of text classification usually has various aspects and potential solutions. In this paper, two main research directions for the classification of narrative documents are considered. The first is based on data mining and rule induction techniques, while the second combines traditional text retrieval techniques (use of the vector space model, ...

Scalability of Text Classification

2006

Ioannis Antonellis, Christos Bouras, Vassilis Poulopoulos, Anastasios Zouzias

We explore scalability issues of the text classification problem, in which (multi)labeled training documents are used to build classifiers that assign documents to classes, permitting classification into multiple classes. A new class of classification problems, called ‘scalable’, is introduced that models many problems from the area of Web mining. The property of scalability is defined as the abi...

Text Identification in Noisy Document Images Using Markov Random Field

2003

Yefeng Zheng, Huiping Li, David S. Doermann

In this paper we address the problem of identifying text in noisy documents. We segment handwriting and distinguish it from machine-printed text because 1) handwriting in a document often indicates corrections, additions, or other supplemental information that should be treated differently from the main or body content, and 2) the segmentation and recognition techniques for machine printed...

The Effect of Combining Different Semantic Relations on Arabic Text Classification

2015

Suhad A. Yousif, Islam Elkabani, Rached Zantout

A massive number of documents is posted online every minute. The task of document classification requires extensive background work on the content of documents, where keyword-based matching alone may not be sufficient. Much research carried out in several languages has yielded significant results. However, Arabic documents still pose a great challenge due to the nature of ...

Arabic Text Classification Using N-Gram Frequency Statistics: A Comparative Study

2006

Laila Khreisat

This paper presents the results of classifying Arabic text documents using the N-gram frequency statistics technique employing a dissimilarity measure called the “Manhattan distance”, and Dice’s measure of similarity. The Dice measure was used for comparison purposes. Results show that N-gram text classification using the Dice measure outperforms classification using the Manhattan measure.
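
The two measures named in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the use of character trigrams, relative-frequency profiles, a nearest-profile decision rule, and the function names `ngram_profile`, `manhattan`, `dice`, and `classify` are all assumptions made for illustration.

```python
from collections import Counter


def ngram_profile(text, n=3):
    """Relative frequencies of character n-grams (assumed profile format)."""
    # Normalize whitespace and pad so word boundaries form n-grams too.
    padded = " " + " ".join(text.lower().split()) + " "
    counts = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def manhattan(p, q):
    """Manhattan distance: sum of absolute frequency differences (lower is closer)."""
    return sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in set(p) | set(q))


def dice(p, q):
    """Dice similarity over the sets of n-grams present (higher is closer)."""
    if not p and not q:
        return 0.0
    shared = len(set(p) & set(q))
    return 2 * shared / (len(p) + len(q))


def classify(text, class_profiles, n=3):
    """Return (label by smallest Manhattan distance, label by largest Dice score)."""
    doc = ngram_profile(text, n)
    by_manhattan = min(class_profiles, key=lambda c: manhattan(doc, class_profiles[c]))
    by_dice = max(class_profiles, key=lambda c: dice(doc, class_profiles[c]))
    return by_manhattan, by_dice
```

Each class profile would be built by calling `ngram_profile` on that class's training text; `classify` then reports the label chosen under each measure so the two can be compared on the same document.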

Document Classification, a Novel Neural-based Classifier

2011

Seyyed Mohammad Reza Farshchi

The assignment of natural language texts to one or more predefined categories based on their content is an important component in many information organization and management tasks. This research proposes a novel approach to document classification that combines a competitive self-organizing neural text categorizer with new vectors that we call string vectors. Even ...

The Place of Literature in the Works of Imam Sadiq (AS)

Thesis: Ministry of Science, Research and Technology - Tarbiat Modares University, 1390 (Solar Hijri)

Fatemeh Tavana, Khalil Parvini, Isa Motaghizadeh

Abstract: Literature is said to be beautiful words, in poetry or prose, that excite the feelings of the reader or listener. Certainly, to be effective, such a text should have certain characteristics. The four elements of literature (thought, imagination, emotion, and style) make a text literary and effective. Emotion and imagination are specific elements of literary texts, while thought and style a...

Integrating Attribute-Based Classification for Answer Set Construction

2002

Hyo-Jung Oh, Moon-Soo Chang, Myung-Gil Jang, Sung Hyon Myaeng

With the exponential growth of information on the WWW, it is becoming increasingly difficult to find and organize relevant documents. Automatic text classification has been considered a solution to the problem, with its focus mostly on the subject or content of text [1]. Recently, researchers have realized that user information needs are not just based on the subject of a document but also on...

Hierarchical Text Classification and Evaluation

2001

Aixin Sun, Ee-Peng Lim

Hierarchical classification refers to the assignment of one or more suitable categories from a hierarchical category space to a document. While previous work in hierarchical classification focused on virtual category trees where documents are assigned only to the leaf categories, we propose a top-down level-based classification method that can classify documents to both leaf and internal categories. ...

Cross-Lingual Text Categorization

2003

Núria Bel, Cornelis H. A. Koster, Marta Villegas

This article deals with the problem of Cross-Lingual Text Categorization (CLTC), which arises when documents in different languages must be classified according to the same classification tree. We describe practical and cost-effective solutions for automatic Cross-Lingual Text Categorization, both in the case that a sufficient number of training examples is available for each new language and in the cas...
