Search results for: classification of text documents

Number of results: 21200175

2004
Houda Benbrahim Max Bramer

Hypertext categorization is the automatic classification of web documents into predefined classes. It poses new challenges for automatic categorization because of the rich information in a hypertext document. Hyperlinks, HTML tags, and metadata all provide rich information for hypertext categorization that is not available in traditional text classification. This paper looks at (i) what represe...

2007
Yi Zhang Bing Liu

Traditional text classification studied in the information retrieval and machine learning literature is mainly based on topics. That is, each class or category represents a particular topic, e.g., sports, politics or sciences. However, many real-world problems require more refined classification based on some semantic perspectives. For example, in a set of documents about a disease, some docum...

2011
N. Swarna Jyothi M. Sailaja

Text classification refers to determining the class of an unknown text according to its content in the given classification system. In this paper, enhanced features are used to find the distribution of a word in a single document or across multiple documents. It can be exploited by a TF-IDF style equation, and different features are combined using ensemble learning techniques. Features are not ...

2006
Bingru Yang Wei Song Zhangyan Xu

Supervised learning algorithms usually require large amounts of training data to learn reasonably accurate classifiers. Yet, for many text classification tasks, providing labeled training documents is expensive, while unlabeled documents are readily available in large quantities. Learning from both labeled and unlabeled documents in a semi-supervised framework is a promising approach to reduc...

2013
Rajni Jindal Shweta Taneja

With the growth of the internet, the amount of digital information is growing exponentially day by day. This information may be structured or unstructured in nature. A need was therefore felt to convert unstructured text into structured text and to infer knowledge from it. As a result, the field of text mining emerged. Text documents may be in the form of online news articles, emails, scientific documents...

2006
Baoping Zhang

Automatic text classification using current approaches is known to perform poorly when documents are noisy or when only limited amounts of textual content are available. Yet, many users need access to such documents, which are found in large numbers in digital libraries and on the WWW. If documents are not classified, they are difficult to find when browsing. Further, search precision suffers when...

2005
Manu Aery Sharma Chakravarthy

Text classification is the problem of assigning pre-defined class labels to incoming, unclassified documents. The class labels are defined based on a set of examples of pre-classified documents used as a training corpus. Various machine learning, information retrieval and probability-based techniques have been proposed for text classification. In this paper we propose a novel, graph mining appr...

2017
Vinaitheerthan Renganathan

OBJECTIVES: With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. METHODS: This paper reviews text mining processes in detail and the software tools a...

Thesis: Ministry of Science, Research and Technology - Ferdowsi University of Mashhad - Faculty of Letters and Humanities, 1393

Abstract: The present study is an attempt to identify cultural exophoric references in Iranian high-school ELT textbooks and the Touchstone series and to compare the frequency of occurrence of such references in these books. The purpose is to find out which of the series under investigation imposes a greater referential burden on EFL learners as far as their reading comprehension of the ...

2009
Lachlan Henderson

The growth in the availability of on-line digital text documents has prompted considerable interest in Information Retrieval and Text Classification. Automating the management of this wealth of textual data is becoming increasingly important as new material continues to appear at a substantial rate. The Open Directory Project (ODP), also known as DMOZ, is an on-line ser...
