Search results for: text classification rocchio

Number of results: 641860

Journal: International Journal of Information, Security and Systems Management

Text classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents to predefined categories based on the documents’ contents and labeled training samples. Since word detection is a difficult and time-consuming task in the Persian language, a Bayesian text classifier is an appropriate approach to deal with different...

2008
Luis M. de Campos, Juan M. Fernández-Luna, Juan F. Huete, Alfonso E. Romero

We propose a simple Bayesian network-based text classifier, which may be considered a discriminative counterpart of the generative multinomial naive Bayes classifier. The method relies on a fixed network topology with the arcs going from term nodes to class nodes, and on a network parametrization based on noisy-OR gates. Comparative experiments of the proposed method with nai...

2002
Kang Hyuk Lee, Judy Kay, Byeong Ho Kang, Uwe Rosebrock

Two main research areas in statistical text categorization are similarity-based learning algorithms and associated thresholding strategies. The combination of these techniques significantly influences the overall performance of text categorization. After investigating two similarity-based classifiers (k-NN and Rocchio) and three common thresholding techniques (RCut, PCut, and SCut), we describe...

2013
Nazlia Omar, Mohammed Albared, Adel Qasem Al-Shabi, Tareq Al-Moslmi

Sentiment analysis is a challenging and important task that involves natural language processing, web mining, and machine learning. To date, little research has been conducted on sentiment classification for Arabic due to the lack of resources for managing sentiments or opinions, such as senti-lexicons and opinion corpora. The main obstacle in Arabic sentiment analysis is that p...

2001
Yi Zhang, James P. Callan

We used the YFILTER filtering system for experiments on updating profiles and setting thresholds. We developed a new method of using language models for updating profiles that focuses more on picking informative, discriminative words for the query. The new method was compared with the well-known Rocchio algorithm. Dissemination thresholds were set based on maximum likelihood estimation that model...

Journal: Expert Syst. Appl. 2012
Fernando Fernández-Martínez, Kseniya Zablotskaya, Wolfgang Minker

In this paper we investigate whether conventional text categorization methods may suffice to infer different verbal intelligence levels. This research goal relies on the hypothesis that the vocabulary that speakers make use of reflects their verbal intelligence levels. Automatic verbal intelligence estimation of users in a spoken language dialog system may be useful when defining an optimal dia...


2000
Shrikanth Shankar, George Karypis

In recent years we have seen a tremendous growth in the volume of text documents available on the Internet, digital libraries, news sources, and company-wide intranets. Automatic text categorization, which is the task of assigning text documents to pre-specified classes (topics or themes) of documents, is an important task that can help both in organizing as well as in finding information on thes...

2003
Lawrence Shih, Jason D. M. Rennie, Yu-Han Chang, David R. Karger

As text corpora become larger, tradeoffs between speed and accuracy become critical: slow but accurate methods may not complete in a practical amount of time. In order to make the training data a manageable size, a data reduction technique may be necessary. Subsampling, for example, speeds up a classifier by randomly removing training points. In this paper, we describe an alternate method for r...
