tfidf vector space model

Search results for: tfidf vector space model

Number of results: 2616913

Classification of News Stories Using Support Vector Machines

1999

Robert Cooley

Given a data set and a data mining task such as classification, there are two main reasons for performing feature space reduction. The first is to improve the accuracy of the algorithm. In a domain such as text mining, the common technique of parameterizing each document as a vector of words results in thousands of dimensions. The performance of many learning algorithms decreases as the dimensiona...

Categorization of Large Text Collections: Feature Selection for Training Neural Networks

2006

Pensiri Manomaisupat, Bogdan Vrusias, Khurshid Ahmad

Automatic text categorization requires the construction of appropriate surrogates for documents within a text collection. The surrogates, often called document vectors, are used to train learning systems for categorising unseen documents. A comparison of different measures (tfidf and weirdness) for creating document vectors is presented together with two different state-of-the-art classifiers: s...

A Focused Crawler Based on Correlation Analysis

2014

Qiuli Qin, Xin Peng

With the rapid development of network and information technology, there is a wealth of data on the internet. How to effectively filter out information on a particular subject or field from these data is a major problem faced by the majority of researchers. In this paper, we try to build a focused crawler based on vector space model and TFIDF text correlation analysis. We...

Spoken Document Classification Based on LSH

2013

ZHANG LEI

We present a novel scheme of spoken document classification based on locality-sensitive hashing, chosen for its ability to solve approximate near-neighbor search in high-dimensional spaces. In the speech-text conversion stage, although a lattice can provide multiple hypotheses during speech recognition, it is too complex to extract proper word information from it. A confusion network is adopted to improve word r...

Email Vector Space Model (EVSM)

Journal: International Journal of Intelligent Computing Research, 2011

Group-theoretical vector space model

Journal: International Journal of Computer Mathematics, 2014

Lexical-semantic SLVM for XML Document Classification

Journal: JSW, 2014

Jun Long, Luda Wang, Zude Li, Zuping Zhang, Huiling Li, Guihu Zhao

Structured link vector model (SLVM) and its improved version depend on statistical term measures to implement XML document representation. As a result, they ignore the lexical semantics of terms and their mutual information, leading to text classification errors. This paper proposes an XML document representation method, WordNet-based lexical-semantic SLVM, to solve the problem. Using WordNet, thi...

I Can Guess What You Mean: A Monolingual Query Enhancement for Machine Translation

2016

Chenxi Pang, Hai Zhao, Zhongyi Li

We introduce a monolingual query method with additional webpage data to improve translation quality, in response to the growing use of statistical machine translation outputs in official settings. The motivation behind this method is that we can improve the readability of a sentence once and for all if we replace translated sentences with the most closely related human-written sentences. Based on vector spac...

The Vector Space Model

1999

Carolyn J. Crouch

The importance of a thesaurus in the successful operation of an information retrieval system is well recognized. Yet techniques which support the automatic generation of thesauri remain largely undiscovered. This paper describes one approach to the automatic generation of global thesauri, based on the discrimination value model of Salton, Yang, and Yu and on an appropriate clustering algorithm....

MODEL OF CURRENT SPACE VECTOR CONTROL

Journal: Mokslas - Lietuvos ateitis, 2010
