Search results for: Stop words

Number of results: 178858

This paper discusses recent research on methods for estimating configuration parameters for the Matrix Comparator used for linking unstandardized or heterogeneously standardized references. The matrix comparator computes the aggregate similarity between the tokens (words) in a pair of references. The two most critical parameters for the matrix comparator for obtaining the best linking results a...

Journal: International Journal of Advanced Computer Science and Applications 2012

2015
Vicenç Parisi Baradad Alexis-Michel Mugabushaka

With the availability of a vast collection of research articles on the internet, textual analysis is an increasingly important technique in scientometric analysis. While the context in which it is used and the specific algorithms implemented may vary, typically any textual analysis exercise involves intensive pre-processing of input text which includes removing topically uninteresting terms (stop wor...

2012
A. Alajmi E. M. Saad R. R. Darwish C. D. Manning P. Raghavan F. Zou F. L. Wang X. Deng S. Han

Over the past decades, systems for automatic management of electronic documents have been one of the main fields of research. Text processing is a wide area that includes many important disciplines. In the process of organizing unstructured text in order to implement a mining technique, preprocessing has to be applied. One of the most important preprocessing techniques is the removal of functi...

1999
Tin Kam Ho

A recently proposed adaptive strategy for text recognition uses a linguistic fact that over half of the words on a typical English page are among 150 common stop words. The small lexicon permits word-shape based recognition that yields word identities from which character prototypes can be extracted. This paper describes a fast procedure for locating the best candidates for those stop words. Th...

2017
Ibrahim Abu El-Khair

The effectiveness of three stop word lists for Arabic Information Retrieval (General Stoplist, Corpus-Based Stoplist, and Combined Stoplist) was investigated in this study. Three popular weighting schemes were examined: the inverse document frequency weight, probabilistic weighting, and statistical language modelling. The idea is to combine the statistical approaches with linguistic approaches ...

2005
Ton van der Wouden

Spoken language usually precedes language represented in writing. Children know how to speak and listen years before they learn to read and write. The history of language is estimated to be in the order of magnitude of hundreds of thousands of years, the history of writing in thousands of years. There are many language communities without writing, but only in the case of dead languages such as ...

2007
Roi Blanco Alvaro Barreiro

This paper addresses the problem of identifying collection dependent stop-words in order to reduce the size of inverted files. We present four methods to automatically recognise stop-words, analyse the tradeoff between efficiency and effectiveness, and compare them with a previous pruning approach. The experiments allow us to conclude that in some situations stop-words pruning is competitive wi...

Journal: Informatica 2023

Stop words are very important for information retrieval and text analysis investigation. This study aimed to automatically analyze and detect stop words in texts in the Uzbek language. Because of the limited availability of methods for the automatic search of stop words, we [used] a newly prepared corpus. The Uzbek language belongs to the family of agglutinative languages. As with all agglutinative languages, we can explain that stop word detection is a more complex process than in inflected l...
