Search results for: retrieval speed of collocations

Number of results: 21183877

2013
Mosharaf Chowdhury

Natural languages are full of word collocations that frequently co-occur and correspond to arbitrary word usages. They appear in both technical and non-technical textual corpora and often have specific significance in individual contexts. Accurately retrieving and identifying collocations from a given corpus in an unsupervised manner is imperative to understanding and automatically generating t...

2002
Fuchun Peng Xiangji Huang Dale Schuurmans Nick Cercone

It is commonly believed that word segmentation accuracy is monotonically related to retrieval performance in Chinese information retrieval. In this paper we show that, for Chinese, the relationship between segmentation and retrieval performance is in fact nonmonotonic; that is, at around 70% word segmentation accuracy an over-segmentation phenomenon begins to occur which leads to a reduction in...

2009
Jirí Materna

In this paper we present a new method of automatic collocation identification. Collocation is an important relation between words, which is widely used, among others, in information retrieval tasks. Over the last years, many methods of automatic collocation acquisition from text corpora have been proposed. The approach described in this paper differs from the others by focusing on domain colloc...

Journal: Applied Psycholinguistics 2022

This study examined the processing and acquisition of novel words and their collocates (i.e., words that frequently co-occur with other words) from reading, and the effect of frequency of exposure on this process. First and second language speakers of English read a story with 1) eight exposures to adjective-pseudoword collocations, 2) four exposures to the same collocations, or 3) control collocations. Results of recall and recognition tests showed that participants...

Journal: IJIRR 2012
Fethi Fkih Mohamed Nazih Omri

Collocation is defined as a sequence of lexical tokens which habitually co-occur. This type of information is widely used in various applications such as Information Retrieval, document indexing, machine translation, lexicography, etc. Therefore, many techniques are developed for the automatic retrieval of collocations from textual documents. These techniques use statistical measures based on a...

2000
Olga Vechtomova Stephen Robertson

The paper presents a method of combining corpus information on word collocations with the probabilistic model of information retrieval. Corpus term dependencies are used to modify the probabilistic retrieval based on the term independence assumption. Collocates are derived from windows around term occurrences in the corpus. Statistical measures of mutual information and Z score are applied to s...

1998
Susan R. Viscuso

The common tie among these lines of research is that natural language processing techniques offer a way of overcoming the weaknesses inherent to purely statistical approaches. GE pioneered the large-scale use of natural language processing techniques in information retrieval. Standard statistical search methods use words, word fragments, and simple collocations to index documents. The GE work i...

2010
Luka Nerima Eric Wehrli Violeta Seretan

This article discusses the treatment of collocations in the context of a long-term project on the development of multilingual NLP tools. Besides “classical” two-word collocations, we will focus on the case of complex collocations (3 words or more) for which a recursive design is presented in the form of collocation of collocations. Although comparatively less numerous than two-word collocations...

2015
Zhendong Zhao Lan Du Benjamin Börschinger John K. Pate Massimiliano Ciaramita Mark Steedman Mark Johnson

Most existing topic models make the bag-of-words assumption that words are generated independently, and so ignore potentially useful information about word order. Previous attempts to use collocations (short sequences of adjacent words) in topic models have either relied on a pipeline approach, restricted attention to bigrams, or resulted in models whose inference does not scale to large corpora...

2008
Tarek A. Elghazaly Aly A. Fahmy

In Cross Language Information Retrieval (CLIR), queries in one language retrieve documents in other language(s). This can be done through Query Translation that comes up against Translation/Transliteration challenges like ambiguity as the main problems. In this paper, a comprehensive solution has been introduced for these challenges. 1, 4 powerful English/Arabic Machine Readable Dictionaries (M...
