نتایج جستجو برای: post text
تعداد نتایج: 564440 فیلتر نتایج به سال:
A statistical correlation model for image retrieval is proposed. This model captures the semantic relationships among images in a database from simple statistics of userprovided relevance feedback information. It is applied in the post-processing of image retrieval results such that more semantically related images are returned to the user. The algorithm is easy to implement and can be efficien...
We describe the participation of the University of Amsterdam’s ILPS group in the blog track at TREC 2008. We mainly explored different ways of using external corpora to expand the original query. In the blog post retrieval task we did not succeed in improving over a simple baseline (equal weights for both the expanded and original query). Obtaining optimal weights for the original and the expan...
This paper describes the text normalization module of a text to speech fully-trainable conversion system and its application to number transcription. The main target is to generate a language independent text normalization module, based on data instead of on expert rules. This paper proposes a general architecture based on statistical machine translation techniques. This proposal is composed of...
SpiderLing—a web spider for linguistics—is new software for creating text corpora from the web, which we present in this article. Many documents on the web only contain material which is not useful for text corpora, such as lists of links, lists of products, and other kind of text not comprised of full sentences. In fact such pages represent the vast majority of the web. Therefore, by doing unr...
Most of the post-processing methods for character recognition rely on contextual information of character and word-fragment levels. However, due to linguistic characteristics of Korean, such low-level information alone is not sufficient for high-quality character-recognition applications, and we need much higher-level contextual information to improve the recognition results. This paper present...
This paper describes LIMSI’s submissions to the shared WMT’15 translation task. We report results for French-English, Russian-English in both directions, as well as for Finnish-into-English. Our submissions use NCODE and MOSES along with continuous space translation models in a post-processing step. The main novelties of this year’s participation are the following: for Russian-English, we inves...
This paper describes LIMSI’s submissions to the shared WMT’15 translation task. We report results for French-English, Russian-English in both directions, as well as for Finnish-into-English. Our submissions use NCODE and MOSES along with continuous space translation models in a post-processing step. The main novelties of this year’s participation are the following: for Russian-English, we inves...
In this paper we present preliminary work conducted on semi-automatic induction of inflectional paradigms from non annotated corpora using the open-source tool Linguistica (Goldsmith 2001) that can be utilized without any prior knowledge of the language. The aim is to induce morphology information from corpora such as to compare languages and foresee the difficulty to develop morphosyntactic le...
Skip-Gram word embeddings, estimated from large text corpora, have been shown to improve many NLP tasks through their highquality features. However, little is known about their robustness against parameter perturbations and about their e ciency in preserving word similarities under memory constraints. In this paper, we investigate three post-processing methods for word embeddings to study their...
This paper presents TweetCaT, an open-source Python tool for building Twitter corpora that was designed for smaller languages. Using the Twitter search API and a set of seed terms, the tool identifies users tweeting in the language of interest together with their friends and followers. By running the tool for 235 days we tested it on the task of collecting two monitor corpora, one for Croatian ...
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید