Search results for: text domain

Number of results: 558891

2004
Natalia V. Loukachevitch, Boris V. Dobrov

In the paper we describe the development, means of evaluation, and applications of the Russian–English Sociopolitical Thesaurus, specially developed as a linguistic resource for automatic text processing applications. The sociopolitical domain is not a domain of social research but a broad domain of social relations, including economic, political, military, cultural, sports, and other subdomains. The knowl...

2014
Tatiana Tommasi, Tinne Tuytelaars

Despite the increasing interest towards domain adaptation and transfer learning techniques to generalize over image collections and overcome their biases, the visual community misses a large scale testbed for cross-dataset analysis. In this paper we discuss the challenges faced when aligning twelve existing image databases in a unique corpus, and we propose two cross-dataset setups that introdu...

2016
Khalid Al Khatib, Henning Wachsmuth, Johannes Kiesel, Matthias Hagen, Benno Stein

Many argumentative texts, and news editorials in particular, follow a specific strategy to persuade their readers of some opinion or attitude. This includes decisions such as when to tell an anecdote or where to support an assumption with statistics, which is reflected by the composition of different types of argumentative discourse units in a text. While several argument mining corpora have re...

Journal: Inf. Process. Manage. 2004
Akira Terada, Takenobu Tokunaga, Hozumi Tanaka

Unknown words such as proper nouns, abbreviations, and acronyms are a major obstacle in text processing. Abbreviations, in particular, are difficult to read/process because they are often domain-specific. In this paper, we propose a method for automatic expansion of abbreviations by using context and character information. In previous studies dictionaries were used to search for abbreviation ex...

2007
Filip Deprez, Jan Odijk, Jan De Moortel

This tutorial paper addresses foreign-language support in corpus-based concatenative text-to-speech systems. We give an overview of application domains where strictly monolingual speech synthesis is not sufficient and where multilingual text-to-speech is required or highly desirable. We describe two approaches to multilingual corpus-based speech synthesis: phoneme mapping on the one hand, and t...

2006
Matthew Lease, Mark Johnson

This paper evaluates the benefit of deleting fillers (e.g. you know, like) early in parsing conversational speech. Readability studies have shown that disfluencies (fillers and speech repairs) may be deleted from transcripts without compromising meaning (Jones et al., 2003), and deleting repairs prior to parsing has been shown to improve its accuracy (Charniak and Johnson, 2001). We explore whe...

2016
Douglas E. Sturim, Pedro A. Torres-Carrasquillo, Joseph P. Campbell

The goal of this paper is to describe significant corpora available to support speaker recognition research and evaluation, along with details about the corpora collection and design. We describe the attributes of high-quality speaker recognition corpora. Considerations of the application, domain, and performance metrics are also discussed. Additionally, a literature survey of corpora used in s...

Journal: International Journal of Mathematics 2022

We obtain a quantitative estimate of the Bergman distance when [Formula: see text] is a bounded domain with log-hyperconvexity index [Formula: see text], as well as the [Formula: see text]-integrability of the kernel [Formula: see text].

Journal: CoRR 2017
Bin Bi, Hao Ma

This paper proposes a novel neural machine reading model for open-domain question answering at scale. Existing machine comprehension models typically assume that a short piece of relevant text containing answers is already identified and given to the models, from which the models are designed to extract answers. This assumption, however, is not realistic for building a large-scale open-domain q...

2014
Andreas Holzinger, Johannes Schantl, Miriam Schroettner, Christin Seifert, Karin M. Verspoor

Text is a very important type of data within the biomedical domain. For example, patient records contain large amounts of text which has been entered in a non-standardized format, consequently posing a lot of challenges to processing of such data. For the clinical doctor the written text in the medical findings is still the basis for decision making – neither images nor multimedia data. However...
