Search results for: linguistic corpus

Number of results: 113027

Journal: CoRR 2013
Mona T. Diab Nizar Habash Owen Rambow Ryan Roth

2016
Stephanie Strassel Jennifer Tracey

In this paper, we describe the textual linguistic resources in nearly 3 dozen languages being produced by Linguistic Data Consortium for DARPA’s LORELEI (Low Resource Languages for Emergent Incidents) Program. The goal of LORELEI is to improve the performance of human language technologies for low-resource languages and enable rapid re-training of such technologies for new languages, with a foc...

2003
Stephanie Strassel David Miller Kevin Walker Christopher Cieri

This paper describes ongoing efforts at Linguistic Data Consortium to create shared resources for improved speech-to-text technology. Under the DARPA EARS program, technology providers are charged with creating STT systems whose outputs are substantially richer and much more accurate than is currently possible. These aggressive program goals motivate new approaches to corpus creation and distrib...

2011
Voula Gotsoulia Bessie Dendrinos

This paper discusses linguistic annotation issues, essential to a corpus-based approach to modelling the language use of foreign language learners in various contexts. We focus on learners of English and describe the corpora we use as well as the linguistic approach underlying their development. We present a scheme for describing grammatical choices and meaning components expressed in texts pro...

2011
Jens Allwood A. P. Hendrikse Elisabeth Ahlsén

The paper deals with words and possible alternatives to words as basic units in linguistic theory, especially in interlinguistic comparison and corpus linguistics. A number of ways of defining the word are discussed and related to the analysis of linguistic corpora and to interlinguistic comparisons between corpora of spoken interaction. Problems associated with words as the basic units and alte...

Thesis: Ministry of Science, Research and Technology - Razi University - Research Institute of Literature 1389

The present study reports an analysis of response articles in four different disciplines in the social sciences, i.e., linguistics, English for Specific Purposes (ESP), accounting, and psychology. The study has three phases: micro analysis, macro analysis, and e-mail interview. The results of the micro analysis indicate that a three-level linguistic pattern is used by the writers in order to cr...

2014
Csaba Oravecz Tamás Váradi Bálint Sass

The paper reports on the development of the Hungarian Gigaword Corpus, an extended new edition of the Hungarian National Corpus, with upgraded and redesigned linguistic annotation and an increased size of 1.5 billion tokens. Issues concerning the standard steps of corpus collection and preparation are discussed with special emphasis on linguistic analysis and annotation due to Hungarian having ...

Journal: IOP Conference Series: Earth and Environmental Science 2021

2014
Benjamin Kolz Toni Badia Roser Saurí

This paper describes the automatic process of building a dependency-annotated corpus based on Ancora constituent structures. The Ancora corpus already has a dependency structure information layer, but the new annotated data applies a purely syntactic orientation and thus offers a new resource to the linguistic research community. The paper details the process of reannotating the corpus, ...
