linguistic corpus

Search results for: linguistic corpus

Number of results: 113027

LDC Arabic Treebanks and Associated Corpora: Data Divisions Manual

Journal: CoRR 2013

Mona T. Diab, Nizar Habash, Owen Rambow, Ryan Roth

LORELEI Language Packs: Data, Tools, and Resources for Technology Development in Low Resource Languages

2016

Stephanie Strassel, Jennifer Tracey

In this paper, we describe the textual linguistic resources in nearly 3 dozen languages being produced by Linguistic Data Consortium for DARPA’s LORELEI (Low Resource Languages for Emergent Incidents) Program. The goal of LORELEI is to improve the performance of human language technologies for low-resource languages and enable rapid re-training of such technologies for new languages, with a foc...

Shared resources for robust speech-to-text technology

2003

Stephanie Strassel, David Miller, Kevin Walker, Christopher Cieri

This paper describes ongoing efforts at Linguistic Data Consortium to create shared resources for improved speech-to-text technology. Under the DARPA EARS program, technology providers are charged with creating STT systems whose outputs are substantially richer and much more accurate than is currently possible. These aggressive program goals motivate new approaches to corpus creation and distrib...

Towards a Corpus-based Approach to Modelling Language Production of Foreign Language Learners in Communicative Contexts

2011

Voula Gotsoulia, Bessie Dendrinos

This paper discusses linguistic annotation issues, essential to a corpus-based approach to modelling the language use of foreign language learners in various contexts. We focus on learners of English and describe the corpora we use as well as the linguistic approach underlying their development. We present a scheme for describing grammatical choices and meaning components expressed in texts pro...

Words and alternative basic units for linguistic analysis

2011

Jens Allwood, A. P. Hendrikse, Elisabeth Ahlsén

The paper deals with words and possible alternatives to words as basic units in linguistic theory, especially in interlinguistic comparison and corpus linguistics. A number of ways of defining the word are discussed and related to the analysis of linguistic corpora and to interlinguistic comparisons between corpora of spoken interaction. Problems associated with words as the basic units and alte...

Response Articles: Micro and Macro Analysis

Thesis: Ministry of Science, Research and Technology - Razi University - Research Institute of Literature 1389

سیده مطهره محمدزاده درزی, مصطفی حسرتی, عامر قیطوری

The present study reports an analysis of response articles in four different disciplines in the social sciences, i.e., linguistics, English for Specific Purposes (ESP), accounting, and psychology. The study has three phases: micro analysis, macro analysis, and e-mail interview. The results of the micro analysis indicate that a three-level linguistic pattern is used by the writers in order to cr...

The Hungarian Gigaword Corpus

2014

Csaba Oravecz, Tamás Váradi, Bálint Sass

The paper reports on the development of the Hungarian Gigaword Corpus, an extended new edition of the Hungarian National Corpus, with upgraded and redesigned linguistic annotation and an increased size of 1.5 billion tokens. Issues concerning the standard steps of corpus collection and preparation are discussed with special emphasis on linguistic analysis and annotation due to Hungarian having ...

Utilization of corpus for sustainability of linguistic research

Journal: IOP Conference Series: Earth and Environmental Science 2021

Linguistic Complexity: English vs. Polish, Text vs. Corpus

Journal: Acta Physica Polonica A 2010

From constituents to syntax-oriented dependencies / De constituyentes a dependencias de base sintáctica

2014

Benjamin Kolz, Toni Badia, Roser Saurí

This paper describes the automatic process of building a dependency annotated corpus based on Ancora constituent structures. The Ancora corpus already has a dependency structure information layer, but the new annotated data applies a purely syntactic orientation and offers in this way a new resource to the linguistic research community. The paper details the process of reannotating the corpus, ...
