Search results for: corpora creation

Number of results: 147847

2013

The Old German and the Old Lithuanian Reference Corpus are two deeply-annotated corpora of Old German and Old Lithuanian that are created by enriching the digitized texts with additional data. To reduce conceptual effort and to establish harmonized structures, a coordinated approach was chosen. However, large differences in the availability of resources for annotation, but also in the suitabili...

2001
Esther Klabbers, Karlheinz Stöber

In this paper we present the procedure for creating a new speech corpus for the Bonn Open Synthesis System (BOSS). BOSS has several advantages which make this procedure particularly straightforward and fast. BOSS is open source, allowing flexible use of components and corpora. It shows a clear separation between data and architecture, which means that a change in corpus does not require a chang...

2008
Trevor Johnston

The essential characteristic of a signed language corpus is that it has been annotated, and not, contrary to the practice of many signed language researchers, that it has been transcribed. Annotations are necessary for corpus-based investigations of signed or spoken languages. Multi-media annotation software can now be used to transform a recording into a machine-readable text without it first ...

2008
Péter Halácsy, András Kornai, Péter Németh, Dániel Varga

For increased speed in developing gigaword language resources for medium resource density languages, we integrated several FOSS tools in the HUN* toolkit. While the speed and efficiency of the resulting pipeline have surpassed our expectations, our experience in developing LDC-style resource packages for Uzbek and Kurdish makes clear that neither the data collection nor the subsequent processing ...

Journal: Revista Brasileira de Linguística Aplicada 2011

Journal: Archiv für Pathologische Anatomie und Physiologie und für Klinische Medicin 1854

2000
Christopher Cieri, Mark Liberman

The Linguistic Data Consortium (LDC) is a non-profit consortium of universities, companies and government research laboratories that supports education, research and technology development in language-related disciplines by collecting or creating, distributing and archiving language resources including data and accompanying tools, standards and formats. LDC was founded in 1992 with a grant from...
