Search results for: corpora creation
Number of results: 147847
The Old German and the Old Lithuanian Reference Corpus are two deeply-annotated corpora of Old German and Old Lithuanian that are created by enriching the digitized texts with additional data. To reduce conceptual effort and to establish harmonized structures, a coordinated approach was chosen. However, large differences in the availability of resources for annotation, but also in the suitabili...
In this paper we present the procedure for creating a new speech corpus for the Bonn Open Synthesis System (BOSS). BOSS has several advantages which make this procedure particularly straightforward and fast. BOSS is open source, allowing flexible use of components and corpora. It shows a clear separation between data and architecture, which means that a change in corpus does not require a chang...
The essential characteristic of a signed language corpus is that it has been annotated, and not, contrary to the practice of many signed language researchers, that it has been transcribed. Annotations are necessary for corpus-based investigations of signed or spoken languages. Multi-media annotation software can now be used to transform a recording into a machine-readable text without it first ...
For increased speed in developing gigaword language resources for medium resource density languages, we integrated several FOSS tools in the HUN* toolkit. While the speed and efficiency of the resulting pipeline have surpassed our expectations, our experience in developing LDC-style resource packages for Uzbek and Kurdish makes clear that neither the data collection nor the subsequent processing ...
The Linguistic Data Consortium (LDC) is a non-profit consortium of universities, companies and government research laboratories that supports education, research and technology development in language-related disciplines by collecting or creating, distributing and archiving language resources, including data and accompanying tools, standards and formats. LDC was founded in 1992 with a grant from...