Search results for: linguistic corpus

Number of results: 113027

2009
Sigita Laurinčiukaitė Mark Filipovič Laimutis Telksnys

This paper presents the development of the Lithuanian continuous speech corpus LRN 1 (Lithuanian Radio News, version 1). The corpus was developed from the speech corpus LRN 0.1 by increasing its duration (it lasts 20 hours 50 minutes). The major improvement in LRN 1 was the development of time-aligned word-level annotations of speech signals. Time-aligned word level annotat...

Journal: Prague Bull. Math. Linguistics 2008
Silvie Cinková Eva Hajicová Jarmila Panevová Petr Sgall

This paper compares the two FGD-based annotation scenarios for Czech and for English, with the Czech as the basis. We discuss the secondary predication expressed by infinitive and its functions in Czech and English, respectively. We give a few examples of English constructions that do not have direct counterparts in Czech (e.g., tough movement and causative constructions with make, get, and have...

2016
Roland Bluhm

The aim of this paper is to discuss the potential benefit of corpus analysis, a (partly) empirical method from linguistics, for philosophy. ‘Corpus analysis’ is not only the name of the method, but also a rough description of it, because the method consists in analysing data taken from linguistic text corpora. In linguistics, using such text corpora is an established practice. A fair number of ...

Journal: Informatica (Slovenia) 2015
Tomaz Erjavec Nikola Ljubesic Natasa Logar

The availability of large collections of text (language corpora) is crucial for empirically supported linguistic investigations of various languages; however, such corpora are complicated and expensive to collect. In recent years corpora made from texts on the World Wide Web have become an attractive alternative to traditional corpora, as they can be made automatically, contain varied text type...

2015
Yvonne Tsai

This chapter centers on the nuisance caused by passive voice and attributive clauses in student translations. With the use of learner corpus, calculation, categorization, and annotation functions enable analysis of common linguistic features in student translators. The aim of this study is to correct learners’ under-use, over-use, and misuse of terms and linguistic structures. By incorporating ...

2013
Patrick Hanks

It is a truism that meaning depends on context. Corpus evidence now shows us that normal contexts can be summarised and indeed quantified, while the creative exploitations of normal contexts by ordinary language users far exceed anything dreamed up in speculative linguistic theory. Human linguistic behaviour is indeed rule-governed, but in recent years, corpus analysis (e.g. Hanks 2013) has sho...

2012
Kyle Marek-Spartz Paula Chesley Hannah Sande

Large-scale linguistic corpora, complete with information about speakers’ social networks as well as demographic and temporal information, allow for empirical validation of complex theories about the social interactions and linguistic properties leading to large-scale language change. We present ongoing work on the diffusion of lexical innovations using a corpus we have compiled from the Gmane ...

2010
Eline Westerhout

In this paper a combination of linguistic and structural information is used for the extraction of Dutch definitions. The corpus used is a collection of Dutch texts on computing and elearning containing 603 definitions. The extraction process consists of two steps. In the first step a parser using a grammar defined on the basis of the patterns observed in the definitions is applied on the compl...

2011
Vladislav Kubon Markéta Lopatková

The paper deals with the problem of an analysis of complex sentences in Czech on the basis of manually annotated data. The availability of a specialized corpus explicitly describing mutual relationships between segments and clauses in Czech complex sentences, together with the availability of a thoroughly syntactically annotated corpus, the Prague Dependency Treebank, provide a solid background...

2004
Hans van Halteren

A new technique is introduced, linguistic profiling, in which large numbers of counts of linguistic features are used as a text profile, which can then be compared to average profiles for groups of texts. The technique proves to be quite effective for authorship verification and recognition. The best parameter settings yield a False Accept Rate of 8.1% at a False Reject Rate equal to zero for t...
