From IGT to precision grammar: French verbal morphology
Authors
Abstract
Interlinear glossed text (IGT, the familiar three-line format of linguistic examples) can be an extremely rich source of linguistic information when linguists follow best practices in creating it (e.g., the Leipzig glossing rules, Comrie et al. 2003). The ODIN project (http://www.csufresno.edu/odin; Lewis 2006) recognized the value of IGT as a reusable data type and has created a searchable IGT database. This paper reports early efforts in a project that combines aggregations of IGT with a second source of linguistic knowledge, the LinGO Grammar Matrix customization system (Bender et al. 2010), to automatically produce implemented formal grammars. The Grammar Matrix is a multilingual grammar engineering project that includes a cross-linguistic core HPSG (Pollard and Sag 1994) grammar and a set of analyses for cross-linguistically variable phenomena, which can be selected via a web-based questionnaire. As an initial pilot study, we focus on verb morphology (including morphotactics and the morphosyntactic effects of affixes) and begin with a best-case scenario: for our IGT, we use the complete paradigm for the French verb faire (‘to do/make’) provided by Olivier Bonami (pc), including 15,658 phonologically transcribed, morphologically segmented and glossed verb forms.
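To make the input data concrete, the following is a minimal sketch of how a morphologically segmented and glossed form might be aligned morpheme by morpheme, and how glosses could be grouped by suffix position as a crude first look at morphotactics. It is not the method of the paper: the names `Morpheme`, `align_igt`, and `suffix_slots` are hypothetical, the example forms are illustrative orthographic segmentations rather than items from the transcribed paradigm, and the sketch assumes that hyphens separate morphemes (as in the Leipzig glossing rules) and that the stem comes first.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Morpheme:
    """One segment of a word paired with its gloss (hypothetical type)."""
    form: str
    gloss: str


def align_igt(segmented: str, glossed: str) -> list[Morpheme]:
    """Pair each hyphen-separated segment with the gloss in the same position."""
    forms = segmented.split("-")
    glosses = glossed.split("-")
    if len(forms) != len(glosses):
        raise ValueError(
            f"segment/gloss mismatch: {len(forms)} segments, {len(glosses)} glosses"
        )
    return [Morpheme(f, g) for f, g in zip(forms, glosses)]


def suffix_slots(items: list[tuple[str, str]]) -> dict[int, set[str]]:
    """Group suffix glosses by linear position after the stem.

    Assumption: the first segment is the stem and all others are suffixes.
    Counting positions this way is deliberately naive; it does not decide
    whether two positions belong to the same position class.
    """
    slots: dict[int, set[str]] = {}
    for segmented, glossed in items:
        morphemes = align_igt(segmented, glossed)
        for position, morpheme in enumerate(morphemes[1:], start=1):
            slots.setdefault(position, set()).add(morpheme.gloss)
    return slots


# Illustrative forms only (not taken from the paradigm described above).
examples = [("fais-ons", "do-1PL"), ("fais-i-ons", "do-IPFV-1PL")]
slots = suffix_slots(examples)
```

In this toy input, the gloss `1PL` turns up at position 1 in one form and at position 2 in the other, which is exactly the kind of ambiguity a morphotactic analysis has to resolve before position classes can be specified in a grammar.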
Similar resources
A finite-state morphological analyzer for a Lakota precision grammar
This paper reports on the design and implementation of a morphophonological analyzer for Lakota, a member of the Siouan language family. The initial motivation for this work was to support development of a precision implemented grammar for Lakota on the basis of the LinGO Grammar Matrix. A finite-state transducer (FST) was developed to adapt Lakota’s complex verbal morphology into a form direct...
Towards Creating Precision Grammars from Interlinear Glossed Text: Inferring Large-Scale Typological Properties
We propose to bring together two kinds of linguistic resources—interlinear glossed text (IGT) and a language-independent precision grammar resource—to automatically create precision grammars in the context of language documentation. This paper takes the first steps in that direction by extracting major-constituent word order and case system properties from IGT for a diverse sample of languages.
Exploring Persian Commercials Based on the Halliday’s Systemic-Functional Grammar
Advertisement has long been used as a tool for informing and attracting audiences in different ways. This study aims at investigating the linguistic tools of advertisement in Persian on the basis of Halliday’s systemic-functional grammar theory. The data of this study were gathered from written and verbal commercial advertisements which were recorded and rewritten in order to investigate verbal...
Joint Dependency Parsing and Multiword Expression Tokenization
Complex conjunctions and determiners are often considered as pretokenized units in parsing. This is not always realistic, since they can be ambiguous. We propose a model for joint dependency parsing and multiword expressions identification, in which complex function words are represented as individual tokens linked with morphological dependencies. Our graphbased parser includes standard secondo...
Inducing grammar from IGT
We suggest a strategy for incremental construction of deep parsing grammars from Interlinear Glossed Text (IGT). IGT is a format of representation where standard linguistics and NLP in principle meet, since they are a data-type which is often available for digitally ‘less resourced languages’ (‘LRL’). The IGT database is TypeCraft (Beermann and Mihaylov 2009, www.typecraft.org), and the grammar...
Publication date: 2012