Report on the lexical representation of subclasses of MWEs
Abstract
This report focuses on various subtypes of MultiWord Expressions (MWEs) and the way they are handled in the MWE lexicon for Dutch. The lexicon is developed as part of the STEVIN IRME project, which aims at creating an electronic resource of 5,000 Dutch expressions that is highly theory- and implementation-independent and that can be used in various Dutch NLP systems. The description of an MWE consists of a list of properties, including a pattern name that refers to the description of an MWE pattern. This document addresses only the description fields that are relevant to the discussion of classes of MWEs. For an elaborate overview of the encoding guidelines for an MWE description and an MWE pattern description, I refer to the Encoding Protocol (Grégoire, 2007). Section 2 discusses subclasses and their representation as described in related work, and section 3 gives an overview of the subclasses of MWEs in our lexicon. The report ends with a conclusion in section 4.
Similar resources
The Effect of Semantic Transfer on Iranian EFL Learners’ Lexical Representation and Processing
Multiword Expressions in NLP: General Survey and a Special Case of Verb-Noun Constructions
This chapter presents a survey of contemporary NLP research on Multiword Expressions (MWEs). MWEs pose a huge problem to precise language processing due to their idiosyncratic nature and diversity of their semantic, lexical, and syntactical properties. The chapter begins by considering MWEs definitions, describes some MWEs classes, indicates problems MWEs generate in language applications and t...
A Transition-Based System for Joint Lexical and Syntactic Analysis
We present a transition-based system that jointly predicts the syntactic structure and lexical units of a sentence by building two structures over the input words: a syntactic dependency tree and a forest of lexical units including multiword expressions (MWEs). This combined representation allows us to capture both the syntactic and semantic structure of MWEs, which in turn enables deeper downs...
Lexical idiosyncrasy in MWE extraction
A wide scale of different NLP methods have been investigated for the extraction of Multiword Expressions from large corpora. While a good deal of recent research has been focusing on the development of reliable means to delineate different subclasses of MWEs with respect to the degree of their compositionality (Baldwin et al., 2003; McCarthy et al., 2003), it has been generally accepted that fo...
Design and Implementation of a Lexicon of Dutch Multiword Expressions
This paper describes the design and implementation of a lexicon of Dutch multiword expressions (MWEs). No exhaustive research on a standard lexical representation of MWEs has been done for Dutch before. The approach taken is innovative, since it is based on the Equivalence Class Method. Furthermore, the selection of the lexical entries and their properties is corpus-based. The design of the lex...
Publication date: 2007