Alignment-based extraction of multiword expressions

Authors

  • Helena de Medeiros Caseli
  • Carlos Ramisch
  • Maria das Graças Volpe Nunes
  • Aline Villavicencio
Abstract

Due to idiosyncrasies in their syntax, semantics or frequency, Multiword Expressions (MWEs) have received special attention from the NLP community, as the methods and techniques developed for the treatment of simplex words are not necessarily suitable for them. This is certainly the case for the automatic acquisition of MWEs from corpora. A lot of effort has been directed to the task of automatically identifying them, with considerable success. In this paper, we propose an approach for the identification of MWEs in a multilingual context, as a by-product of a word alignment process, that not only deals with the identification of possible MWE candidates, but also associates some multiword expressions with semantics. The results obtained indicate the feasibility and low costs in terms of tools and resources demanded by this approach, which could, for example, facilitate and speed up lexicographic work.
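
The abstract does not spell out how candidates are read off the alignment, so the following is only a minimal sketch under stated assumptions. It assumes the word aligner outputs (source index, target index) pairs, and treats any contiguous run of two or more source words aligned to a single target word as an MWE candidate. The function name `mwe_candidates`, the contiguity filter, and the toy sentence pair are hypothetical illustrations, not the authors' implementation.

```python
from collections import defaultdict


def mwe_candidates(src_tokens, tgt_tokens, alignment):
    """Return (source n-gram, target word) pairs for n:1 alignments, n >= 2.

    `alignment` is a list of (source_index, target_index) pairs, an assumed
    format; real aligners differ in how they serialise their output.
    """
    # Group the source positions that are linked to each target position.
    by_target = defaultdict(list)
    for s, t in alignment:
        by_target[t].append(s)

    candidates = []
    for t, src_idxs in sorted(by_target.items()):
        src_idxs = sorted(set(src_idxs))
        # Keep only contiguous source spans (an assumption of this sketch).
        contiguous = src_idxs[-1] - src_idxs[0] + 1 == len(src_idxs)
        if len(src_idxs) >= 2 and contiguous:
            ngram = " ".join(src_tokens[i] for i in src_idxs)
            # The aligned target word serves as a rough hint at the meaning.
            candidates.append((ngram, tgt_tokens[t]))
    return candidates


# Toy example (invented for illustration, not taken from the paper).
src = "he kicked the bucket yesterday".split()
tgt = "ele morreu ontem".split()
alignment = [(0, 0), (1, 1), (2, 1), (3, 1), (4, 2)]
print(mwe_candidates(src, tgt, alignment))
```

In this sketch the single target word paired with each candidate plays the role of a crude semantic gloss, which is one possible reading of how an alignment can associate an expression with meaning.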

Similar resources

Integration of Reduplicated Multiword Expressions and Named Entities in a Phrase Based Statistical Machine Translation System

The language specific Multiword expressions (MWEs) play important roles in many natural language processing (NLP) tasks. Integrating reduplicated multiword expressions (RMWEs) into the Phrase Based Statistical Machine Translation (PBSMT) to improve translation quality is reported in the present work between Manipuri, a highly agglutinative Tibeto-Burman language and English. In addition, Multiw...

Using Alignment to detect associated multiword expressions in bilingual corpora (Johan Segura and Violaine Prince)

Translating multiword expressions from one language to another requires recognizing them as such. Bilingual multiword expressions are an issue when they are not exact word-to-word translations of each other. The following examples are provided for a French-English translation task: (1) Phrasal verbs such as « to call in on » becoming « rendre visite », (2) « sorry to hear that », that a human tra...

MULTILINGUAL MULTIWORD EXPRESSIONS Literature Survey

Multiword Expressions are idiosyncratic word usages of a language which often have non-compositional meaning. Knowledge of multiword expressions is necessary for many NLP tasks, such as machine translation, natural language generation, named entity recognition, and sentiment analysis. In order for other NLP applications to benefit from the knowledge of multiword expressions, they need to be ide...

A System for Compound Noun Multiword Expression Extraction for Hindi

Compound noun multiword expressions are important for many NLP applications like machine translation and information retrieval. This paper describes a system for Hindi compound noun multiword expressions (MWE) extraction from a given corpus. We identify major categories of compound noun MWEs, based on linguistic and psycholinguistic principles. Our extraction methods use various statistical co-...

A Supervised Model for Extraction of Multiword Expressions, Based on Statistical Context Features

We present a method for extracting Multiword Expressions (MWEs) based on the immediate context they occur in, using a supervised model. We show that some of these contextual features can be very discriminative, and that combining them with MWE-specific features results in a relatively accurate extraction. We define context as a sequential structure and not a bag of words; consequently, it becomes much more i...

Journal:
  • Language Resources and Evaluation

Volume 44

Publication date: 2010