Limits to pattern-matching, finite-state techniques on lexical analysis for multiword expressions: the case for compound adverbs in Spanish

Authors

  • Dolors Català
  • Jorge Baptista
Abstract

Pattern-matching finite-state techniques applied to the lexical analysis of multiword expressions have received a significant boost from the possibility of intersecting lexical information encoded in lexicon-grammar matrices with reference graphs. However, these methods show important limitations in real-life applications. This paper assesses and describes the main drawbacks of this technique when applied to Spanish compound adverbs.

Similar resources

A Lexical Interface for Finite-State Syntax

This document describes the lexical interface for finite-state syntax as it is currently implemented and used for the development of the French constraint grammar. The system includes finite-state transducers for multiword expressions, for capitalised, misspelt or unknown words and for accent recovery. It also encodes general multiword expressions such as dates or idioms. The tokeniser includes a n...

Spanish Adverbial Frozen Expressions

This paper presents an electronic dictionary of Spanish adverbial frozen expressions. It focuses on their formal description in view of natural language processing and presents an experiment on the automatic application of this data to real texts using finite-state techniques. The paper makes an assessment of the advantages and limitations of this method for the identification of these multiwor...

A Non-deterministic Tokeniser for Finite-State Parsing

This paper describes a non-deterministic tokeniser implemented and used for the development of a French finite-state grammar. The tokeniser includes a finite-state automaton for simple tokens and a lexical transducer that encodes a wide variety of multiword expressions, associated with multiple lexical descriptions when required.

Managing Multiword Expressions in a Lexicon-Based Sentiment Analysis System for Spanish

This paper describes our approach to managing multiword expressions in Sentitext, a linguistically-motivated, lexicon-based Sentiment Analysis (SA) system for Spanish whose performance is largely determined by its coverage of MWEs. We defend the view that multiword constructions play a fundamental role in lexical Sentiment Analysis, in at least three ways. First, a significant proportion convey...

Compound Temporal Adverbs in Portuguese and in Spanish

This paper reports on ongoing research on temporal adverbs and deals with the problem of processing a family of Portuguese and Spanish compound temporal adverbs, in a contrastive approach, aiming at building finite-state transducers to translate them from one language into the other. Because of the large number of combinations involved and their complexity, it is not easy to list them in ful...

Publication date: 2007