MORSE: Semantic-ally Drive-n MORpheme SEgment-er

Authors

  • Tarek Sakakini
  • Suma Bhat
  • Pramod Viswanath
Abstract

In this paper we present a novel framework for morpheme segmentation which uses the morpho-syntactic regularities preserved by word representations, in addition to orthographic features, to segment words into morphemes. This framework is the first to consider vocabulary-wide syntactico-semantic information for this task. We also analyze the deficiencies of available benchmarking datasets and introduce our own dataset that was created on the basis of compositionality. We validate our algorithm across different datasets and languages and present new state-of-the-art results.
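The idea of combining orthographic evidence with regularities in word representations can be illustrated with a toy sketch. This is not the authors' algorithm. It assumes that a real suffix shifts word vectors in a roughly consistent direction, and it accepts a candidate split only when the word's difference vector agrees with those of reference pairs. The vectors, the names `TOY_VECTORS`, `suffix_support`, and `segment`, and the 0.5 threshold are all invented for illustration.

```python
import math

# Toy 2-D "embeddings", invented purely for illustration (not from the paper).
TOY_VECTORS = {
    "play": (1.0, 0.0), "player": (1.0, 1.0),
    "teach": (2.0, 0.1), "teacher": (2.0, 1.1),
    "corn": (0.0, 3.0), "corner": (3.0, -1.0),
}

# Hypothetical word pairs assumed to be related by the suffix "-er".
REFERENCE_PAIRS = [("play", "player"), ("teach", "teacher")]


def diff(base, derived):
    """Difference vector from the base word to the derived word."""
    bx, by = TOY_VECTORS[base]
    dx, dy = TOY_VECTORS[derived]
    return (dx - bx, dy - by)


def cosine(u, v):
    """Cosine similarity of two 2-D vectors (0.0 if either is zero)."""
    norm = math.hypot(*u) * math.hypot(*v)
    return (u[0] * v[0] + u[1] * v[1]) / norm if norm else 0.0


def suffix_support(base, derived, reference_pairs):
    """Mean cosine between this pair's difference vector and the others'."""
    d = diff(base, derived)
    sims = [cosine(d, diff(b, w))
            for b, w in reference_pairs if (b, w) != (base, derived)]
    return sum(sims) / len(sims) if sims else 0.0


def segment(word, suffix, reference_pairs, threshold=0.5):
    """Split off `suffix` only if both orthography and semantics agree."""
    if not word.endswith(suffix):
        return [word]                      # orthographic check fails
    base = word[:-len(suffix)]
    if base not in TOY_VECTORS or word not in TOY_VECTORS:
        return [word]                      # no vectors to compare
    if suffix_support(base, word, reference_pairs) >= threshold:
        return [base, suffix]              # semantic check passes
    return [word]
```

With these toy vectors, `segment("teacher", "er", REFERENCE_PAIRS)` splits the word, while `segment("corner", "er", REFERENCE_PAIRS)` leaves it whole: "corn" plus "er" is orthographically plausible, but the difference vector does not match the other "-er" pairs.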


Similar resources

Utilizing prosody for unconstrained morpheme recognition

Speech recognition systems for languages with a rich inflectional morphology (like German) suffer from the limitations of a word-based full-form lexicon. Although the morphological and acoustical knowledge about words is coded implicitly within the lexicon entries (which are usually closely related to the orthography of the language at hand) this knowledge is usually not explicitly available for ...


Zero Morphemes in Unification-Based Combinatory Categorial Grammar

In this paper, we report on our use of zero morphemes in Unification-Based Combinatory Categorial Grammar. After illustrating the benefits of this approach with several examples, we describe the algorithm for compiling zero morphemes into unary rules, which allows us to use zero morphemes more efficiently in natural language processing. Then, we discuss the question of equivalence of a gramma...


Unsupervised Morphological Segmentation Based on Segment Predictability and Word Segments Alignment

Word segments are relevant cues for the automatic acquisition of semantic relationships from morphologically related words. Indeed, morphemes are the smallest meaning-bearing units. We present an unsupervised method for the segmentation of words into sub-units devised for this objective. The system relies on segment predictability to discover a set of prefixes and suffixes and performs word seg...


Constituency, Implicit Arguments, and Scope in the Syntax-semantics of Degree Constructions

We propose an adjunction-based analysis of comparative (and similar) constructions that captures the morphological and semantic relation between comparative heads (e.g., -er, as) and comparative clauses (e.g., those headed by than or as). Primarily motivating this proposal are the syntactic similarities that we observe between comparative constructions and other adjunction structures in grammar...


Identifying Different Meanings of a Chinese Morpheme through Latent Semantic Analysis and Minimum Spanning Tree Analysis

A character corresponds roughly to a morpheme in Chinese, and it usually takes on multiple meanings. In this paper, we aimed at capturing the multiple meanings of a Chinese morpheme across polymorphemic words in a growing semantic micro-space. Using Latent Semantic Analysis (LSA), we created several nested LSA semantic micro-spaces of increasing size. The term-document matrix of the smallest se...




Publication date: 2017