MORSE: Semantic-ally Drive-n MORpheme SEgment-er
Abstract
In this paper we present a novel framework for morpheme segmentation which uses the morpho-syntactic regularities preserved by word representations, in addition to orthographic features, to segment words into morphemes. This framework is the first to consider vocabulary-wide syntactico-semantic information for this task. We also analyze the deficiencies of available benchmarking datasets and introduce our own dataset that was created on the basis of compositionality. We validate our algorithm across different datasets and languages and present new state-of-the-art results.
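The idea of combining orthographic evidence with regularities in word representations can be pictured with a toy sketch. This is not the paper's algorithm: the vectors in `VECS`, the helper names `suffix_support` and `segment`, and the 0.5 threshold are all invented for illustration. A suffix split is accepted only when the word ends in that suffix (the orthographic cue) and the word-minus-stem vector offset points in roughly the same direction as the offsets of known pairs (the semantic cue).

```python
import math

# Toy 3-d "embeddings", invented purely for illustration (not real data).
VECS = {
    "play":    [1.0, 0.0, 0.0],
    "player":  [1.0, 1.0, 0.0],
    "teach":   [0.0, 0.0, 1.0],
    "teacher": [0.0, 1.0, 1.0],
    "corn":    [0.5, 0.5, 0.0],
    "corner":  [0.0, -1.0, 0.5],
}

# Hypothetical reference pairs (word, stem) assumed to share the "-er" rule.
REFERENCE_PAIRS = [("player", "play")]


def diff(a, b):
    """Vector offset VECS[a] - VECS[b]."""
    return [x - y for x, y in zip(VECS[a], VECS[b])]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def suffix_support(word, suffix, reference_pairs):
    """Mean cosine between (word - stem) and each reference pair's offset."""
    if not suffix or not word.endswith(suffix):
        return 0.0  # orthographic cue missing
    stem = word[: -len(suffix)]
    if word not in VECS or stem not in VECS:
        return 0.0  # no vectors, so no semantic evidence
    offset = diff(word, stem)
    sims = [cosine(offset, diff(w, s)) for w, s in reference_pairs]
    return sum(sims) / len(sims)


def segment(word, suffix, reference_pairs, threshold=0.5):
    """Split off the suffix only if the semantic support reaches the threshold."""
    if suffix_support(word, suffix, reference_pairs) >= threshold:
        return [word[: -len(suffix)], suffix]
    return [word]


print(segment("teacher", "er", REFERENCE_PAIRS))
print(segment("corner", "er", REFERENCE_PAIRS))
```

With these toy vectors, "teacher" is split because its offset from "teach" matches the "player"/"play" offset, while "corner" is left whole even though it ends in "er", since its offset from "corn" points elsewhere.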
Similar resources
Utilizing prosody for unconstrained morpheme recognition
Speech recognition systems for languages with a rich inflectional morphology (like German) suffer from the limitations of a word-based full-form lexicon. Although the morphological and acoustical knowledge about words is coded implicitly within the lexicon entries (which are usually closely related to the orthography of the language at hand), this knowledge is usually not explicitly available for ...
Zero Morphemes in Unification-Based Combinatory Categorial Grammar
In this paper, we report on our use of zero morphemes in Unification-Based Combinatory Categorial Grammar. After illustrating the benefits of this approach with several examples, we describe the algorithm for compiling zero morphemes into unary rules, which allows us to use zero morphemes more efficiently in natural language processing. Then, we discuss the question of equivalence of a gramma...
Unsupervised Morphological Segmentation Based on Segment Predictability and Word Segments Alignment
Word segments are relevant cues for the automatic acquisition of semantic relationships from morphologically related words. Indeed, morphemes are the smallest meaning-bearing units. We present an unsupervised method for the segmentation of words into sub-units devised for this objective. The system relies on segment predictability to discover a set of prefixes and suffixes and performs word seg...
Constituency, Implicit Arguments, and Scope in the Syntax-semantics of Degree Constructions
We propose an adjunction-based analysis of comparative (and similar) constructions that captures the morphological and semantic relation between comparative heads (e.g., -er, as) and comparative clauses (e.g., those headed by than or as). Primarily motivating this proposal are the syntactic similarities that we observe between comparative constructions and other adjunction structures in grammar...
Identifying Different Meanings of a Chinese Morpheme through Latent Semantic Analysis and Minimum Spanning Tree Analysis
A character corresponds roughly to a morpheme in Chinese, and it usually takes on multiple meanings. In this paper, we aimed at capturing the multiple meanings of a Chinese morpheme across polymorphemic words in a growing semantic micro-space. Using Latent Semantic Analysis (LSA), we created several nested LSA semantic micro-spaces of increasing size. The term-document matrix of the smallest se...
Publication date: 2017