A Corpus-based Conceptual Clustering Method for Verb Frames and Ontology Acquisition
Abstract
In this paper we describe ASIUM, a machine learning system that learns verb subcategorization frames and ontologies from the syntactic parsing of technical natural-language texts. The selection restrictions in the subcategorization frames are filled by concepts from the ontology. Applications that require subcategorization frames and ontologies are numerous and important. The most direct are semantic checking of texts and improved syntactic parsing, along with text generation and translation. ASIUM's inputs come from the syntactic parsing of texts: subcategorization examples and basic clusters formed by head words that occur with the same verb after the same preposition (or in the same syntactic role). ASIUM successively aggregates the clusters to form new concepts, organized as a generality graph that represents the ontology of the domain. Subcategorization frames are learned in parallel, so that as concepts are formed they fill the selection restrictions in the frames. The ASIUM method is based on conceptual clustering. First experiments were performed on a corpus of cooking recipes and give very promising results, which are reported here.
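The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the ASIUM implementation: the abstract does not give the similarity measure or the stopping criterion, so the Jaccard overlap, the `threshold` value, the function names, and the toy cooking examples below are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical parser output: (verb, preposition or syntactic role, head word).
examples = [
    ("cook", "object", "pasta"), ("cook", "object", "rice"),
    ("boil", "object", "rice"), ("boil", "object", "pasta"),
    ("boil", "object", "water"),
    ("pour", "in", "pan"), ("cook", "in", "pan"), ("cook", "in", "pot"),
]

def basic_clusters(examples):
    """Group head words that occur with the same verb in the same slot."""
    clusters = defaultdict(set)
    for verb, slot, head in examples:
        clusters[(verb, slot)].add(head)
    return dict(clusters)

def jaccard(a, b):
    """Assumed similarity measure: overlap between two sets of head words."""
    return len(a & b) / len(a | b)

def aggregate(clusters, threshold=0.5):
    """Greedily merge the most similar pair of concepts until none qualifies."""
    # Each concept is (head words, verb slots whose restriction it fills).
    concepts = [(frozenset(h), frozenset([s])) for s, h in clusters.items()]
    edges = []  # (parent heads, child heads) links of the generality graph
    while True:
        pairs = [(jaccard(a[0], b[0]), a, b)
                 for a, b in combinations(concepts, 2)]
        pairs = [p for p in pairs if p[0] >= threshold]
        if not pairs:
            return concepts, edges
        _, a, b = max(pairs, key=lambda p: p[0])
        merged = (a[0] | b[0], a[1] | b[1])
        # Skip a link when the child already equals the merged concept.
        edges += [(merged[0], c[0]) for c in (a, b) if c[0] != merged[0]]
        concepts = [c for c in concepts if c is not a and c is not b]
        concepts.append(merged)

def frames(concepts):
    """Fill each verb slot with the concept that covers it."""
    out = defaultdict(dict)
    for heads, slots in concepts:
        for verb, slot in slots:
            out[verb][slot] = set(heads)
    return dict(out)

concepts, edges = aggregate(basic_clusters(examples))
learned = frames(concepts)
```

On this toy data the object slots of "cook" and "boil" end up sharing one concept, and the "in" slots of "pour" and "cook" share another, which mirrors the idea that frames are filled as concepts are formed. A real system would need a more careful similarity measure and some validation of each merge.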
Similar resources
Acquisition of Semantic Knowledge using Machine learning methods: The System
We describe in this paper the ML system ASIUM which acquires semantic knowledge from parsed technical texts. ASIUM is devoted to the acquisition of case frames and ontologies. Applications requiring case frames and ontologies are numerous. The Dassault Aviation company we are collaborating with is mainly interested in controlling semantics of specification texts, in terminology acquisition for s...
A Step-wise Usage-based Method for Inducing Polysemy-aware Verb Classes
We present an unsupervised method for inducing verb classes from verb uses in gigaword corpora. Our method consists of two clustering steps: verb-specific semantic frames are first induced by clustering verb uses in a corpus and then verb classes are induced by clustering these frames. By taking this step-wise approach, we can not only generate verb classes based on a massive amount of verb use...
Unsupervised mining for ontology terms
Ontologies in current computer science parlance are computer-based resources that represent agreed domain semantics. This paper first very briefly introduces the DOGMA ontology engineering approach that separates "atomic" conceptual relations from "predicative" domain rules. Secondly, we describe and experimentally evaluate work in progress on a potential method to automatically derive the atom...
Diathesis alternation approximation for verb clustering
Although diathesis alternations have been used as features for manual verb classification, and there is recent work on incorporating such features in computational models of human language acquisition, work on large scale verb classification has yet to examine the potential for using diathesis alternations as input features to the clustering process. This paper proposes a method for approximati...
AFAST: An Automatic Frames Acquisition System
This paper describes an unsupervised strategy to acquire lexico-semantic frames (LSFs) of verbs from sentential parsed corpora (at the syntactic level). The problems in acquiring LSFs are verb sense ambiguity, diversity of linguistic usage, and the lack of complete frame slots in a single sentence. We propose a specific clustering technique based on the Minimum Description Length (MDL) princ...
Publication date: 1998