A New Method for Evaluating Automatically Learned Terminological Taxonomies

Authors

  • Paola Velardi
  • Roberto Navigli
  • Stefano Faralli
  • Juana María Ruiz-Martínez
Abstract

Evaluating a taxonomy learned automatically against an existing gold standard is a very complex problem, because differences stem from the number, label, depth and ordering of the taxonomy nodes. In this paper we propose casting the problem as one of comparing two hierarchical clusters. To this end we defined a variation of the Fowlkes and Mallows measure (Fowlkes and Mallows, 1983). Our method assigns a similarity value $B^i_{(l,r)}$ to the learned (l) and reference (r) taxonomy for each cut i of the corresponding anonymised hierarchies, starting from the topmost nodes down to the leaf concepts. For each cut i, the two hierarchies can be seen as two clusterings $C^i_l$, $C^i_r$ of the leaf concepts. We assign a prize to early similarity values, i.e. when concepts are clustered in a similar way down to the lowest taxonomy levels (close to the leaf nodes). We apply our method to the evaluation of the taxonomy learning methods put forward by Navigli et al. (2011) and Kozareva and Hovy (2010).
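As a rough illustration of the per-cut comparison described in the abstract, the sketch below computes the original Fowlkes and Mallows (1983) index between two clusterings of the same leaf concepts. It is not the authors' variation, and it does not include the prize for early similarity values, since the abstract does not specify either. The function names, the dict-based representation of a clustering (leaf label mapped to cluster id), and the convention of returning 0.0 when one clustering groups no pair of leaves together are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def fowlkes_mallows(clustering_l, clustering_r):
    """Original Fowlkes-Mallows index between two clusterings of the same leaves.

    Each clustering is a dict mapping a leaf label to a cluster id
    (hypothetical representation chosen for this sketch).
    """
    if clustering_l.keys() != clustering_r.keys():
        raise ValueError("both clusterings must cover the same leaf concepts")

    n = len(clustering_l)
    # Contingency counts m_ij: leaves in cluster i of l and cluster j of r.
    joint = Counter((clustering_l[x], clustering_r[x]) for x in clustering_l)
    rows = Counter(clustering_l.values())  # cluster sizes in l
    cols = Counter(clustering_r.values())  # cluster sizes in r

    t = sum(m * m for m in joint.values()) - n  # pairs co-clustered in both
    p = sum(m * m for m in rows.values()) - n   # pairs co-clustered in l
    q = sum(m * m for m in cols.values()) - n   # pairs co-clustered in r

    # Assumed convention: the index is undefined when a clustering has only
    # singleton clusters; return 0.0 in that case.
    if p == 0 or q == 0:
        return 0.0
    return t / sqrt(p * q)


def per_cut_scores(cuts_l, cuts_r):
    """One index value per cut i, given aligned lists of clusterings."""
    return [fowlkes_mallows(cl, cr) for cl, cr in zip(cuts_l, cuts_r)]


# Toy example with four leaf concepts (made-up data).
learned = {"cat": 0, "dog": 1, "oak": 1, "pine": 1}
reference = {"cat": 0, "dog": 0, "oak": 1, "pine": 1}
print(fowlkes_mallows(learned, reference))
```

Because the index only looks at which leaves share a cluster, cluster ids and node labels play no role, which matches the idea of comparing anonymised hierarchies.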


Similar resources

Taxonomy Learning Using Word Sense Induction

Taxonomies are an important resource for a variety of Natural Language Processing (NLP) applications. Despite this, the current state-of-the-art methods in taxonomy learning have disregarded word polysemy, in effect, developing taxonomies that conflate word senses. In this paper, we present an unsupervised method that builds a taxonomy of senses learned automatically from an unlabelled corpus. O...


Learning Relations for Terminological Ontologies from Text

The problem of learning concept hierarchies and terminological ontologies can be decomposed into two sub-tasks: concept extraction and relation learning. We describe a new approach to learn relations automatically from an unstructured text corpus based on one of the probabilistic topic models, Latent Dirichlet Allocation. We first provide definition (Information Theory Principle for Concept Relat...


TaxoMap in the OAEI 2007 Alignment Contest

This paper presents our first participation in the OAEI 2007 campaign. It describes an approach to align taxonomies which relies on terminological and structural techniques applied sequentially. We performed our method with various taxonomies using our prototype, TaxoMap. Previous experimental results were encouraging and demonstrate the relevance of this alignment approach. In this paper, we e...


Towards a Standardized Linguistic Annotation of the Textual Content of Labels in Knowledge Representation Systems

We propose applying standardized linguistic annotation to terms included in labels of knowledge representation schemes (taxonomies or ontologies), hypothesizing that this would help improve ontology-based semantic annotation of texts. We share the view that currently used methods for including lexical and terminological information in such hierarchical networks of concepts are not satisfactor...


A Suggested Motivational Method for Teaching Scientific Terminology, With a Practical Example

Using a reductionist approach, the motivational method for teaching scientific terminology aims at breaking down terms and their definitions into separate components, i.e. morphemes and their semantic features, rather than establishing a connection between terms and their definitions as holistic units. In other words, the ultimate goal of this method is achieving “semantic motivation,” (as oppo...




Publication date: 2012