Wikipedia-based Distributional Semantics for Entity Relatedness

Authors

  • Nitish Aggarwal
  • Paul Buitelaar
Abstract

Wikipedia provides an enormous amount of background knowledge for reasoning about the semantic relatedness between two entities. We propose Wikipedia-based Distributional Semantics for Entity Relatedness (DiSER), which represents the semantics of an entity by its distribution in the high-dimensional concept space derived from Wikipedia. DiSER measures the semantic relatedness between two entities by quantifying the distance between the corresponding high-dimensional vectors. DiSER builds the model from annotated entities only, and therefore improves over existing approaches, which do not distinguish between an entity and its surface form. We evaluate the approach on a benchmark that contains relative entity relatedness scores for 420 entity pairs. Our approach improves accuracy by 12% over state-of-the-art methods for computing entity relatedness. We also evaluate DiSER on the Entity Disambiguation task, using a dataset of 50 sentences with highly ambiguous entity mentions, where it improves precision by 10% over the best-performing methods. To provide a resource for finding all entities related to a given entity, we construct a graph in which the nodes represent Wikipedia entities and the edges carry the relatedness scores. Wikipedia contains more than 4.1 million entities, which required efficient computation of the relatedness scores between the corresponding 17 trillion entity pairs.
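
The idea of representing an entity by its distribution over a Wikipedia-derived concept space, and scoring relatedness from the resulting vectors, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy article names, the use of raw annotation counts as weights, and cosine similarity as the vector comparison (the abstract only says that a distance between the vectors is quantified). The function names `build_entity_vectors` and `cosine` are hypothetical.

```python
import math
from collections import defaultdict


def build_entity_vectors(annotated_articles):
    """Map each entity to a sparse vector over concepts (articles).

    annotated_articles: dict of concept name -> list of entity IDs that are
    explicitly annotated (linked) in that article. Only annotated entities
    are counted, so surface forms are never matched as plain strings.
    Weights are raw counts (an assumption; the paper may weight differently).
    """
    vectors = defaultdict(dict)
    for concept, entities in annotated_articles.items():
        for entity in entities:
            vectors[entity][concept] = vectors[entity].get(concept, 0) + 1
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    dot = sum(weight * v.get(concept, 0.0) for concept, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy, made-up annotations: the two "Apple" entities share a surface form
# but are distinct entity IDs, so they end up with separate vectors.
toy_articles = {
    "Article_A": ["Apple_Inc.", "Steve_Jobs", "Apple_Inc."],
    "Article_B": ["Apple_Inc.", "Steve_Jobs"],
    "Article_C": ["Apple_(fruit)", "Orchard"],
}
vectors = build_entity_vectors(toy_articles)
related = cosine(vectors["Apple_Inc."], vectors["Steve_Jobs"])
unrelated = cosine(vectors["Apple_Inc."], vectors["Apple_(fruit)"])
```

In this toy data, `Apple_Inc.` and `Steve_Jobs` co-occur in the same articles and receive a high score, while `Apple_Inc.` and `Apple_(fruit)` share no concept and score zero, even though both would match the string "Apple" in a surface-form model.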

Similar resources

Distributional Semantics for Entity Relatedness

Wikipedia provides an enormous amount of background knowledge to reason about the semantic relatedness between two entities. In this work, we present a distributional semantics based approach for computing entity relatedness, and a focused related entities explorer based on this approach.

Evaluating Topic Coherence Using Distributional Semantics

This paper introduces distributional semantic similarity methods for automatically measuring the coherence of a set of words generated by a topic model. We construct a semantic space to represent each topic word by making use of Wikipedia as a reference corpus to identify context features and collect frequencies. Relatedness between topic words and context features is measured using variants of...

Syntactic/Semantic Structures for Textual Entailment Recognition

In this paper, we describe an approach based on off-the-shelf parsers and semantic resources for the Recognizing Textual Entailment (RTE) challenge that can be generally applied to any domain. Syntax is exploited by means of tree kernels whereas lexical semantics is derived from heterogeneous resources, e.g. WordNet or distributional semantics through Wikipedia. The joint syntactic/semantic mod...

Corpus Co-Occurrence, Dictionary and Wikipedia Entries as Resources for Semantic Relatedness Information

Distributional, corpus-based descriptions have frequently been applied to model aspects of word meaning. However, distributional models that use corpus data as their basis have one well-known disadvantage: Even though the distributional features based on corpus co-occurrence were often successful in capturing meaning aspects of the words to be described, they generally fail to capture those mea...

Exploring Related Entities

In this demo, we present Entity Relatedness Graph (EnRG), a focused related entities explorer, which provides users with a dynamic set of filters and facets. It gives a ranked list of entities related to a given entity, and clusters them using the different filters. For instance, using EnRG, one can easily find the American vegans related to Brad Pitt or Irish universities related to Seman...

Publication date: 2014