Automatic Induction of Synsets from a Graph of Synonyms

نویسندگان

  • Dmitry Ustalov
  • Alexander Panchenko
  • Christian Biemann
چکیده

This paper presents a new graph-based approach that induces synsets using synonymy dictionaries and word embeddings. First, we build a weighted graph of synonyms extracted from commonly available resources, such as Wiktionary. Second, we apply word sense induction to deal with ambiguous words. Finally, we cluster the disambiguated version of the ambiguous input graph into synsets. Our meta-clustering approach lets us use an efficient hard clustering algorithm to perform a fuzzy clustering of the graph. Despite its simplicity, our approach shows excellent results, outperforming five competitive state-of-the-art methods in terms of F-score on three gold standard datasets for English and Russian derived from large-scale manually constructed lexical resources.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Watset: Automatic Induction of Synsets from a Graph of Synonyms

This paper presents a new graph-based approach that induces synsets using synonymy dictionaries and word embeddings. First, we build a weighted graph of synonyms extracted from commonly available resources, such as Wiktionary. Second, we apply word sense induction to deal with ambiguous words. Finally, we cluster the disambiguated version of the ambiguous input graph into synsets. Our meta-clus...

متن کامل

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of English language, in which, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...

متن کامل

An Experiment: Using Google Translate and Semantic Mirrors to Create Synsets with Many Lexical Units

One of the fundamental building blocks of a wordnet is synonym sets or synsets, which group together similar word meanings or synonyms. These synsets can consist either one or more synonyms. This paper describes an automatic method for composing synsets with multiple synonyms by using Google Translate and Semantic Mirrors’ method. Also, we will give an overview of the results and discuss the ad...

متن کامل

Enriching a Portuguese WordNet using Synonyms from a Monolingual Dictionary

In this article we present an exploratory approach to enrich a WordNet-like lexical ontology with the synonyms present in a standard monolingual Portuguese dictionary. The dictionary was converted from PDF into XML and senses were automatically identified and annotated. This allowed us to extract them, independently of definitions, and to create sets of synonyms (synsets). These synsets were th...

متن کامل

Automatic Evaluation of Wordnet Synonyms and Hypernyms

In recent times, wordnets have become indispensable resources for Natural Language Processing. However, the creation of wordnets is a time consuming and manpower intensive proposition. This fact has led to attempts at quickly fixing a wordnet using text repositories such as the web and certain corpora, and also by translating an existing wordnet into another language. However, the results of su...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017