SUDOKU: Treating Word Sense Disambiguation & Entity Linking as a Deterministic Problem - via an Unsupervised & Iterative Approach

Author

  • Steve L. Manion
Abstract

SUDOKU's submissions to SemEval Task 13 treat Word Sense Disambiguation and Entity Linking as a deterministic problem that exploits two key attributes of open-class words as constraints: their degree of polysemy and their part of speech. This work extends and further validates the results achieved by Manion and Sainudiin (2014). SUDOKU's three submissions are incremental in their use of these two constraints. Run1 has no constraints and disambiguates all lemmas in one pass. Run2 disambiguates lemmas at increasing degrees of polysemy, leaving the most polysemous until last. Run3 is identical to Run2, with the additional constraint of disambiguating all named entities and nouns first, before other types of open-class words (verbs, adjectives, and adverbs). Across all domains, for English, Run2 and Run3 were placed second and third. For Spanish, Run2, Run3, and Run1 were placed first, second, and third respectively. For Italian, Run1 was placed first, with Run2 and Run3 tied for second.
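
The ordering constraints that distinguish the three runs can be sketched as a small scheduling routine. This is only an illustrative sketch, not the authors' implementation: the `Lemma` type, the POS tags in `NOUN_LIKE`, the `batches` and `disambiguate` helpers, and the `choose_sense` callback are all hypothetical names, and the actual sense-selection step (which the abstract does not describe) is left as a placeholder. It also assumes that senses fixed in earlier batches are passed as context to later ones.

```python
from itertools import groupby
from typing import NamedTuple


class Lemma(NamedTuple):
    text: str
    pos: str        # hypothetical tags, e.g. "NOUN", "PROPN", "VERB", "ADJ", "ADV"
    senses: tuple   # candidate sense identifiers; len(senses) is the degree of polysemy


# Assumption: nouns and named entities are marked with these two tags.
NOUN_LIKE = {"NOUN", "PROPN"}


def batches(lemmas, by_polysemy, nouns_first):
    """Split lemmas into the batches that are disambiguated together.

    Run1: by_polysemy=False                   -> one batch, one pass.
    Run2: by_polysemy=True                    -> increasing polysemy.
    Run3: by_polysemy=True, nouns_first=True  -> nouns/entities first,
          each group ordered by increasing polysemy.
    """
    if not by_polysemy:
        return [list(lemmas)]

    def key(lemma):
        later = nouns_first and lemma.pos not in NOUN_LIKE
        return (later, len(lemma.senses))

    ordered = sorted(lemmas, key=key)
    return [list(group) for _, group in groupby(ordered, key=key)]


def disambiguate(lemmas, choose_sense, by_polysemy=False, nouns_first=False):
    """Resolve batch by batch; choose_sense is a placeholder for the real scorer."""
    resolved = {}
    for batch in batches(lemmas, by_polysemy, nouns_first):
        # Lemmas in the same batch only see decisions from earlier batches.
        picks = {lemma.text: choose_sense(lemma, resolved) for lemma in batch}
        resolved.update(picks)
    return resolved
```

With three toy lemmas, "bank" (noun, three senses), "run" (verb, two senses) and "river" (noun, one sense), the Run2 setting would process them in the order river, run, bank, while the Run3 setting would process river, bank, run.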

Similar resources

An Iterative 'Sudoku Style' Approach to Subgraph-based Word Sense Disambiguation

We introduce an iterative approach to subgraph-based Word Sense Disambiguation (WSD). Inspired by the Sudoku puzzle, it significantly improves the precision and recall of disambiguation. We describe how conventional subgraph-based WSD treats the two steps of (1) subgraph construction and (2) disambiguation via graph centrality measures as ordered and atomic. Consequently, researchers tend to fo...

VUA-background : When to Use Background Information to Perform Word Sense Disambiguation

We present in this paper our submission to task 13 of SemEval2015, which makes use of background information and external resources (DBpedia and Wikipedia) to automatically disambiguate texts. Our approach follows two routes for disambiguation: one route is proposed by a state-of-the-art WSD system, and the other one by the predominant sense information extracted in an unsupervised way from an ...

An Iterative Approach for Unsupervised Most Frequent Sense Detection using WordNet and Word Embeddings

Given a word, what is the most frequent sense in which it occurs in a given corpus? Most Frequent Sense (MFS) is a strong baseline for unsupervised word sense disambiguation. If we have large amounts of sense-annotated corpora, MFS can be trivially created. However, sense-annotated corpora are a rarity. In this paper, we propose a method which can compute MFS from raw corpora. Our approach itera...

A Fully Unsupervised Word Sense Disambiguation Method Using Dependency Knowledge

Word sense disambiguation is the process of determining which sense of a word is used in a given context. Due to its importance in understanding semantics of natural languages, word sense disambiguation has been extensively studied in Computational Linguistics. However, existing methods either are brittle and narrowly focus on specific topics or words, or provide only mediocre performance in re...

Unsupervised Multilingual Word Sense Disambiguation via an Interlingua

We present an unsupervised method for resolving word sense ambiguities in one language by using statistical evidence assembled from other languages. It is crucial for this approach that texts are mapped into a language-independent interlingual representation. We also show that the coverage and accuracy resulting from multilingual sources outperform analyses where only monolingual training data ...

Publication date: 2015