Integrating Domain and Paradigmatic Similarity for Unsupervised Sense Tagging

Authors

  • Roberto Basili
  • Marco Cammisa
  • Alfio Massimiliano Gliozzo
Abstract

An unsupervised methodology for Word Sense Disambiguation, called Dynamic Domain Sense Tagging, is presented. It relies on the convergence of two well-known unsupervised approaches, Domain Driven Disambiguation and Conceptual Density. For each target word, a domain is dynamically modeled by expanding its topical context, i.e. a set of words evoking the underlying (implicit) domain in which the word is located. The estimate of paradigmatic similarity within this domain-specific lexicon is used as the disambiguation model. The Conceptual Density measure is used here to account for paradigmatic associations, and the top-scoring senses of the target word are selected accordingly. Results confirm the impact of domain-based representation in capturing useful paradigmatic generalizations, especially when only small text fragments are available. In addition, the precision/recall tradeoff of the resulting method can be tuned in a meaningful way, allowing us to achieve high precision scores in a purely unsupervised setting.
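
The pipeline described in the abstract (expand the topical context, score each candidate sense by how densely the expanded context clusters around it in a taxonomy, then select the top sense or abstain) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy taxonomy `PARENT`, the sense inventory `SENSES`, the lookup table `RELATED` that stands in for domain expansion, and the simplified ratio in `density`, which is not the Conceptual Density formula used by the authors.

```python
# Toy hypernym links (child -> parent); a hypothetical stand-in for a real taxonomy.
PARENT = {
    "organization": "entity",
    "financial_institution": "organization",
    "bank#1": "financial_institution",
    "lender#1": "financial_institution",
    "broker#1": "financial_institution",
    "geological_formation": "entity",
    "slope": "geological_formation",
    "bank#2": "slope",
    "hillside#1": "slope",
}
# Hypothetical sense inventory: word -> candidate senses in the taxonomy.
SENSES = {
    "bank": ["bank#1", "bank#2"],
    "lender": ["lender#1"],
    "broker": ["broker#1"],
    "hillside": ["hillside#1"],
}
# Hypothetical topical expansion table: word -> words evoking the same domain.
RELATED = {"loan": ["lender", "broker"], "river": ["hillside"]}


def ancestors(node):
    """Return the hypernym chain above node, nearest ancestor first."""
    chain = []
    while node in PARENT:
        node = PARENT[node]
        chain.append(node)
    return chain


def subtree(root):
    """Return root plus every node that has root among its ancestors."""
    nodes = set(PARENT) | set(PARENT.values())
    return {n for n in nodes if n == root or root in ancestors(n)}


def expand(context):
    """Add domain-related words to the observed context words."""
    expanded = set(context)
    for w in context:
        expanded.update(RELATED.get(w, []))
    return expanded


def density(sense, context_senses):
    """Simplified score: best ratio of context senses to subtree size
    over all ancestors of the candidate sense (an illustrative stand-in)."""
    best = 0.0
    for a in ancestors(sense):
        sub = subtree(a)
        best = max(best, len(sub & context_senses) / len(sub))
    return best


def tag(word, context, threshold=0.0):
    """Return the top-scoring sense, or None if its score is not above threshold."""
    ctx = {s for w in expand(context) if w != word for s in SENSES.get(w, [])}
    scores = {s: density(s, ctx) for s in SENSES[word]}
    best = max(scores, key=scores.get)
    return best if scores[best] > threshold else None
```

With this toy data, `tag("bank", ["loan"])` selects `bank#1` and `tag("bank", ["river"])` selects `bank#2`. Raising `threshold` makes the tagger abstain on weakly supported cases, which is one simple way to trade recall for precision; whether the authors tune the tradeoff this way is not stated in the abstract.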

Similar resources

Unsupervised Part-Of-Speech Tagging Supporting Supervised Methods

This paper investigates the utility of an unsupervised part-of-speech (PoS) system in a task-oriented way. We use PoS labels as features for different supervised NLP tasks: Word Sense Disambiguation, Named Entity Recognition and Chunking. Further, we explore how much supervised tagging can gain from unsupervised tagging. A comparative evaluation between variants of systems using standard PoS, un...

Using Wikipedia and supersense tagging for semi-automatic complex taxonomy construction

In this paper we propose an unsupervised approach for acquiring domain-related conceptual hierarchies from open-domain text. Super Sense Tagging (SST) is used to extract up-level terms, and Wikipedia categories and WordNet are employed to construct the rest of the taxonomic hierarchy. The result is a complete top-bottom taxonomy for every formal context. We describe both the method we implemented an...

Text: now in 2D! A framework for lexical expansion with contextual similarity

A new metaphor of two-dimensional text for data-driven semantic modeling of natural language is proposed, which provides an entirely new angle on the representation of text: not only syntagmatic relations are annotated in the text, but also paradigmatic relations are made explicit by generating lexical expansions. We operationalize distributional similarity in a general framework for large cor...

Unsupervised and supervised exploitation of semantic domains in lexical disambiguation

Domains are common areas of human discussion, such as economics, politics, law, science, etc., which are at the basis of lexical coherence. This paper explores the dual role of domains in word sense disambiguation (WSD). On the one hand, domain information provides generalized features at the paradigmatic level that are useful to discriminate among word senses. On the other hand, domain distinctions...

Unsupervised Domain Adaptation for Word Sense Disambiguation using Stacked Denoising Autoencoder

In this paper, we propose an unsupervised domain adaptation method for Word Sense Disambiguation (WSD) using a Stacked Denoising Autoencoder (SdA). SdA is an unsupervised learning method that obtains an abstract feature set of the input data using a neural network. The abstract feature set absorbs the difference between domains, and thus SdA can address the problem of domain adaptation. However, SdA does not always c...

Publication date: 2006