Automatic Word Sense Discrimination

Author

Hinrich Schütze

Abstract

This paper presents context-group discrimination, a disambiguation algorithm based on clustering. Senses are interpreted as groups (or clusters) of similar contexts of the ambiguous word. Words, contexts, and senses are represented in Word Space, a high-dimensional, real-valued space in which closeness corresponds to semantic similarity. Similarity in Word Space is based on second-order co-occurrence: two tokens (or contexts) of the ambiguous word are assigned to the same sense cluster if the words they co-occur with in turn occur with similar words in a training corpus. The algorithm is automatic and unsupervised in both training and application: senses are induced from a corpus without labeled training instances or other external knowledge sources. The paper demonstrates good performance of context-group discrimination for a sample of natural and artificial ambiguous words.
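The second-order idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's implementation: the toy corpus, the function names (`word_vectors`, `context_vector`, `cluster`, `discriminate`), the use of whole sentences as contexts, raw counts as vector entries, and a plain k-means-style loop with cosine similarity standing in for whatever clustering procedure the paper actually uses.

```python
from collections import Counter, defaultdict
from math import sqrt

# Toy corpus (hypothetical). The last two sentences never mention "bank";
# they only tell us which words tend to occur together.
SENTENCES = [
    ["bank", "loan", "money"],
    ["bank", "deposit", "cash"],
    ["bank", "river", "water"],
    ["bank", "shore", "fishing"],
    ["loan", "money", "cash", "deposit"],
    ["river", "water", "shore", "fishing"],
]

def word_vectors(sentences):
    """First-order co-occurrence: word -> counts of same-sentence neighbors."""
    vectors = defaultdict(Counter)
    for sent in sentences:
        for i, w in enumerate(sent):
            for j, v in enumerate(sent):
                if i != j:
                    vectors[w][v] += 1
    return vectors

def context_vector(sent, target, vectors):
    """Second-order representation: sum of the neighbors' word vectors."""
    ctx = Counter()
    for w in sent:
        if w != target:
            ctx.update(vectors[w])
    # Drop the target's own dimension: it is shared by every context
    # and therefore carries no information for telling senses apart.
    ctx.pop(target, None)
    return ctx

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = sqrt(sum(x * x for x in a.values()))
    nb = sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster(contexts, k=2, iterations=10):
    """K-means-style clustering (a stand-in, see the note above)."""
    # Deterministic seeding: the first context, then the context
    # least similar to the seeds chosen so far.
    centroids = [contexts[0]]
    while len(centroids) < k:
        centroids.append(
            min(contexts, key=lambda c: max(cosine(c, s) for s in centroids))
        )
    labels = []
    for _ in range(iterations):
        # Assign each context to its most similar centroid.
        labels = [
            max(range(k), key=lambda i: cosine(c, centroids[i]))
            for c in contexts
        ]
        # Recompute each centroid as the sum of its members
        # (scale does not matter under cosine similarity).
        for i in range(k):
            members = [c for c, lab in zip(contexts, labels) if lab == i]
            if members:
                total = Counter()
                for m in members:
                    total.update(m)
                centroids[i] = total
    return labels

def discriminate(sentences, target, k=2):
    """Group the occurrences of `target` into k induced sense clusters."""
    vectors = word_vectors(sentences)
    occurrences = [s for s in sentences if target in s]
    contexts = [context_vector(s, target, vectors) for s in occurrences]
    return list(zip(occurrences, cluster(contexts, k)))

for sent, label in discriminate(SENTENCES, "bank"):
    print(label, sent)
```

In this toy data the first two "bank" sentences share no neighbor words, yet they end up with similar context vectors because "loan", "money", "deposit" and "cash" co-occur with one another elsewhere in the corpus. That is the sense in which similarity is second-order. No sense labels are supplied at any point, which mirrors the unsupervised setting the abstract describes.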


Similar resources

UOY: A Hypergraph Model For Word Sense Induction & Disambiguation

This paper is an outcome of ongoing research and presents an unsupervised method for automatic word sense induction (WSI) and disambiguation (WSD). The induction algorithm is based on modeling the cooccurrences of two or more words using hypergraphs. WSI takes place by detecting high-density components in the cooccurrence hypergraphs. WSD assigns to each induced cluster a score equal to the sum...


Automatic Word Sense Disambiguation (wsd) System

This paper presents an automatic word sense disambiguation (WSD) system that uses Part-of-Speech (POS) tags along with word classes as the discrete features. Word classes are derived from the Word Class Assigner using the Word Exchange Algorithm from statistical language processing. A Naïve Bayes classifier from Weka is employed in both the training and testing phases to perform the supervised le...


UPV-SI: Word Sense Induction using Self Term Expansion

In this paper we report the results obtained by participating in the “Evaluating Word Sense Induction and Discrimination Systems” task of Semeval 2007. Our totally unsupervised system performed an automatic self-term expansion process by means of co-occurrence terms and, thereafter, it executed the unsupervised KStar clustering method. Two ranking tables with different evaluation measures wer...


Do not do processing, when you can look up: Towards a Discrimination Net for WSD

The task of Word Sense Disambiguation (WSD) incorporates in its definition the role of ‘context’. We present our work on the development of a tool which allows for automatic acquisition and ranking of ‘context clues’ for WSD. These clue words are extracted from the contexts of words appearing in a large monolingual corpus. This mined collection of contextual clues forms a discrimination net in ...


Discrimination of Word Senses with Hypernyms

Languages are inherently ambiguous. Four out of five words in English have more than one meaning. Nowadays there is a growing number of small proprietary thesauri used for knowledge management for different applications. In order to enable the usage of these thesauri for automatic text annotations, we introduce a robust method for discriminating word senses using hypernyms. The method uses coll...



Journal: Computational Linguistics

Volume: 24

Publication date: 1998
