Automatic Word Sense Discrimination
Abstract
This paper presents context-group discrimination, a disambiguation algorithm based on clustering. Senses are interpreted as groups (or clusters) of similar contexts of the ambiguous word. Words, contexts, and senses are represented in Word Space, a high-dimensional, real-valued space in which closeness corresponds to semantic similarity. Similarity in Word Space is based on second-order co-occurrence: two tokens (or contexts) of the ambiguous word are assigned to the same sense cluster if the words they co-occur with in turn occur with similar words in a training corpus. The algorithm is automatic and unsupervised in both training and application: senses are induced from a corpus without labeled training instances or other external knowledge sources. The paper demonstrates good performance of context-group discrimination for a sample of natural and artificial ambiguous words.
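The second-order idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy corpus, the window size, the function names (`cooccurrence_vectors`, `context_vector`, `kmeans_cosine`) and the use of a small k-means routine as the clustering step are all assumptions made for illustration, since the abstract does not specify these details. Each context of the ambiguous word is represented by the sum of the co-occurrence vectors of its words, so two contexts that share no words can still be close if their words occur with similar words in the training corpus.

```python
import numpy as np

def cooccurrence_vectors(sentences, window=2):
    """First-order word vectors: neighbor counts within `window` tokens."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for s in sentences:
        for i, w in enumerate(s):
            lo, hi = max(0, i - window), min(len(s), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[index[w], index[s[j]]] += 1
    return index, counts

def context_vector(context, target, index, counts):
    """Second-order vector: normalized sum of the context words' vectors."""
    rows = [counts[index[w]] for w in context if w != target and w in index]
    vec = np.sum(rows, axis=0) if rows else np.zeros(counts.shape[1])
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def kmeans_cosine(X, k, iters=20):
    """Small k-means on unit vectors (a stand-in clustering step).
    Seeds are chosen farthest-first so the result is deterministic."""
    centers = [X[0]]
    while len(centers) < k:
        sims = np.max(X @ np.array(centers).T, axis=1)
        centers.append(X[np.argmin(sims)])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmax(X @ centers.T, axis=1)
        for c in range(k):
            members = X[labels == c]
            if len(members):
                mean = members.mean(axis=0)
                norm = np.linalg.norm(mean)
                if norm:
                    centers[c] = mean / norm
    return labels

# Hypothetical training corpus (no sense labels are used anywhere).
corpus = [
    ["money", "loan", "interest", "deposit"],
    ["loan", "deposit", "cash", "money"],
    ["river", "water", "shore", "fishing"],
    ["water", "fishing", "boat", "river"],
]
# Hypothetical contexts of the ambiguous word "bank".
contexts = [
    ["bank", "loan", "cash"],
    ["bank", "interest", "money"],
    ["bank", "shore", "boat"],
    ["bank", "water", "river"],
]

index, counts = cooccurrence_vectors(corpus)
X = np.array([context_vector(c, "bank", index, counts) for c in contexts])
labels = kmeans_cosine(X, k=2)
print(labels)
```

The first two contexts have no word in common apart from "bank", yet they fall into the same cluster because "loan", "cash", "interest" and "money" co-occur with overlapping words in the training corpus. The induced clusters are unnamed groups of contexts, which is why the task is called discrimination and not labeling against a dictionary.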
Similar resources
UOY: A Hypergraph Model For Word Sense Induction & Disambiguation
This paper is an outcome of ongoing research and presents an unsupervised method for automatic word sense induction (WSI) and disambiguation (WSD). The induction algorithm is based on modeling the cooccurrences of two or more words using hypergraphs. WSI takes place by detecting high-density components in the cooccurrence hypergraphs. WSD assigns to each induced cluster a score equal to the sum...
Automatic Word Sense Disambiguation (WSD) System
This paper presents an automatic word sense disambiguation (WSD) system that uses Part-of-Speech (POS) tags along with word classes as the discrete features. Word classes are derived from the Word Class Assigner using the Word Exchange Algorithm from statistical language processing. A Naïve Bayes classifier from Weka is employed in both the training and testing phases to perform the supervised le...
UPV-SI: Word Sense Induction using Self Term Expansion
In this paper we report the results obtained by participating in the “Evaluating Word Sense Induction and Discrimination Systems” task of SemEval 2007. Our totally unsupervised system performed an automatic self-term expansion process by means of co-occurrence terms and, thereafter, it executed the unsupervised KStar clustering method. Two ranking tables with different evaluation measures wer...
Do not do processing, when you can look up: Towards a Discrimination Net for WSD
The task of Word Sense Disambiguation (WSD) incorporates in its definition the role of ‘context’. We present our work on the development of a tool which allows for automatic acquisition and ranking of ‘context clues’ for WSD. These clue words are extracted from the contexts of words appearing in a large monolingual corpus. This mined collection of contextual clues forms a discrimination net in ...
Discrimination of Word Senses with Hypernyms
Languages are inherently ambiguous. Four out of five words in English have more than one meaning. Nowadays there is a growing number of small proprietary thesauri used for knowledge management for different applications. In order to enable the usage of these thesauri for automatic text annotations, we introduce a robust method for discriminating word senses using hypernyms. The method uses coll...
Journal title: Computational Linguistics
Volume: 24
Publication date: 1998