Identifying Poorly-Defined Concepts in WordNet with Graph Metrics
Abstract
Princeton WordNet is the most widely used lexical resource in natural language processing and continues to provide a gold-standard model of semantics. However, the resource still has significant quality issues, and these affect the performance of all NLP systems built on it. One major issue is that many nodes are insufficiently defined, and new links need to be added to improve NLP performance. We combine graph-based metrics with measures of ambiguity to predict which synsets are difficult for word sense disambiguation, a major NLP task that depends on good lexical information. We show that this method allows us to find poorly defined nodes with 89.9% precision, which would help manual annotators focus on the parts of the WordNet graph most in need of improvement.
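As a rough illustration of the idea in the abstract above, the following Python sketch combines one simple graph metric (node degree) with one simple ambiguity measure (the average number of senses per lemma) to flag candidate synsets. The abstract does not say which metrics or thresholds the authors use, so the function names, the thresholds, and the toy data are all assumptions made for illustration and are not the paper's method.

```python
from collections import Counter

# Toy data invented for this sketch; it is NOT taken from WordNet or the paper.
EDGES = [
    ("bank.n.01", "slope.n.01"),
    ("bank.n.02", "financial_institution.n.01"),
    ("financial_institution.n.01", "institution.n.01"),
    ("bank.n.02", "deposit.n.01"),
    ("slope.n.01", "geological_formation.n.01"),
]
LEMMAS = {
    "bank.n.01": ["bank"],
    "bank.n.02": ["bank", "depository"],
    "bank.n.03": ["bank"],  # no edges at all in the toy graph
    "slope.n.01": ["slope", "incline"],
    "financial_institution.n.01": ["financial_institution"],
    "institution.n.01": ["institution"],
    "deposit.n.01": ["deposit"],
    "geological_formation.n.01": ["geological_formation"],
}


def synset_degrees(edges, lemmas):
    """Count the undirected links attached to each synset (graph metric)."""
    degrees = {synset: 0 for synset in lemmas}
    for a, b in edges:
        degrees[a] = degrees.get(a, 0) + 1
        degrees[b] = degrees.get(b, 0) + 1
    return degrees


def synset_ambiguity(lemmas):
    """Average number of senses of a synset's lemmas (ambiguity measure)."""
    # Each lemma appears at most once per synset, so its count across
    # all synsets equals its number of senses in this toy lexicon.
    sense_counts = Counter(l for ls in lemmas.values() for l in ls)
    return {
        synset: sum(sense_counts[l] for l in ls) / len(ls)
        for synset, ls in lemmas.items()
    }


def flag_candidates(edges, lemmas, max_degree=1, min_ambiguity=2.0):
    """Flag sparsely linked, highly ambiguous synsets (hypothetical rule)."""
    degrees = synset_degrees(edges, lemmas)
    ambiguity = synset_ambiguity(lemmas)
    return sorted(
        s for s in lemmas
        if degrees[s] <= max_degree and ambiguity[s] >= min_ambiguity
    )


if __name__ == "__main__":
    print(flag_candidates(EDGES, LEMMAS))
```

On the toy data, only synsets that are both sparsely connected and share a highly polysemous lemma are flagged. A real experiment would draw the graph and sense inventory from WordNet itself and would choose metrics and thresholds empirically.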
Similar resources
An Approximation Approach for Semantic Queries of Naïve Users by a New Query Language
This paper focuses on querying semi-structured data such as RDF data, using a proposed query language for non-expert users who lack knowledge of the data structure. This language is inspired by semantic regular path queries. The problem appears when the user specifies concepts that are not in the structure, as approximation approaches, operations based on query modifications and c...
Voting Theory for Concept Detection
This paper explores the issue of detecting concepts for ontology learning from text. We investigate various metrics from graph theory and propose various voting schemes based on these metrics. The idea has its roots in social choice theory, and our objective is to mimic consensus in automatic learning methods and increase the confidence in concept extraction through the identification of the b...
Opportunistic Search with Semantic Fisheye Views (EPFL Technical Report IC/2004/42)
Search goals are often too complex or poorly defined to be solved in a single query. While refining their search goals, users are likely to apply a variety of strategies, such as searching for more general or more specific concepts in reaction to the information and structures they encounter in the results. This is called opportunistic search. In this paper we describe how semantic fisheye view...
Lexical Chains on WordNet and Extensions
Lexical chains between two concepts are sequences of semantically related words interconnected via semantic relations. This paper presents a new approach for the automatic construction of lexical chains on knowledge bases. Experiments were performed building lexical chains on WordNet, Extended WordNet, and Extended WordNet Knowledge Base. The research addresses the problems of lexical chains ra...
Domain-Specific Knowledge Acquisition Using WordNet
This paper presents a method that acquires new concepts and connections associated with user-selected seed concepts, and adds them to the WordNet linguistic knowledge structure. New domain knowledge can be acquired around some seed concepts that a user considers important. The knowledge we seek to acquire relates to one or more of these concepts, and consists of new concepts not defined in Word...
Publication date: 2016