WordNet and FrameNet as Complementary Resources for Annotation

Authors

  • Collin F. Baker
  • Christiane Fellbaum

Abstract

WordNet and FrameNet are widely used lexical resources, but they are very different from each other and are often used in completely different ways in NLP. In a case study in which a short passage is annotated in both frameworks, we show how the synsets and definitions of WordNet and the syntagmatic information from FrameNet can complement each other, forming a more complete representation of the lexical semantics of a text than either could alone. Close comparisons between them also suggest ways in which they can be brought into alignment.

1 Background and motivation

FrameNet and WordNet are two lexical databases that are widely used for NLP, often in conjunction. Because of their complementary designs they are obvious candidates for alignment, and an exploratory research project within the larger context of the semantic annotation of the American National Corpus is currently underway. We give specific illustrative examples of annotations against both resources, highlighting their different contributions towards a rich semantic analysis.

WordNet (WN)¹ (Fellbaum, 1998) is a large electronic lexical database of English. Originally conceived as a full-scale model of human semantic organization, it was quickly embraced by the Natural Language Processing (NLP) community, a development that guided its subsequent growth and design. WordNet has become the lexical database of choice for NLP and has been incorporated into other language tools, including VerbNet (Kipper et al., 2000) and OntoNotes (Hovy et al., 2006). Numerous on-line dictionaries, including Google’s “define” function, rely significantly on WordNet. WordNet’s coverage is sometimes criticized as being too fine-grained for automatic processing, though its inventory is not larger than that of a standard collegiate dictionary.
But the present limitations of automatic word sense disambiguation (WSD) cannot be blamed entirely on existing systems; for example, Fellbaum and Grabowski (1997) have shown that humans, too, have difficulties identifying context-appropriate dictionary senses. One answer is clearly that meanings do not exist outside contexts. Furthermore, although WN does contain “sentence frames” such as “Somebody ----s something” for a transitive verb with a human agent, it provides little syntagmatic information, except for what can be gleaned from the example sentences. WordNet’s great strength is its extensive coverage, with more than 117,000 synonym sets (synsets), each with a definition and relations to other synsets, covering almost all the general vocabulary of English.

¹ http://wordnet.princeton.edu

FrameNet (FN)² (Fontenelle, 2003) is a lexical resource organized not around words per se, but around semantic frames (Fillmore, 1976): characterizations of events, relations, and states which are the conceptual basis for understanding groups of word senses, called lexical units (LUs). Frames are distinguished by the set of roles involved, known as frame elements (FEs). Much of the information in the FrameNet lexicon is derived by annotating corpus sentences: for each LU, groups of sentences are extracted from a corpus, and these sentences collectively exemplify all of the lexicographically relevant syntactic patterns in which the LU occurs. A few examples of each pattern are annotated; annotators mark not only the target word which evokes the frame in the mind of the hearer, but also those phrases which are syntactically related to the target word and express its frame elements. FrameNet is much smaller than WordNet, covering roughly 11,000 LUs, but contains very rich syntagmatic information about the combinatorial possibilities of each LU.
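The two data models described above can be sketched side by side. What follows is a minimal illustration in Python, not the API of WordNet or FrameNet: the class names, the `combine` helper, the example sentence, and all toy entries (synset identifier, definition, frame name, frame elements, lexical units) are assumptions invented for this sketch and are not copied from either database. Only the sentence frame "Somebody ----s something" comes from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """WordNet-style entry: synonyms sharing one definition, plus relations."""
    synset_id: str                      # hypothetical identifier
    lemmas: list[str]
    definition: str
    hypernym_ids: list[str] = field(default_factory=list)
    sentence_frames: list[str] = field(default_factory=list)


@dataclass
class Frame:
    """FrameNet-style entry: a frame, its roles (FEs), and the LUs evoking it."""
    name: str
    frame_elements: list[str]
    lexical_units: list[str]


@dataclass
class CombinedAnnotation:
    """One target word annotated against both resources."""
    target: str
    synset: Synset
    frame: Frame
    fe_spans: dict[str, str]            # frame element -> phrase expressing it


def combine(target, synset, frame, fe_spans):
    """Pair a WN-style sense with an FN-style frame, rejecting unknown roles."""
    unknown = sorted(set(fe_spans) - set(frame.frame_elements))
    if unknown:
        raise ValueError(f"not frame elements of {frame.name}: {unknown}")
    return CombinedAnnotation(target, synset, frame, dict(fe_spans))


# Toy entries, invented for illustration only.
conquer_synset = Synset(
    synset_id="toy-conquer-v-1",
    lemmas=["conquer", "capture", "seize"],
    definition="take control of a place by force",
    hypernym_ids=["toy-take-v-1"],
    sentence_frames=["Somebody ----s something"],
)
conquering = Frame(
    name="Conquering",
    frame_elements=["Conqueror", "Theme"],
    lexical_units=["conquer.v", "capture.v"],
)

# Invented sentence: "The Romans conquered the island."
annotation = combine(
    "conquered",
    conquer_synset,
    conquering,
    {"Conqueror": "The Romans", "Theme": "the island"},
)
print(annotation.synset.definition)
print(annotation.fe_spans)
```

In this sketch the synset supplies the paradigmatic side (definition, synonyms, hypernym link), while the frame supplies the syntagmatic side (which phrases fill which roles), which is the kind of complementarity the text describes.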
Given these two lexical resources with different strengths, it seems clear that combining WN and FN annotation will produce a more complete semantic representation of the meaning of a text than either could alone. What follows is intended as an example of how they can usefully be combined.

2 Case Study: Aegean History

The text chosen for this study is a paragraph from the American National Corpus³ (Ide et al., 2002), from the Berlitz travel guide to Greece, discussing the history of

² http://framenet.icsi.berkeley.edu
³ http://www.americannationalcorpus.org

Related resources

A novel approach to mapping FrameNet lexical units to WordNet synsets (short paper)

In this paper we present a novel approach to mapping FrameNet lexical units to WordNet synsets in order to automatically enrich the lexical unit set of a given frame. While the mapping approaches proposed in the past mainly rely on the semantic similarity between lexical units in a frame and lemmas in a synset, we exploit the definition of the lexical entries in FrameNet and the WordNet glosses...

FrameNet and Linked Data

FrameNet is the ideal resource for representation as linked data, and several renderings of the resource in RDF/OWL have been created. FrameNet has also been and continues to be linked to other major resources, including WordNet, BabelNet, and MASC, in the Linguistic Linked Open Data cloud. Although so far the supporting technologies have not enabled easy and widespread access to the envisioned...

Empirical Comparisons of MASC Word Sense Annotations

We analyze how different conceptions of lexical semantics affect sense annotations and how multiple sense inventories can be compared empirically, based on annotated text. Our study focuses on the MASC project, where data has been annotated using WordNet sense identifiers on the one hand, and FrameNet lexical units on the other. This allows us to compare the sense inventories of these lexical r...

Extensive Evaluation of a FrameNet-WordNet mapping resource

Lexical resources are basic components of many text processing systems devoted to information extraction, question answering or dialogue. In past years many resources have been developed, such as FrameNet and WordNet. FrameNet describes prototypical situations (i.e. Frames) while WordNet defines lexical meaning (senses) for the majority of English nouns, verbs, adjectives and adverbs. A major di...

Putting Pieces Together: Combining FrameNet, VerbNet and WordNet for Robust Semantic Parsing

This paper describes our work in integrating three different lexical resources: FrameNet, VerbNet, and WordNet, into a unified, richer knowledge-base, to the end of enabling more robust semantic parsing. The construction of each of these lexical resources has required many years of laborious human effort, and they all have their strengths and shortcomings. By linking them together, we build an ...


Publication date: 2009