Upper Modeling: organizing knowledge for natural language processing

Author

  • John A. Bateman
Abstract

A general, reusable computational resource has been developed within the Penman text generation project for organizing domain knowledge appropriately for linguistic realization. This resource, called the upper model, provides a domain- and task-independent classification system that supports sophisticated natural language processing while significantly simplifying the interface between domain-specific knowledge and general linguistic resources. This paper presents the results of our experiences in designing and using the upper model in a variety of applications over the past 5 years. In particular, we present our conclusions concerning the appropriate organization of an upper model, its domain-independence, and the types of interrelationships that need to be supported between upper model and grammar and semantics.

Introduction: interfacing with a text generation system

Consider the task of interfacing a domain-independent, reusable, general text generation system with a particular application domain, in order to allow that application to express system-internal information in one or more natural languages. Internal information needs to be related to strategies for expressing it. This could be done in a domain-specific way by coding how the application domain requires its information to appear. This is clearly problematic, however: it requires detailed knowledge on the part of the system builder both of how the generator controls its output forms and of the kinds of information that the application domain contains. A more general solution to the interfacing problem is thus desirable.

We have found that the definition of a mapping between knowledge and its linguistic expression is facilitated if it is possible to classify any particular instances of facts, states of affairs, situations, etc. that occur in terms of a set of general objects and relations of specified types that behave systematically with respect to their possible linguistic realizations. This approach has been followed within the PENMAN text generation system [Mann and Matthiessen, 1985; The Penman Project, 1989] where, over the past 5 years, we have been developing and using an extensive, domain- and task-independent organization of knowledge that supports natural language generation: this level of organization is called the upper model [Bateman et al., 1990; Mann, 1985; Moore and Arens, 1985].

The majority of natural language processing systems currently planned or under development are now recognizing the necessity of some level of abstract 'semantic' organization similar to the upper model that classifies knowledge so that it may be more readily expressed linguistically.¹ However, they mostly suffer from either a lack of theoretical constraint concerning their internal contents and organization and the necessary mappings between them and surface realization, or a lack of abstraction which binds them too closely with linguistic form. It is important both that the contents of such a level of abstraction be motivated on good theoretical grounds and that the mapping between that level and linguistic form be specifiable. Our extensive experiences with the implementation and use of a level of semantic organization of this kind within the PENMAN system now permit us to state some clear design criteria and a well-developed set of necessary functionalities.
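
The classification idea described above can be illustrated with a minimal sketch. It is not PENMAN's implementation, and this excerpt does not list the upper model's actual categories: the category names (`thing`, `object`, `process`, `quality`), the realization labels, and the sample domain concepts are all hypothetical. The sketch only shows the general pattern in which a domain concept is placed beneath a general category and inherits that category's linguistic realization, so the application never has to state how its own concepts are expressed.

```python
# Hypothetical, heavily simplified stand-in for an upper model:
# each category maps to its parent category (None marks the root).
UPPER_MODEL = {
    "thing": None,
    "object": "thing",
    "process": "thing",
    "quality": "thing",
}

# Assumed realization associated with each general category.
REALIZATION = {
    "object": "nominal group",
    "process": "clause",
    "quality": "adjectival group",
}

# Hypothetical application domain: each domain concept is classified
# under one upper-model category and says nothing about grammar.
DOMAIN_MODEL = {
    "disk-drive": "object",
    "reboot": "process",
    "faulty": "quality",
}


def ancestors(concept, hierarchy):
    """Yield the concept and then each superconcept, most specific first."""
    while concept is not None:
        yield concept
        concept = hierarchy[concept]


def realization_for(concept, domain_model):
    """Return the realization inherited from the nearest general category."""
    hierarchy = {**UPPER_MODEL, **domain_model}
    for category in ancestors(concept, hierarchy):
        if category in REALIZATION:
            return REALIZATION[category]
    raise LookupError(f"no realization known for {concept!r}")


if __name__ == "__main__":
    for name in DOMAIN_MODEL:
        print(name, "->", realization_for(name, DOMAIN_MODEL))
```

In this sketch the only domain-specific work is the classification in `DOMAIN_MODEL`; the lookup in `realization_for` is shared across domains, which mirrors the reusability argument made in the text.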
The Upper Model's Contribution to the Solution to the Interface Problem: Domain independence and reusability

The upper model decomposes the mapping problem by establishing a level of linguistically motivated knowledge organization specifically constructed as a response ...

¹ Including, for example: the Functional Sentence Structure of XTRA: [Allgayer et al., 1989]; [Chen and Cha, 1988]; [Dahlgren et al., 1989]; POLYGLOSS: [Emele et al., 1990]; certain of the Domain and Text Structure Objects of SPOKESMAN: [Meteer, 1989]; TRANSLATOR: [Nirenburg et al., 1987]; the Semantic Relations of EUROTRA-D: [Steiner et al., 1987]; JANUS: [Weischedel, 1989]. Space naturally precludes detailed comparisons here: see [Bateman, 1990] for further discussion.

Similar resources

Upper Modeling: A general organization of knowledge for natural language processing

A general, reusable computational resource has been developed within the Penman text generation project for organizing domain knowledge appropriately for linguistic realization. This resource, called the upper model, provides a domain- and task-independent classification system that supports sophisticated natural language processing while significantly simplifying the interface between domain-speci...

Automatic Formal Verification of Conceptual Model Documentation by Means of Self-organizing Map

By using background knowledge of the general and specific domains and by processing a new natural language corpus, experts are able to produce a conceptual model for some specific domain. In this paper we present a model that tries to capture some aspects of this conceptual modeling process. This model is functionally organized into two information processing streams: one reflects the process of f...

Three Lessons in Creating a Knowledge Base to Enable Reasoning, Explanation and Dialog

Our work is driven by the hypothesis that for a program to answer questions, explain the answers, and engage in a dialog just like a human does, it must have an explicit representation of knowledge. Such explicit representations occur naturally in many situations such as engineering designs created by engineers, a software requirement created in unified modeling language or a process flow diagr...

Creating a Knowledge Base to Enable Explanation, Reasoning, and Dialog: Three Lessons

Our work is driven by the hypothesis that, for a program to answer questions, explain the answers, and engage in a dialog just as a human does, it must have an explicit representation of knowledge. Such explicit representations naturally occur in many situations such as in designs created by engineers, software requirements created in a unified modeling language or process flow diagrams created...

From the Generalized Upper Model Towards an Arabic Upper Model

This work introduces the notion of a computational resource for organizing knowledge developed for natural language realization, the Upper Model. The generalized upper model has been implemented mainly for Latin languages. Would such a model be able to support Arabic with minor modifications? A limited number of areas where Arabic and English grammars differ are listed to display possible areas w...

Organizing Information

College of Library and Information Services, University of Maryland, College Park, MD 20742. Organizing information is at the heart of information science and is important in many other areas as well. In bibliographic and similar information systems it involves classification as well as the description of documents or other entities; in database management it is known as dat...

Publication date: 1990