TimeML-Compliant Analysis of Text Documents

Authors

  • Branimir K. Boguraev
  • Rie K. Ando
Abstract

Reasoning with temporal information requires a representation of time considerably more involved than just a list of temporal expressions (which typically define the extent of current time extraction efforts). TimeML is an emerging standard for temporal annotation, defining a language for expressing properties and relationships among time-denoting expressions and events in free text. This paper takes the position that TimeML is a good starting point for bridging the gap between temporal analysis of documents and reasoning with information derived from these documents. TimeML-compliant analysis is hard, and the task is made even harder by the small size of the only annotated corpus available to date. To address this and related challenges, we have developed and implemented a hybrid TimeML annotator, which uses cascaded finite-state grammars (for temporal expression analysis, shallow syntactic parsing, and feature generation) together with a machine learning component capable of effectively using large amounts of unannotated data. We motivate our mixed strategy; this is work in progress, and we report interim results on the first effort to use the TIMEBANK corpus for building an operational TimeML analyser.

This work was supported by the Advanced Research and Development Activity under the Novel Intelligence and Massive Data (NIMD) program PNWD-SW-6059.

Similar resources

Analysis of TimeBank as a Resource for TimeML Parsing

We present an analysis of the TimeBank corpus—the only available reference for TimeML-compliant annotation—from the point of view of its utility as a training resource for developing automated TimeML annotators. Experimental results indicative of the potential of TimeBank are encouraging; at the same time, closer inspection of causes for some systematic errors shows certain deficiencies in the ...

TimeBank-Driven TimeML Analysis

The design of TimeML as an expressive language for temporal information brings promises, and challenges; in particular, its representational properties raise the bar for traditional information extraction methods applied to the task of text-to-TimeML analysis. A reference corpus, such as TimeBank, is an invaluable asset in this situation; however, certain characteristics of TimeBank—size and co...

Recognition of Polish Temporal Expressions

In this article we present the results of recent research on the recognition of Polish temporal expressions. The temporal information extracted from the text plays a major role in many information extraction systems, such as question answering, event recognition or discourse analysis. We prepared a broad description of Polish temporal expressions, called PLIMEX. It is based on the state-of-the-ar...

TimeML-strict: clarifying temporal annotation

TimeML is an XML-based schema for annotating temporal information over discourse. The standard has been used to annotate a variety of resources and is followed by a number of tools, the creation of which constitutes hundreds of thousands of man-hours of research work. However, the current state of resources is such that many are not valid, or do not produce valid output, or contain ambiguous or ...

Analysing Temporally Annotated Corpora with CAVaT

We present CAVaT, a tool that performs Corpus Analysis and Validation for TimeML. CAVaT is an open source, modular checking utility for statistical analysis of features specific to temporally-annotated natural language corpora. It provides reporting, highlights salient links between a variety of general and time-specific linguistic features, and also validates a temporal annotation to ensure th...

Publication date: 2004