Tasks, Domains, and Languages for Information Extraction

Authors

  • Boyan A. Onyshkevych
  • Mary Ellen Okurowski
  • Lynn Carlson
Abstract

The information extraction tasks for the ARPA TIPSTER program center on automatically filling object-oriented data structures, called templates, with information extracted from free text in news stories (for discussion of templates and objects, see "Template Design for Information Extraction" in this volume). With text as input, the TIPSTER systems first detect whether the text contains relevant information. If so, the systems extract specific instances of generic types of information that correspond to each slot in the template and output that information by filling the template slots in an appropriate data representation. These slots are then scored by using an automatic scoring program with templates produced by human analysts that serve as answer keys. Human analysts also prepared development set templates for each domain, which served as training models for system developers (for discussion of the data preparation effort, see "Corpora and Data Preparation for Information Extraction" in this volume).
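The fill-then-score loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the flattened `Template` class, the slot names, and exact-match precision and recall as the comparison metric. The abstract does not specify how the TIPSTER scoring program compares a system's fills with the answer key, so this is not a reconstruction of that program.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Template:
    """Hypothetical flattened template: slot name -> set of string fills."""
    slots: Dict[str, Set[str]] = field(default_factory=dict)


def score(response: Template, key: Template) -> Tuple[float, float]:
    """Return (precision, recall) over slot fills, using exact string match.

    Assumption: a fill counts as correct only if the same string appears
    in the same slot of the analyst-produced answer key.
    """
    correct = produced = expected = 0
    for slot in set(response.slots) | set(key.slots):
        system_fills = response.slots.get(slot, set())
        key_fills = key.slots.get(slot, set())
        correct += len(system_fills & key_fills)
        produced += len(system_fills)
        expected += len(key_fills)
    precision = correct / produced if produced else 0.0
    recall = correct / expected if expected else 0.0
    return precision, recall


# Illustrative data only; the slot names and fills are invented.
answer_key = Template({"company": {"Acme Corp"},
                       "location": {"Tokyo"},
                       "date": {"1993"}})
system_output = Template({"company": {"Acme Corp"},
                          "location": {"Osaka"}})

precision, recall = score(system_output, answer_key)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here the system fills two slots and gets one right, while the answer key holds three fills, so precision and recall diverge. A real template of the kind the abstract describes would be object-oriented, with slots that point to other objects, which this flat dictionary does not model.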


Similar resources

Person Name Recognition Using Injection of Name-Candidate Words into Conditional Random Fields for Arabic

Named entity recognition and extraction are important tasks for discovering proper names, including persons, locations, dates, and times, in electronic textual resources. An accurate named entity recognition system is an essential utility for resolving fundamental problems in question answering systems, summary extraction, information retrieval and extraction, machine translation, video interpr...


Tasks, domains, and languages

The Fifth Message Understanding Conference (MUC-5) involved the same tasks, domains and languages as the information extraction portion of the ARPA TIPSTER program. These tasks center on automatically filling object-oriented data structures, called templates, with information extracted from free text in news stories (for discussion of templates and objects, see "Template Design for Informati...


Applying Stratosphere for Big Data Analytics

Analyzing big data sets as they occur in modern business and science applications requires query languages that allow for the specification of complex data processing tasks. Moreover, these ideally declarative query specifications have to be optimized, parallelized and scheduled for processing on massively parallel data processing platforms. This paper demonstrates the application of Stratosphe...


Automatic Multi-Lingual Information Extraction

Information Extraction (IE) is a burgeoning technique because of the explosive growth of the internet. So far, most IE systems focus on English text; most use a supervised learning framework, which requires a large amount of human labor; and most work only in a narrow domain, which makes them domain dependent. These systems are difficult to port to other languages, other...


Exploring the Relationship between Life Quality and Speaking Ability of Iranian Intermediate EFL Learners

Despite its direct relevance to second/foreign language learning, quality of life has been a neglected area within Second Language Acquisition (SLA) research. The present study sought to investigate the relationship between quality of life factors and speaking skill as one of the most challenging parts of L2 learning. To this end, an adapted version of life quality questionnaire originally devi...




Publication date: 1993