Adapting Open Information Extraction to Domain-Specific Relations
Authors
Abstract
dent processes and domain-specific knowledge. Until recently, information extraction has leaned heavily on domain knowledge, which requires either manual engineering or manual tagging of examples (Miller et al. 1998; Soderland 1999; Culotta, McCallum, and Betz 2006). Semisupervised approaches (Riloff and Jones 1999; Agichtein and Gravano 2000; Rosenfeld and Feldman 2007) require only a small amount of hand-annotated training data, but they require it for every relation of interest. This still presents a knowledge engineering bottleneck when one considers the unbounded number of relations in a diverse corpus such as the web. Shinyama and Sekine (2006) explored unsupervised relation discovery using a clustering algorithm with good precision but limited scalability. The KnowItAll research group is a pioneer of a new paradigm, Open IE (Banko et al. 2007; Banko and Etzioni 2008), that operates in a totally domain-independent manner and at web scale. An Open IE system makes a single pass over its corpus and extracts a diverse set of relational tuples without requiring any relation-specific human input. Open IE is ideally suited to corpora such as the web, where the target relations are not known in advance and their number is massive.
Similar resources
A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model
Information extraction (IE) is the process of automatically producing a structured representation from unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) that has been intensified by the growing volume, heterogeneity, and unstructured form of information. One of the core information extraction tasks is relation extraction wh...
Domain Adaptive Information Extraction From Text
An information extraction system is designed to operate over a specific domain, and cannot be applied to new domains without being adapted if it is to perform well. We will investigate the problem of adapting information extraction systems to new domains by first defining the task of information extraction and giving an example of an information extraction system. We will then outline the modul...
Automatic Discovery of Linguistic Patterns for Information Extraction
Information Extraction (IE) systems typically rely on extraction patterns encoding domain-specific knowledge. When matched against natural language texts, these patterns recognize with high accuracy information relevant to the extraction task. Adapting an IE system to a new extraction scenario entails devising a new collection of extraction patterns, a time-consuming and expensive process. To ov...
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
Selecting Domain-Specific Concepts for Question Generation With Lightly-Supervised Methods
In this paper we propose content selection methods for question generation (QG) which exploit domain knowledge. Traditionally, QG systems apply syntactical transformation on individual sentences to generate open domain questions. We hypothesize that a QG system informed by domain knowledge can ask more important questions. To this end, we propose two lightly-supervised methods to select salient...
Journal title:
- AI Magazine
Volume 31, Issue
Pages -
Publication date 2010