Relabeling Distantly Supervised Training Data for Temporal Knowledge Base Population
Authors
Abstract
We enhance a temporal knowledge base population system to improve the quality of distantly supervised training data and identify a minimal feature set for classification. The approach uses multi-class logistic regression to eliminate individual features based on the strength of their association with a temporal label, followed by semi-supervised relabeling using a subset of human annotations and lasso regression. As implemented in this work, our technique improves performance and incurs notably lower computational cost than a parallel system trained on the full feature set.
Similar resources
A distant supervised learning system for the TAC-KBP Slot Filling and Temporal Slot Filling Tasks
This paper describes the system implemented by the NLP GROUP AT UNED for our first participation in the Knowledge Base Population at the Text Analysis Conference (TAC-KBP). For this Slot Filling Task, our approach was to design a distant supervised learning system, which was then specialized for the Regular Slot Filling and Full Temporal Slot Filling subtasks. From the initial Knowledge Base and...
Minimally Supervised Event Argument Extraction using Universal Schema
The prediction of events and their participants is an important component of building a knowledge base automatically from text. Typically, the events of interest are domain-specific and not known in advance, and so it is often the case that little or no training data is available to learn the appropriate predictors. In this work, we propose a technique for distantly supervised event argument ex...
Stanford's Distantly-Supervised Slot-Filling System
This paper describes the design and implementation of the slot filling system prepared by Stanford’s natural language processing group for the 2011 Knowledge Base Population (KBP) track at the Text Analysis Conference (TAC). Our system relies on a simple distant supervision approach using mainly resources furnished by the track’s organizers: we used slot examples from the provided knowledge bas...
Applying UMLS for Distantly Supervised Relation Detection
This paper describes first results using the Unified Medical Language System (UMLS) for distantly supervised relation extraction. UMLS is a large knowledge base which contains information about millions of medical concepts and relations between them. Our approach is evaluated using existing relation extraction data sets that contain relations that are similar to some of those in UMLS.
Distantly Labeling Data for Large Scale Cross-Document Coreference
Cross-document coreference, the problem of resolving entity mentions across multi-document collections, is crucial to automated knowledge base construction and data mining tasks. However, the scarcity of large labeled data sets has hindered supervised machine learning research for this task. In this paper we develop and demonstrate an approach based on “distantly-labeling” a data set from which...
Publication date: 2012