Exploring Long Tail Data in Distantly Supervised Relation Extraction
Abstract
Distant supervision is an efficient approach for various tasks, such as relation extraction. Most recent work on distantly supervised relation extraction generates labeled data by heuristically aligning knowledge bases with text corpora and then trains supervised relation classification models using statistical learning. However, extracting long-tail relations from the automatically labeled data remains challenging, even when large amounts of data are available. Inspired by explanation-based learning (EBL), this paper proposes an EBL-based approach to tackle this problem. The proposed approach can learn relation extraction rules effectively using unlabeled data. Experiments on the New York Times corpus demonstrate that our approach outperforms the baseline approach, especially on long-tail data.
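The labeling heuristic mentioned in the abstract (aligning knowledge base facts with text) can be illustrated with a minimal sketch. This is not the paper's EBL-based method; the toy knowledge base `KB`, the sentences, and the function `distant_label` are all hypothetical names chosen for illustration. The sketch assumes the simplest alignment rule: any sentence containing both entities of a fact is labeled with that fact's relation.

```python
from collections import Counter

# Hypothetical toy knowledge base: (head entity, tail entity) -> relation.
KB = {
    ("Barack Obama", "Honolulu"): "place_of_birth",
    ("Google", "Larry Page"): "founded_by",
    ("Paris", "France"): "capital_of",
}

# Hypothetical unlabeled corpus.
SENTENCES = [
    "Barack Obama was born in Honolulu .",
    "Barack Obama visited Honolulu last week .",  # noisy match: not about birth
    "Google was started by Larry Page and Sergey Brin .",
    "Paris is the capital of France .",
]


def distant_label(sentences, kb):
    """Label each sentence that mentions both entities of a KB fact
    with that fact's relation (simple substring matching)."""
    labeled = []
    for sent in sentences:
        for (head, tail), rel in kb.items():
            if head in sent and tail in sent:
                labeled.append((sent, head, tail, rel))
    return labeled


labeled = distant_label(SENTENCES, KB)

# Count training instances per relation; relations with few instances
# form the long tail of the automatically labeled data.
relation_counts = Counter(rel for _, _, _, rel in labeled)
```

In this toy data the second sentence receives the `place_of_birth` label even though it does not express that relation, which shows the kind of noise the heuristic introduces, and `relation_counts` shows how unevenly instances can be spread across relations.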
Similar resources
Exploring Fine-grained Entity Type Constraints for Distantly Supervised Relation Extraction
Distantly supervised relation extraction, which can automatically generate training data by aligning facts in the existing knowledge bases to text, has gained much attention. Previous work used conjunction features with coarse entity types consisting of only four types to train their models. Entity types are important indicators for a specific relation, for example, if the types of two entities...
Applying UMLS for Distantly Supervised Relation Detection
This paper describes first results using the Unified Medical Language System (UMLS) for distantly supervised relation extraction. UMLS is a large knowledge base which contains information about millions of medical concepts and relations between them. Our approach is evaluated using existing relation extraction data sets that contain relations that are similar to some of those in UMLS.
Combining Distant and Partial Supervision for Relation Extraction
Broad-coverage relation extraction either requires expensive supervised training data, or suffers from drawbacks inherent to distant supervision. We present an approach for providing partial supervision to a distantly supervised relation extractor using a small number of carefully selected examples. We compare against established active learning criteria and propose a novel criterion to sample ...
Bootstrapping Distantly Supervised IE Using Joint Learning and Small Well-Structured Corpora
We propose a framework to improve the performance of distantly-supervised relation extraction, by jointly learning to solve two related tasks: concept-instance extraction and relation extraction. We further extend this framework to make a novel use of document structure: in some small, well-structured corpora, sections can be identified that correspond to relation arguments, and distantly-labele...
Deep Residual Learning for Weakly-Supervised Relation Extraction
Deep residual learning (ResNet) (He et al., 2016) is a new method for training very deep neural networks using identity mapping for shortcut connections. ResNet has won the ImageNet ILSVRC 2015 classification task, and achieved state-of-the-art performance in many computer vision tasks. However, the effect of residual learning on noisy natural language processing tasks is still not well underst...
Publication date: 2016