KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation

Abstract

Pre-trained language representation models (PLMs) cannot well capture factual knowledge from text. In contrast, knowledge embedding (KE) methods can effectively represent the relational facts in knowledge graphs (KGs) with informative entity embeddings, but conventional KE models cannot take full advantage of the abundant textual information. In this paper, we propose a unified model for Knowledge Embedding and Pre-trained LanguagE Representation (KEPLER), which can not only better integrate factual knowledge into PLMs but also produce effective text-enhanced KE with the strong PLMs. In KEPLER, we encode textual entity descriptions with a PLM as their embeddings, and then jointly optimize the KE and language modeling objectives. Experimental results show that KEPLER achieves state-of-the-art performances on various NLP tasks, and also works remarkably well as an inductive KE model on KG link prediction. Furthermore, for pre-training and evaluating KEPLER, we construct Wikidata5M, a large-scale KG dataset with aligned entity descriptions, and benchmark state-of-the-art KE methods on it. It shall serve as a new KE benchmark and facilitate the research on large KG, inductive KE, and KG with text. The source code can be obtained from https://github.com/THU-KEG/KEPLER.
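The joint objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `encode_text` is a hypothetical stand-in for a PLM encoder, the TransE-style score and the negative-sampling loss are assumptions chosen for illustration, and the language modeling loss is passed in as a placeholder number instead of being computed.

```python
import math
import random

DIM = 8  # toy embedding size (assumption, chosen only for illustration)

def encode_text(text, dim=DIM):
    """Hypothetical stand-in for a PLM encoder.

    Averages deterministic pseudo-random word vectors so that the same
    description always maps to the same embedding.
    """
    words = text.lower().split()
    vec = [0.0] * dim
    for word in words:
        rng = random.Random(word)  # seeding by word keeps vectors reproducible
        for i in range(dim):
            vec[i] += rng.uniform(-1.0, 1.0)
    return [v / max(len(words), 1) for v in vec]

def transe_score(h, r, t, gamma=4.0):
    """TransE-style plausibility: gamma - ||h + r - t||_2 (higher is better)."""
    dist = math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))
    return gamma - dist

def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def ke_loss(h, r, t, negative_tails, gamma=4.0):
    """Negative-sampling KE loss over one true triple and corrupted tails."""
    positive = -log_sigmoid(transe_score(h, r, t, gamma))
    negative = -sum(
        log_sigmoid(-transe_score(h, r, n, gamma)) for n in negative_tails
    ) / len(negative_tails)
    return positive + negative

def joint_loss(ke, mlm):
    """Joint objective: sum of the KE loss and the language modeling loss."""
    return ke + mlm

# Entity embeddings come from textual descriptions, not from a lookup table.
head = encode_text("Johannes Kepler was a German astronomer and mathematician")
tail = encode_text("Germany is a country in Central Europe")
neg = encode_text("A banana is an elongated edible fruit")
relation = [0.0] * DIM  # placeholder relation vector (assumption)

# 2.3 is a made-up language modeling loss value used only to show the sum.
loss = joint_loss(ke_loss(head, relation, tail, [neg]), mlm=2.3)
```

Because entity embeddings are computed from descriptions, an entity unseen during training can still be embedded from its text, which is what makes the inductive link prediction setting mentioned in the abstract possible.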

Similar resources

Knowledge representation and indexing using the unified medical language system.

Ontologies and semantic frameworks can be used to improve the accuracy and expressiveness of natural language processing for the purpose of extracting meaning from technical documents. This is especially true when a rich ontology such as the Unified Medical Language System (UMLS) is available. This paper reports on some tools being developed to make this possible and on some experience with a u...

A Pre-Trained Ensemble Model for Breast Cancer Grade Detection Based on Small Datasets

Background and Purpose: Nowadays, breast cancer is reported as one of the most common cancers amongst women. Early detection of the cancer type is essential to aid in informing subsequent treatments. The newest proposed breast cancer detectors are based on deep learning. Most of these works focus on large datasets and are not developed for small datasets. Although the large datasets might lead ...

A Lexical Knowledge Representation Model for Natural Language Understanding

Knowledge representation is essential for semantics modeling and intelligent information processing. For decades researchers have proposed many knowledge representation techniques. However, it is a daunting problem how to capture deep semantic information effectively and support the construction of a large-scale knowledge base efficiently. This paper describes a new knowledge representation mod...

Knowledge Semantic Representation: A Generative Model for Interpretable Knowledge Graph Embedding

Knowledge representation is a critical topic in AI, and currently embedding as a key branch of knowledge representation takes the numerical form of entities and relations to joint the statistical models. However, most embedding methods merely concentrate on the triple fitting and ignore the explicit semantic expression, leading to an uninterpretable representation form. Thus, traditional embedd...

Unified Representation for E-government Knowledge Management

Every governmental office requires knowledge in order to increase its effectiveness. Hence, there is a need for a representation scheme to represent all types of governmental knowledge: factual, terminological, inferential, and regulated knowledge. Such a scheme must be Web-based for use in e-government. Among the many Web-based representation schemes, none can represent all types o...

Journal

Journal title: Transactions of the Association for Computational Linguistics

Year: 2021

ISSN: 2307-387X

DOI: https://doi.org/10.1162/tacl_a_00360