IBEnt: Chemical Entity Mentions in Patents using ChEBI

Authors

  • Andre Lamurias
  • Luis F. Campos
  • Francisco M. Couto
Abstract

This article presents our approach to the CEMP task of BioCreative V.5: we used our system, IBEnt, to identify chemical entity mentions in patents through machine learning and semantic similarity techniques. The features combine the results of a CRF classifier, two lexical matching methods (FiGO and MER), and semantic similarity measures over the ChEBI ontology. We also tested MER on its own, without the machine learning approach. Combining these techniques, we submitted five runs for evaluation. The machine learning approach with lexical and semantic similarity features gave the better results: its best F-score was 0.8541, while the MER system obtained 0.5967.
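To make the idea of lexical matching concrete, the following is a minimal sketch of dictionary-based matching against a small term list. It is not the FiGO or MER implementation, whose internals the abstract does not describe; the function name `match_lexicon`, the sample sentence, and the three-term lexicon are all hypothetical and chosen only for illustration.

```python
import re

def match_lexicon(text, lexicon):
    """Return sorted (start, end, term) tuples for case-insensitive whole-word hits.

    Illustrative only: a real system would draw its terms from an ontology
    such as ChEBI; here the lexicon is a hand-written toy set.
    """
    hits = []
    # Try longer terms first so that "acetic acid" wins over the nested "acid".
    for term in sorted(lexicon, key=len, reverse=True):
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            # Keep the hit only if it does not overlap an accepted match.
            if all(m.end() <= s or m.start() >= e for s, e, _ in hits):
                hits.append((m.start(), m.end(), term))
    return sorted(hits)

# Hypothetical example sentence and lexicon.
text = "The compound was dissolved in acetic acid and ethanol."
lexicon = {"acetic acid", "acid", "ethanol"}
mentions = match_lexicon(text, lexicon)
for start, end, term in mentions:
    print(start, end, text[start:end])
```

In a pipeline like the one the abstract outlines, hits of this kind would be one input among several, alongside the CRF output and the semantic similarity scores; how IBEnt actually combines them is not stated here.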

Similar resources

NERChem: adapting NERBio to chemical patents via full-token features and named entity feature with chemical sub-class composition

Chemical patents contain detailed information on novel chemical compounds that is valuable to the chemical and pharmaceutical industries. In this paper, we introduce a system, NERChem, that can recognize chemical named entity mentions in chemical patents. NERChem is based on the conditional random fields (CRF) model. Our approach incorporates (1) class composition, which is used for combining ch...

Overview of the CHEMDNER patents task

A considerable effort has been made to extract biological and chemical entities, as well as their relationships, from the scientific literature, either manually through traditional literature curation or by using information extraction and text mining technologies. Medicinal chemistry patents contain a wealth of information, for instance to uncover potential biomarkers that might play a role in...

Identifying Chemical Entities based on ChEBI

This software demonstration paper presents Identifying Chemical Entities (ICE), a platform composed of algorithms for chemical entity recognition, entity resolution to a reference database, namely ChEBI, and validation using chemical semantic similarity. It aims to provide users with an improved display of entity recognition results, exposing outliers which are possible recognition errors a...

Evaluation of chemical and gene/protein entity recognition systems at BioCreative V.5: the CEMP and GPRO patents tracks

This paper presents the results of the BioCreative V.5 offline tasks, which evaluate the performance of, and assess the progress made by, strategies for the automatic recognition of mentions of chemical names and genes in the running text of medicinal chemistry patent abstracts. A total of 21 teams submitted results for at least one of these tasks. The CEMP (chemical entity mention i...

LASIGE: using Conditional Random Fields and ChEBI ontology

To participate in the SemEval 2013 challenge on the recognition and classification of drug names, we adapted our chemical entity recognition approach, which consists of Conditional Random Fields for recognizing chemical terms and lexical similarity for entity resolution to the ChEBI ontology. We obtained promising results, with a best F-measure of 0.81 for the partial matching task when using post-pr...

Publication date: 2017