OCMiner: Text Processing, Annotation and Relation Extraction for the Life Sciences

Authors

  • Timo Böhme
  • Matthias Irmer
  • Anett Püschel
  • Claudia Bobach
  • Ulf Laube
  • Lutz Weber
Abstract

We present OCMiner, a high-performance text processing system for large document collections of scientific publications. Several linguistic options allow adjusting the quality of annotation results, which can be specialized and fine-tuned for the recognition of Life Science terms. Recognized terms are mapped to semantic concepts, which are ontologically located within their respective domain taxonomies. Relying on correct identification and semantic interpretation of mentions of domain concepts, relations between entities are extracted. The annotated text, as well as the extracted knowledge triples, can be visualized in a web-based front-end at http://www.ocminer.com/, permitting exploratory information retrieval.

Similar resources

OCMiner for Patents. Extracting Chemical Information from Patent Texts

This paper describes OCMiner, a high-performance semantic text processing system for large document collections of scientific publications, and its performance regarding chemical named entity recognition in patent texts within the BioCreative V CHEMDNER-Patents challenge which was set up for this purpose. OCMiner permits adjusting the quality of annotation results by several linguistic options,...

Adapting the OCMiner text processing system to the CTD controlled vocabulary

We adapted OCMiner, a modular text processing pipeline especially suited for high-speed processing of large document collections, to a specific controlled vocabulary as given by the Comparative Toxicogenomics Database (CTD). We provide a RESTful web service which processes documents given in the BioCreative XML format and annotates them with domain-specific terms from the CTD domains genes, chemi...

Fuzzy Neighbor Voting for Automatic Image Annotation

With the rapid development of digital images and the availability of imaging tools, massive amounts of images are created. Therefore, efficient management and suitable retrieval, especially by computers, is one of the most challenging fields in image processing. Automatic image annotation (AIA) refers to attaching words, keywords or comments to an image or to a selected part of it. In this paper,...

A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model

Information extraction (IE) is the process of automatically producing a structured representation from unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) which has been intensified by the growing volume, heterogeneity, and unstructured form of information. One of the core information extraction tasks is relation extraction wh...

EXTRACTION-BASED TEXT SUMMARIZATION USING FUZZY ANALYSIS

Due to the explosive growth of the world-wide web, automatic text summarization has become an essential tool for web users. In this paper we present a novel approach for creating text summaries. Using fuzzy logic and word-net, our model extracts the most relevant sentences from an original document. The approach utilizes fuzzy measures and inference on the extracted textual information from the docu...

Publication date: 2014