Intelligent information extraction from scholarly document databases
Similar resources
Information Extraction from Symbolically Compressed Document Images
The extraction of information from symbolically compressed document images is an increasingly important problem as the related standard (JBIG2) and commercial products become available. Symbolic compression techniques work by clustering individual connected components (blobs) in a document image and storing the sequence of occurrence of blobs and representative blob templates, hence t...
Information Extraction from Multi-Document Threads
Information extraction (IE) is the task of extracting fragments of important information from natural language documents. Most IE research involves algorithms for learning to exploit regularities inherent in the textual information and language use, and such systems generally assume that each document can be processed in isolation. We are extending IE techniques to multi-document extraction tas...
OCR++: A Robust Framework For Information Extraction from Scholarly Articles
This paper proposes OCR++, an open-source framework designed for a variety of information extraction tasks from scholarly articles including metadata (title, author names, affiliation and e-mail), structure (section headings and body text, table and figure headings, URLs and footnotes) and bibliography (citation instances and references). We analyze a diverse set of scientific articles written ...
Big Scholarly Data in CiteSeerX: Information Extraction from the Web
We examine CiteSeerX, an intelligent system designed with the goal of automatically acquiring and organizing large-scale collections of scholarly documents from the World Wide Web. From the perspective of automatic information extraction and modes of alternative search, we examine various functional aspects of this complex system in order to investigate and explore ongoing and future research de...
Information Extraction: Beyond Document Retrieval
In this paper we give a synoptic view of the growth of the text processing technology of information extraction (IE), whose function is to extract information about a pre-specified set of entities, relations or events from natural language texts and to record this information in structured representations called templates. Here we describe the nature of the IE task, review the history of the area from i...
Journal
Journal title: Journal of Intelligence Studies in Business
Year: 2020
ISSN: 2001-015X,2001-0168
DOI: 10.37380/jisib.v10i2.584