A Real Time Data Extraction, Transformation and Loading Solution for Semi-structured Text Files
Abstract
Space application users have relied for decades on custom-developed software tools that address short-term needs during critical spacecraft control periods. Advances in computing power and storage have made it possible to develop innovative decision support systems that provide high-quality integrated data to both near-real-time and historical data analysis applications. This paper describes the implementation of a new approach to a distributed, loosely coupled data extraction and transformation solution that extracts, transforms and loads relevant real-time and historical Space Weather and spacecraft data from semi-structured text files into an integrated space-domain decision support system. The described solution takes advantage of XML and Web Service technologies and is currently running in an operational environment at the European Space Agency as part of the Space Environment Information System for Mission Control Purposes (SEIS) project.
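The extract, transform and load steps named in the abstract can be illustrated with a minimal sketch. This is not the SEIS implementation: the sample file layout, the `-1` missing-value marker, the function names (`extract`, `transform`, `load`) and the XML element names are all assumptions made for illustration, and the Web Service layer is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical semi-structured input: comment-style header lines followed by
# whitespace-separated rows (year, month, day, HHMM, value). Assumed layout.
SAMPLE = """\
# Source: hypothetical space weather report
# Product: planetary K-index
2005 01 01  0000   3
2005 01 01  0300   4
bad line
2005 01 01  0600  -1
"""


def extract(text):
    """Split raw text into header comments and tokenized data rows."""
    header, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            header.append(line.lstrip("# ").strip())
        else:
            rows.append(line.split())
    return header, rows


def transform(rows, missing="-1"):
    """Keep well-formed rows, drop missing values, build ISO-style timestamps."""
    records = []
    for fields in rows:
        # Skip rows that do not match the assumed five-integer layout.
        if len(fields) != 5 or not all(f.lstrip("-").isdigit() for f in fields):
            continue
        year, month, day, hhmm, value = fields
        if len(hhmm) != 4 or value == missing:
            continue
        timestamp = f"{year}-{month}-{day}T{hhmm[:2]}:{hhmm[2:]}:00Z"
        records.append({"time": timestamp, "value": int(value)})
    return records


def load(records, parameter):
    """Serialize records to an XML string (stand-in for the real load target)."""
    root = ET.Element("dataset", parameter=parameter)
    for rec in records:
        ET.SubElement(root, "sample", time=rec["time"]).text = str(rec["value"])
    return ET.tostring(root, encoding="unicode")


header, rows = extract(SAMPLE)
records = transform(rows)
xml_doc = load(records, parameter="Kp")
print(xml_doc)
```

Keeping the three stages as separate functions mirrors the loosely coupled design the abstract describes: each stage could, in principle, run as an independent service that exchanges XML with the next.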
Similar resources
Extraction and transformation of data from semi-structured text files using a declarative approach
The World Wide Web is a major source of textual information in a human-readable, semi-structured format, covering multiple domains, some of them highly complex. Traditional ETL approaches, which rely on developing specific source code for each data source and on multiple interactions between domain and computer-science experts, become an inadequate solution, time-consuming and prone to erro...
Design and Test of the Real-time Text mining dashboard for Twitter
One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in datasets that are being produced at high speed, in large volumes and in a wide variety of formats. Data with such features are called big data. Extracting, processing, and visualizing this huge amount of data has become one of the concerns of data science scholar...
Extraction, Transformation, and Loading
DEFINITION Extraction, Transformation, and Loading (ETL) processes are responsible for the operations taking place in the back stage of a data warehouse architecture. In a high level description of an ETL process, first, the data are extracted from the source data stores that can be On-Line Transaction Processing (OLTP) or legacy systems, files under any format, web pages, various kinds of docu...
A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model
Information extraction (IE) is the process of automatically producing a structured representation from unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP), intensified by the growing volume, heterogeneity, and non-structured form of information. One of the core information extraction tasks is relation extraction wh...
Navigating the Data Lake: Unsupervised Structure Extraction for Text-formatted Data
Many organizations routinely accumulate automatically generated semi-structured log file datasets; these datasets remain unused and waste storage space, a phenomenon termed the “data lake” problem. One approach to putting these datasets to use is to convert them into a structured relational format, after which they can be analyzed in conjunction with other datasets. To address thi...
Publication date: 2005