Mix 'n' match multi-engine analytics
Abstract
Current platforms fail to cope efficiently with the data and task heterogeneity of modern analytics workflows because they adhere to a single data and/or compute model. As a remedy, we present IReS, the Intelligent Resource Scheduler for complex analytics workflows executed over multi-engine environments. IReS optimizes a workflow with respect to a user-defined policy, relying on cost and performance models of the required tasks over the available platforms. This optimization consists of allocating distinct workflow parts to the most advantageous execution and/or storage engine among the available ones and deciding on the exact amount of resources to provision. Our current prototype supports 5 compute and 3 data engines, and new ones can easily be added to IReS by virtue of its engine-agnostic mechanisms. Our extensive experimental evaluation confirms that IReS speeds up diverse and realistic workflows by up to 30% compared to their optimal single-engine plan by automatically scattering parts of them to different execution engines and datastores. Its optimizer adds only marginal overhead to workflow execution, discovering the optimal execution plan within a few seconds, even for large-scale workflow instances.
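The idea of allocating workflow parts to the most advantageous engine can be illustrated with a minimal sketch. It is not the IReS optimizer: it assumes a strictly linear workflow, a single scalar cost per (task, engine) pair, and a flat penalty for moving intermediate data between engines. The function name `plan_chain`, the engine names, and all cost figures are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def plan_chain(
    tasks: List[str],
    cost: Dict[str, Dict[str, float]],
    move_cost: float,
) -> Tuple[float, List[str]]:
    """Pick one engine per task of a linear workflow, minimising total estimated cost.

    cost[task][engine] is a (hypothetical) model estimate for running the task
    on that engine; move_cost is charged whenever consecutive tasks run on
    different engines.
    """
    # best[e] = (cheapest total so far with the current task on engine e, plan)
    best = {e: (c, [e]) for e, c in cost[tasks[0]].items()}
    for task in tasks[1:]:
        nxt = {}
        for e, c in cost[task].items():
            # Cheapest predecessor placement, including the data-movement penalty.
            prev_e, (prev_c, prev_plan) = min(
                best.items(),
                key=lambda kv: kv[1][0] + (0 if kv[0] == e else move_cost),
            )
            total = prev_c + (0 if prev_e == e else move_cost) + c
            nxt[e] = (total, prev_plan + [e])
        best = nxt
    return min(best.values(), key=lambda v: v[0])


# Hypothetical cost estimates for a three-step workflow on two engines.
tasks = ["join", "cluster", "rank"]
cost = {
    "join": {"engine_a": 10, "engine_b": 4},
    "cluster": {"engine_a": 3, "engine_b": 9},
    "rank": {"engine_a": 5, "engine_b": 5},
}
total, plan = plan_chain(tasks, cost, move_cost=2)
print(total, plan)
```

With these made-up numbers, either single-engine plan costs 18, while the mixed plan runs the join on `engine_b`, pays one transfer, and finishes on `engine_a` for a total of 14. A real optimizer would additionally handle general workflow graphs, multiple objectives from the user policy, and resource provisioning.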
Similar resources
Robust and Adaptive Multi-Engine Analytics using IReS
The complexity of Big Data analytics has long outreached the capabilities of current platforms, which fail to efficiently cope with the data and task heterogeneity of modern workflows due to their adhesion to a single data and/or compute model. As a remedy, we demonstrate IReS, the Intelligent Resource Scheduler for complex analytics workflows executed over multi-engine environments. IReS is ab...
The Case for Multi-Engine Data Analytics
As big data analytics have become an important driver for ICT development, a large variety of approaches that apply these advanced technologies on a wide spectrum of applications has been introduced. In this paper we argue on the need of a multi-engine environment that will exploit the largely different models, cost and quality of the existing analytics engines. Such an environment further requ...
Collaborative Interfaces for Data Composition and Visualisation
This research proposes the development of interfaces to support collaborative, community-driven inquiry into data, which we refer to as Participatory Data Analytics. Since the investigation is led by local communities, it is not possible to anticipate which data will be relevant and what questions are going to be asked. Therefore, users have to be able to construct and tailor visualisations to ...
Multi-Dimensional Simulation of n-Heptane Combustion under HCCI Engine Condition Using Detailed Chemical Kinetics
In this study, an in-house multi-dimensional code has been developed which simulates the combustion of n-heptane in a Homogeneous Charge Compression Ignition (HCCI) engine. It couples the flow field computations with detailed chemical kinetic scheme which involves the multi-reactions equations. A chemical kinetic scheme solver has been developed and coupled for solving the chemical reactions an...
Optimizing, Planning and Executing Analytics Workflows over Multiple Engines
Big data analytics have become a necessity to businesses worldwide. The complexity of the tasks they execute is ever increasing due to the surge in data and task heterogeneity. Current analytics platforms, while successful in harnessing multiple aspects of this “data deluge”, bind their efficacy to a single data and compute model and often depend on proprietary systems. However, no single execu...
Publication date: 2016