Query Performance Prediction for Analytical Workloads

Author

  • Jennie Duggan
Abstract

Abstract of “Query Performance Prediction for Analytical Workloads” by Jennie Duggan, Ph.D., Brown University, May 2013.

Modeling the complex interactions that arise when query workloads share computing resources and data is challenging, yet critical for a number of tasks such as Quality of Service (QoS) management in emerging cloud-based database platforms, effective resource allocation for time-sensitive processing tasks, and user-experience management for interactive systems. In our work, we develop practical models for query performance prediction (QPP) for heterogeneous, concurrent query workloads in analytical databases. Specifically, we propose and evaluate several learning-based solutions for QPP. We first address QPP for static workloads that originate from well-known query classes. Then, we propose a more general solution for dynamic, ad hoc workloads. Finally, we address the issue of generalizing QPP to different hardware platforms, such as those available from cloud-service providers. Our solutions use a combination of isolated and concurrent query execution samples, as well as new query workload features and metrics that capture how different query classes behave under various levels of resource availability and contention. We implemented our solutions on top of PostgreSQL and evaluated them experimentally by quantifying their effectiveness for analytical data and workloads, represented by the established benchmark suites TPC-H and TPC-DS. The results show that learning-based QPP can be both feasible and effective for many static and dynamic workload scenarios.

Similar resources

Contender: A Resource Modeling Approach for Concurrent Query Performance Prediction

Predicting query performance under concurrency is a difficult task that has many applications in capacity planning, cloud computing, and batch scheduling. We introduce Contender, a new resource-modeling approach for predicting the concurrent query performance of analytical workloads. Contender’s unique feature is that it can generate effective predictions for both static and ad hoc or dyna...

Towards the Study of Performance Trade-offs Between Materialized and Virtual Integrated Views

Consider the problem of supporting an integrated view over multiple databases. The traditional approach is to use a virtual view, but recent investigations are proposing to use a materialized view, or a hybrid virtual/materialized view. This paper initiates an investigation into the performance trade-offs along this spectrum of choices. In particular, the paper develops analytical models for pre...

DWPPT: Data Warehouse Performance Prediction Tool

The increasing demand from users for interactive response times makes query performance one of the central problems of data warehouse systems today. Performance is an important quality aspect of data warehouse systems. Predicting the performance of data warehouse systems during the early design stages of their development is significant. Software Performance Engineering (SPE) promotes the idea t...

Improving document retrieval according to prediction of query difficulty

Our experiments in the Robust track this year focused on predicting query difficulty and using this prediction for improving information retrieval. We developed two prediction algorithms and used the subsequent prediction in several ways in order to improve the performance of the search engine. These included modifying the search engine parameters, using selective query expansion, and switching...

An Empirical Study of Query Specificity

We analyse the statistical behavior of query-associated quantities in query-logs, namely, the sum and mean of IDF of query terms, otherwise known as query specificity and query mean specificity. We narrow down the possibilities for modeling their distributions to gamma, log-normal, or log-logistic, depending on query length and on whether the sum or the mean is considered. The results have appl...

Publication date: 2012