A MapReduce Relational-Database Index-Selection Tool
Not recorded
Abstract
The physical design of data storage is a critical administrative task for optimizing system performance, and proper index selection is a fundamental aspect of that design. Index-selection optimization has been widely studied in Database Management Systems (DBMSs). However, current DBMSs are not appropriate platforms for much of today's data, and several systems have been developed to handle it. These systems still need an index-selection optimization approach; the need is even greater because they process Big Data. Developing an index-selection tool for large-scale systems is therefore a vital requirement. This thesis focuses on the index-selection process in HadoopDB. Its main contribution is a tool that uses data mining techniques to recommend an optimal index-set configuration. Evaluation shows a significant improvement in task running time with the index-set configuration recommended by the tool.
Similar resources
Index Selection in Relational Databases
Intending to develop a tool which aims to support the physical design of relational databases cannot be done without considering the problem of index selection. Generally the problem is split into a primary and secondary index selection problem and the selection is done per table. Whereas much attention has been paid to the selection of secondary indices, relatively less is known about the sele...
Automatic Optimization for MapReduce Programs
The MapReduce distributed programming framework has become popular, despite evidence that current implementations are inefficient, requiring far more hardware than a traditional relational database to complete similar tasks. MapReduce jobs are amenable to many traditional database query optimizations (B+Trees for selections, column-store-style techniques for projections, etc.), but existing syst...
Optimizing Theta-Joins in a MapReduce Environment
Data analysis and processing are important tasks in cloud computing. In this field, the MapReduce framework has become an increasingly popular tool for analyzing large-scale data over large clusters. Compared with the parallel relational database, it has the advantages of excellent scalability and good fault tolerance. However, the performance of the join operation using MapReduce is not as good as t...
Adaptive and Automated Index Selection in RDBMS
We present a novel approach for a tool that assists the database administrator in designing an index configuration for a relational database system. A new methodology for collecting usage statistics at run time is developed which lets the optimizer estimate query execution costs for alternative index configurations. Defining the workload specification required by existing index design tools may be...
Index selection in relational databases - Computing and Information, 1993. Proceedings ICCI '93., Fifth International Conference on
Intending to develop a tool which aims to support the physical design of relational databases cannot be done without considering the problem of index selection. Generally the problem is split into a primary and secondary index selection problem and the selection is done per table. Whereas much attention has been paid to the selection of secondary indices, relatively less is known about the sele...
Publication date: 2014