Active Database Learning
Author
Abstract
Learning from Past Query Processing: In today’s databases, the answers to past queries barely benefit the processing of future queries. Query answers and the work performed to produce them, such as I/O and computation, are discarded once the answers are returned. Database Learning (DBL) [4] proposes to change this paradigm in an approximate query processing (AQP) context. DBL uses the knowledge acquired from past queries and their answers to speed up future query processing. The more queries are processed, the smarter the system becomes, and thus the faster it can process new queries. DBL is based on the observation that the answers to queries, even those that access different subsets of rows and columns, stem from the same underlying distribution that produced the entire data. Therefore, the answer to each query reveals a piece of information about this underlying distribution, which can be used to construct a concise statistical model of the distribution. Using such a statistical model can bring significant performance benefits. In the ideal case where the model precisely captures the underlying distribution, we could answer queries by analytically evaluating the model instead of reading and processing terabytes of raw data. Even an imprecise model can still be beneficial; for instance, one can use a small sample of the entire data to quickly produce a sample-based answer [1], which can then be combined with the model to produce a more accurate approximate answer. Our prototype [4], which uses the maximum entropy principle for model construction and SparkSQL for distributed computations, supported 73.7% of real-world query workloads and achieved a speedup of up to 10.43 times over existing AQP systems at the same accuracy level.
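The idea of combining a sample-based answer with a model-based answer can be illustrated with a small sketch. This is not the maximum entropy construction used in the prototype [4]; it substitutes a simpler, standard technique (inverse-variance weighting of two independent estimates) purely for illustration. The function names `sample_estimate` and `combine`, and the assumption that the model supplies its own mean and variance for the queried aggregate, are hypothetical.

```python
import random
import statistics


def sample_estimate(values, sample_size, rng):
    """Estimate the mean of `values` from a uniform random sample.

    Returns the sample mean and the estimated variance of that mean.
    """
    sample = rng.sample(values, sample_size)
    mean = statistics.fmean(sample)
    var_of_mean = statistics.variance(sample) / sample_size
    return mean, var_of_mean


def combine(sample_mean, sample_var, model_mean, model_var):
    """Combine two independent estimates by inverse-variance weighting.

    Assumption (not from the paper): the model built from past queries
    yields an estimate `model_mean` with variance `model_var`.
    Both variances must be strictly positive.
    """
    w_sample = 1.0 / sample_var
    w_model = 1.0 / model_var
    combined_mean = (w_sample * sample_mean + w_model * model_mean) / (w_sample + w_model)
    combined_var = 1.0 / (w_sample + w_model)
    return combined_mean, combined_var
```

Under these assumptions, the combined variance is never larger than the smaller of the two input variances, which mirrors the claim that even an imprecise model can make a sample-based answer more accurate.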
Similar resources
Intelligent Mediation in Active Knowledge Mining: Goals and General Description
Recent years have seen the increasing development of knowledge discovery or database mining systems that combine database management technology with machine learning techniques and algorithms to perform the analysis of data. Many of these systems use passive database management systems to hold the data rather than active databases. However, applications such as battle management situations or s...
Annotating retrieval database with active learning
In this paper, we describe a retrieval system that uses hidden annotation to improve the performance. The contribution of this paper is a novel active learning framework that can improve the annotation efficiency. For each object in the database, we maintain a list of probabilities, each indicating the probability of this object having one of the attributes. This list of probabilities serves as...
Active Learning for Video Annotation
In this paper, we present an approach to active learning for video annotation. We use active learning to aid in the semantic labeling of video databases. A software library and simulator have been developed as a tool to investigate and measure the results of active learning. A list of confidence values and nearest neighbor values is maintained for each video segment in the database. A variety ...
A Review of Statistics and Probability Journals in ISI Database
As scientific output indexed in the ISI database and related databases has increased in recent years, it is worthwhile for researchers of Statistics in Iran to know more about these journals and their status in the ISI database. In this study, using bibliometric methods, we have reviewed the status of Statistics and Probability . From all nations around the world, these are only...
Active Learning: An Approach for Reducing Theory-Practice Gap in Clinical Education
Introduction: The gap between theory and practice in clinical fields, including nursing, is one of the main problems, and many solutions have been suggested to eliminate it. In this article, we have tried to investigate its solution through active learning. Methods: In this review article, articles published during 2000-2012 were searched through library references, scientific databases. ...
Exploration of Arak Medical Students’ Experiences on Effective Factors in Active Learning: A Qualitative Research
Introduction: Medical students should use active learning to improve their daily duties and medical services. The goal of this study is to explore medical students’ experiences of effective factors in active learning. Methods: This qualitative study was conducted using the content analysis method at Arak University of Medical Sciences. Data were collected via interviews. The study started with p...