Vertical Data Migration in Large Near-Line Document Archives Based on Markov-Chain Predictions

نویسندگان

  • Achim Kraiss
  • Gerhard Weikum
چکیده

Large multimedia document archives hold most of their data in near-line tertiary storage libraries for cost reasons. This paper develops an integrated approach to the vertical data migration between the tertiary and secondary storage in that it reconciles speculative preloading, to mask the high latency of the tertiary storage, with the replacement policy of the secondary storage. In addition, it considers the interaction of these policies with the tertiary storage scheduling and controls preloading aggressiveness by taking contention for tertiary storage drives into account. The integrated migration policy is based on a continuous-time Markov-chain (CTMC) model for predicting the expected number of accesses to a document within a specified time horizon. The parameters of the CTMC model, the probabilities of co-accessing certain documents and the interaction times between successive accesses, are dynamically estimated and adjusted to evolving workload patterns by keeping online statistics. The integrated policy for vertical data migration has been implemented in a prototype system. Detailed simulation studies with Web-server-like synthetic workloads indicate significant gains in terms of client response time. The studies also show that the overhead of the statistical bookkeeping and the computations for the access predictions is affordable.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Performance evaluation of the croissant production line with reparable machines

In this study, the analytical probability models for an automated serial production system, bufferless that consists of n-machines in series with common transfer mechanism and control system was developed. Both time to failure and time to repair a failure are assumed to follow exponential distribution. Applying those models, the effect of system parameters on system performance in actu...

متن کامل

Markov Chain Anticipation for the Online Traveling Salesman Problem by Simulated Annealing Algorithm

The arc costs are assumed to be online parameters of the network and decisions should be made while the costs of arcs are not known. The policies determine the permitted nodes and arcs to traverse and they are generally defined according to the departure nodes of the current policy nodes. In on-line created tours arc costs are not available for decision makers. The on-line traversed nodes are f...

متن کامل

A New Model to Speculate CLV Based on Markov Chain Model

The present study attempts to establish a new framework to speculate customer lifetime value by a stochastic approach. In this research the customer lifetime value is considered as combination of customer’s present and future value. At first step of our desired model, it is essential to define customer groups based on their behavior similarities, and in second step a mechanism to count current ...

متن کامل

Efficient moment calculations for variance components in large unbalanced crossed random effects models

Large crossed data sets, described by generalized linear mixed models, have become increasingly common and provide challenges for statistical analysis. At very large sizes it becomes desirable to have the computational costs of estimation, inference and prediction (both space and time) grow at most linearly with sample size. Both traditional maximum likelihood estimation and numerous Markov cha...

متن کامل

Application of Markov-Chain Analysis and Stirred Tanks in Series Model in Mathematical Modeling of Impinging Streams Dryers

In spite of the fact that the principles of impinging stream reactors have been developed for more than half a century, the performance analysis of such devices, from the viewpoint of the mathematical modeling, has not been investigated extensively. In this study two mathematical models were proposed to describe particulate matter drying in tangential impinging stream dryers. The models were de...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1997