Inter-Operator Feedback in Data Stream Management Systems via Punctuation

Authors

  • Rafael Fernández-Moctezuma
  • Kristin Tufte
  • Jin Li
Abstract

High-volume, high-speed data streams may overwhelm the capabilities of stream processing systems; techniques such as data prioritization, avoidance of unnecessary processing, and on-demand result production may be necessary to reduce processing requirements. However, the dynamic nature of data streams, in terms of both rate and content, makes the application of such techniques challenging. Such techniques have been addressed in the context of static and centralized query optimization; however, they have not been fully addressed for data-stream management systems. In this work, we present a comprehensive framework designed to support prioritization, avoidance of unnecessary work, and on-demand result production over distributed, unreliable, bursty, disordered data sources, typical of many streams. We propose a form of inter-operator feedback, which flows against the stream direction, to communicate the information needed to enable execution of these techniques. This feedback leverages punctuations to describe the subsets of interest. We identify potential sources of feedback information, characterize new types of punctuation to support feedback, and describe the roles of producers, exploiters, and relayers of feedback that query operators may implement. We also present initial experimental observations using the NiagaraST data-stream system.
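
The three feedback roles named in the abstract can be illustrated with a small sketch. This is only a toy model written under stated assumptions, not the paper's design or the NiagaraST API: the names `FeedbackPunctuation`, `Operator`, `send_feedback`, `receive_feedback`, and `push`, as well as the `segment`/`speed` fields, are hypothetical. It assumes a feedback punctuation is a predicate describing a subset of tuples that is no longer needed, that every operator both exploits feedback (by skipping matching tuples) and relays it upstream, and that the most downstream operator acts as the producer.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Row = Dict[str, int]


@dataclass(frozen=True)
class FeedbackPunctuation:
    """Hypothetical feedback message: a predicate over tuples no longer of interest."""
    description: str
    matches: Callable[[Row], bool]


class Operator:
    """Toy query operator; data flows downstream, feedback flows upstream."""

    def __init__(self, name: str, upstream: Optional["Operator"] = None) -> None:
        self.name = name
        self.upstream = upstream
        self.feedback: List[FeedbackPunctuation] = []
        self.processed = 0  # number of tuples this operator actually handled

    def send_feedback(self, fb: FeedbackPunctuation) -> None:
        # Producer role: emit feedback against the stream direction.
        if self.upstream is not None:
            self.upstream.receive_feedback(fb)

    def receive_feedback(self, fb: FeedbackPunctuation) -> None:
        self.feedback.append(fb)  # exploiter role: remember what to skip
        self.send_feedback(fb)    # relayer role: pass it further upstream

    def push(self, rows: List[Row]) -> List[Row]:
        # Skip tuples covered by any feedback received so far.
        kept = [r for r in rows if not any(fb.matches(r) for fb in self.feedback)]
        self.processed += len(kept)
        return kept


def run_demo():
    source = Operator("source")
    select = Operator("select", upstream=source)
    aggregate = Operator("aggregate", upstream=select)

    rows = [{"segment": s, "speed": 50 + s} for s in (1, 2, 3, 3, 4)]

    before = aggregate.push(select.push(source.push(rows)))

    # The aggregate decides that results for segment 3 are not needed.
    aggregate.send_feedback(
        FeedbackPunctuation("segment == 3 not needed", lambda r: r["segment"] == 3)
    )

    after = aggregate.push(select.push(source.push(rows)))
    return len(before), len(after), source.processed
```

In this sketch, the second batch is already filtered at the source, so upstream operators stop spending work on tuples the downstream operator has declared uninteresting. A real system would also need to decide how long feedback stays valid and how it interacts with ordinary (downstream) punctuation, which this toy model does not address.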

Similar resources

A Heartbeat Mechanism and Its Application in Gigascope

Data stream management systems often rely on ordering properties of tuple attributes in order to implement non-blocking operators. However, query operators that work with multiple streams, such as stream merge or join, can often still block if one of the input streams is very slow or bursty. In principle, punctuation and heartbeat mechanisms have been proposed to unblock streaming operators. In ...

A Quality-Centric Data Model for Distributed Stream Management Systems

It is challenging for large-scale stream management systems to always return perfect results when processing data streams originating from distributed sources. Data sources and intermediate processing nodes may fail during the lifetime of a stream query. In addition, individual nodes may become overloaded due to processing demands. In practice, users have to accept incomplete or inaccurate quer...

Using Control Theory to Guide Load Shedding in Medical Data Stream Management System

The load shedding problem is vital to a Data Stream Management System (DSMS). This paper presents the design, implementation, and evaluation of a load shedding method guided by feedback control theory, in order to solve practical problems in medical environments. It thus avoids the use of operator selectivity, which has been shown to be insufficiently stable. This paper focuses on the r...

LSTM for punctuation restoration in speech transcripts

The output of automatic speech recognition systems is generally an unpunctuated stream of words which is hard to process for both humans and machines. We present a two-stage recurrent neural network based model using long short-term memory units to restore punctuation in speech transcripts. In the first stage, textual features are learned on a large text corpus. The second stage combines textua...

Exploiting Punctuation Semantics in Data Streams

Applications that process data streams are becoming common: financial applications process streams of stock ticker data; telephone network monitoring applications process streams of call data. These applications often are queries over streams, so it seems natural to use a database management system instead of a custom application. However, some traditional relational operators are not conducive...

Journal:
  • CoRR

Volume: abs/0909.2062

Publication date: 2009