Continuous Monitoring of l_p Norms in Data Streams

Authors

  • Jaroslaw Blasiok
  • Jian Ding
  • Jelani Nelson
Abstract

In insertion-only streaming, one sees a sequence of indices $a_1, a_2, \ldots, a_m \in [n]$. The stream defines a sequence of $m$ frequency vectors $x^{(1)}, \ldots, x^{(m)} \in \mathbb{R}^n$ with $(x^{(t)})_i \stackrel{\mathrm{def}}{=} |\{j : j \in [t],\ a_j = i\}|$. That is, $x^{(t)}$ is the frequency vector after seeing the first $t$ items in the stream. Much work in the streaming literature focuses on estimating some function $f(x^{(m)})$. Many applications, though, require obtaining estimates at time $t$ of $f(x^{(t)})$, for every $t \in [m]$. Naively this guarantee is obtained by devising an algorithm with failure probability $\ll 1/m$, then performing a union bound over all stream updates to guarantee that all $m$ estimates are simultaneously accurate with good probability. When $f(x)$ is some $\ell_p$ norm of $x$, recent works have shown that this union bound is wasteful and better space complexity is possible for the continuous monitoring problem, with the strongest known results being for $p = 2$ [HTY14, BCIW16, BCI17]. In this work, we improve the state of the art for all $0 < p < 2$, which we obtain via a novel analysis of Indyk's $p$-stable sketch [Ind06].
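To make the setting concrete, the following is a minimal Python sketch, not the paper's algorithm or analysis. It assumes 0-based indices (the abstract uses $[n] = \{1, \ldots, n\}$), and the names `exact_lp_norms` and `CauchySketch` are hypothetical. The first function tracks the exact $\ell_p$ norm of $x^{(t)}$ after every update. The class illustrates the $p = 1$ case of a $p$-stable sketch in the style of [Ind06], using Cauchy random variables and a median estimator. For simplicity it stores the whole random matrix explicitly, which a space-efficient streaming algorithm would not do.

```python
import math
import random
from statistics import median


def exact_lp_norms(stream, n, p):
    """Return the exact l_p norm of the frequency vector after each update."""
    x = [0] * n                      # frequency vector, starts at zero
    norms = []
    for a in stream:                 # a is a 0-based index into x
        x[a] += 1                    # insertion-only: counts only grow
        norms.append(sum(v ** p for v in x) ** (1.0 / p))
    return norms


class CauchySketch:
    """Illustrative 1-stable (Cauchy) sketch with a median estimator.

    Assumption: the k-by-n matrix is stored explicitly for clarity.
    """

    def __init__(self, n, k, seed=0):
        rng = random.Random(seed)
        # A standard Cauchy sample is tan(pi * (U - 1/2)) for uniform U.
        self.rows = [
            [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]
            for _ in range(k)
        ]
        self.y = [0.0] * k           # y = (matrix) * x, maintained incrementally

    def update(self, i):
        """Process one stream item with 0-based index i."""
        for r, row in enumerate(self.rows):
            self.y[r] += row[i]

    def estimate(self):
        """Estimate the l_1 norm; the median of |standard Cauchy| is 1."""
        return median(abs(v) for v in self.y)


if __name__ == "__main__":
    n, m = 10, 50
    stream = [random.Random(1).randrange(n) for _ in range(m)]
    truth = exact_lp_norms(stream, n, 1)
    sketch = CauchySketch(n, k=401)
    for t, a in enumerate(stream):
        sketch.update(a)
        print(t + 1, truth[t], round(sketch.estimate(), 2))
```

Continuous monitoring asks that the estimate printed at every step $t$ be accurate simultaneously, rather than only the final one.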


Similar resources

Fast, Provable Algorithms for Isotonic Regression in all L_p-norms

Given a directed acyclic graph G, and a set of values y on the vertices, the Isotonic Regression of y is a vector x that respects the partial order described by G, and minimizes ‖x − y‖, for a specified norm. This paper gives improved algorithms for computing the Isotonic Regression for all weighted ℓ_p-norms with rigorous performance guarantees. Our algorithms are quite practical, and variants o...

Continuous Queries over Data Streams - Semantics and Implementation

Recent technological advances have pushed the emergence of a new class of data-intensive applications that require continuous processing over sequences of transient data, called data streams, in near real-time. Examples of such applications range from business activity monitoring and online analysis of sensor data to trend detection in stock ticker data. This work presents a solid and powerful ...

An XML Framework for Integrating Continuous Queries, Composite Event Detection, and Database Condition Monitoring for Multiple Data Streams

Current, data-driven applications have become more dynamic in nature, with the need to respond to events generated from distributed sources or to react to information extracted from incoming data streams. Event processing and stream processing have traditionally developed as two separate areas of research. Event processing has its roots in research with active rule processing (Widom and Ceri, 1...

Estimating Dominance Norms of Multiple Data Streams

There is much focus in the algorithms and database communities on designing tools to manage and mine data streams. Typically, data streams consist of multiple signals. Formally, a stream of multiple signals is $(i, a_{i,j})$ where the $i$'s correspond to the domain, the $j$'s index the different signals, and $a_{i,j} \geq 0$ gives the value of the $j$th signal at point $i$. We study the problem of finding norms that are cumu...

Publication date: 2017