Weighted sampling without replacement from data streams



Related resources

Weighted Sampling Without Replacement from Data Streams

Weighted sampling without replacement has proved to be a very important tool in designing new algorithms. Efraimidis and Spirakis (IPL 2006) presented an algorithm for weighted sampling without replacement from data streams. Their algorithm works under the assumption of precise computations over the interval [0, 1]. Cohen and Kaplan (VLDB 2008) used similar methods for their bottom-k sketches. ...
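As a rough illustration of the kind of key-based streaming scheme this abstract refers to, the sketch below draws a uniform random number `u` for each item, forms the key `u ** (1 / weight)`, and keeps the `k` items with the largest keys. This is the scheme commonly attributed to Efraimidis and Spirakis, not the method of the 2015 paper itself. The function name, the `(item, weight)` stream layout, and the use of ordinary floating-point keys are assumptions made for this example; in particular, floating-point keys sidestep the precision question the abstract raises.

```python
import heapq
import random


def weighted_reservoir_sample(stream, k, rng=None):
    """Sample up to k items without replacement from (item, weight) pairs.

    Hypothetical helper for illustration only. Keys are plain floats,
    so this does not address exact computation over [0, 1].
    """
    rng = rng or random.Random()
    heap = []  # min-heap of (key, arrival index, item); smallest key on top
    for index, (item, weight) in enumerate(stream):
        if weight <= 0:
            continue  # assumption: non-positive weights are never sampled
        u = rng.random()              # uniform draw in [0, 1)
        key = u ** (1.0 / weight)     # larger weights push the key toward 1
        entry = (key, index, item)    # index breaks ties without comparing items
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif key > heap[0][0]:
            # New key beats the smallest kept key: swap it in.
            heapq.heapreplace(heap, entry)
    return [item for _, _, item in heap]


if __name__ == "__main__":
    data = [("a", 1.0), ("b", 2.0), ("c", 0.5), ("d", 5.0)]
    print(weighted_reservoir_sample(data, 2, random.Random(42)))
```

The heap holds at most `k` entries, so memory stays bounded regardless of stream length, and each item is examined exactly once.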

Accelerating weighted random sampling without replacement

Random sampling from discrete populations is one of the basic primitives in statistical computing. This article briefly introduces weighted and unweighted sampling with and without replacement. The case of weighted sampling without replacement appears to be most difficult to implement efficiently, which might be one reason why the R implementation performs slowly for large problem sizes. This p...

Weighted Random Sampling over Data Streams

In this work, we present a comprehensive treatment of weighted random sampling (WRS) over data streams. More precisely, we examine two natural interpretations of the item weights, describe an existing algorithm for each case ([2,4]), discuss sampling with and without replacement and show adaptations of the algorithms for several WRS problems and evolving data streams.

Edgeworth Expansions for Sampling without Replacement from Finite Populations

The validity of the one-term Edgeworth expansion is proved for the multivariate mean of a random sample drawn without replacement under a limiting non-latticeness condition on the population. The theorem is applied to deduce the one-term expansion for the univariate statistics which can be expressed in a certain linear plus quadratic form. An application of the results to the theory of bootstrap...

Min-wise independent sampling from skewed data streams

Min-wise independent hashing is a powerful sampling technique for estimating the similarity between sets. In particular, it has proved to be ubiquitous for mining data streams of large volume where the input sets are revealed in arbitrary order and the elements in a given set do not arrive consecutively. More precisely, for sets of elements E and attributes A the input is a stream of element-at...

Journal

Journal title: Information Processing Letters

Year: 2015

ISSN: 0020-0190

DOI: 10.1016/j.ipl.2015.07.007