Subsampling for Chain-Referral Methods
نویسندگان
چکیده
We study chain-referral methods for sampling in social networks. These methods rely on subjects of the study recruiting other participants among their set of connections. This approach gives us the possibility to perform sampling when the other methods, that imply the knowledge of the whole network or its global characteristics, fail. Chainreferral methods can be implemented with random walks or crawling in the case of online social networks. However, the estimations made on the collected samples can have high variance, especially with small sample size. The other drawback is the potential bias due to the way the samples are collected. We suggest and analyze a subsampling technique, where some users are requested only to recruit other users but do not participate to the study. Assuming that the referral has lower cost than actual participation, this technique takes advantage of exploring a larger variety of population, thus decreasing significantly the variance of the estimator. We test the method on real social networks and on synthetic ones. As by-product, we propose a Gibbs like method for generating synthetic networks with desired properties.
منابع مشابه
A Novel Subsampling Method for 3D Multimodality Medical Image Registration Based on Mutual Information
Mutual information (MI) is a widely used similarity metric for multimodality image registration. However, it involves an extremely high computational time especially when it is applied to volume images. Moreover, its robustness is affected by existence of local maxima. The multi-resolution pyramid approaches have been proposed to speed up the registration process and increase the accuracy of th...
متن کاملTempering by Subsampling
In this paper we demonstrate that tempering Markov chain Monte Carlo samplers for Bayesian models by recursively subsampling observations without replacement can improve the performance of baseline samplers in terms of effective sample size per computation. We present two tempering by subsampling algorithms, subsampled parallel tempering and subsampled tempered transitions. We provide an asympt...
متن کاملOn Markov chain Monte Carlo methods for tall data
Markov chain Monte Carlo methods are often deemed too computationally intensive to be of any practical use for big data applications, and in particular for inference on datasets containing a large number n of individual data points, also known as tall datasets. In scenarios where data are assumed independent, various approaches to scale up the MetropolisHastings algorithm in a Bayesian inferenc...
متن کاملTowards scaling up Markov chain Monte Carlo: an adaptive subsampling approach
Markov chain Monte Carlo (MCMC) methods are often deemed far too computationally intensive to be of any practical use for large datasets. This paper describes a methodology that aims to scale up the Metropolis-Hastings (MH) algorithm in this context. We propose an approximate implementation of the accept/reject step of MH that only requires evaluating the likelihood of a random subset of the da...
متن کاملStatistically efficient thinning of a Markov chain sampler
It is common to subsample Markov chain samples to reduce the storage burden of the output. It is also well known that discarding k − 1 out of every k observations will not improve statistical efficiency. It is less frequently remarked that subsampling a Markov chain allows one to omit some of the computation beyond that needed to simply advance the chain. When this reduced computation is accoun...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016