Search results for: variance reduction technique

Number of results: 1160769

2013
Chong Wang, Xi Chen, Alexander J. Smola, Eric P. Xing

Stochastic gradient optimization is a class of widely used algorithms for training machine learning models. To optimize an objective, it uses the noisy gradient computed from the random data samples instead of the true gradient computed from the entire dataset. However, when the variance of the noisy gradient is large, the algorithm might spend much time bouncing around, leading to slower conve...
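The noisy-gradient update this abstract describes can be sketched as follows (a minimal illustration, not the paper's method); `grad_i(w, i)` is a hypothetical helper returning the gradient of the loss on the i-th data sample:

```python
import numpy as np

def sgd(grad_i, w0, n, step=0.05, iters=500, seed=0):
    """Plain SGD sketch: at each step, follow the noisy gradient of one
    randomly chosen data sample instead of the full-data gradient."""
    rng = np.random.default_rng(seed)
    w = w0.copy()
    for _ in range(iters):
        i = rng.integers(n)        # random data sample
        w -= step * grad_i(w, i)   # noisy gradient step
    return w
```

With a constant step size, the iterates keep "bouncing around" the optimum with a spread governed by the gradient variance, which is exactly the behavior the abstract points at.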

Journal: Advances in neural information processing systems 2016
Kumar Avinava Dubey, Sashank J. Reddi, Sinead Williamson, Barnabás Póczos, Alexander J. Smola, Eric P. Xing

Stochastic gradient-based Monte Carlo methods such as stochastic gradient Langevin dynamics are useful tools for posterior inference on large scale datasets in many machine learning applications. These methods scale to large datasets by using noisy gradients calculated using a mini-batch or subset of the dataset. However, the high variance inherent in these noisy gradients degrades performance ...
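A minimal stochastic gradient Langevin dynamics update of the kind described above can be sketched as follows (an illustration under assumed names: `minibatch_grad_log_post(w)` is a hypothetical helper returning a noisy mini-batch estimate of the gradient of the log posterior):

```python
import numpy as np

def sgld_step(w, minibatch_grad_log_post, step, rng):
    """One SGLD step: a noisy gradient ascent step on the log posterior
    plus injected Gaussian noise with variance equal to the step size."""
    noise = rng.standard_normal(w.shape) * np.sqrt(step)
    return w + 0.5 * step * minibatch_grad_log_post(w) + noise
```

The injected noise turns the optimizer into a posterior sampler; the mini-batch gradient noise the abstract worries about is additional variance on top of it.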

2016
Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabás Póczos, Alexander J. Smola

We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (Svrg) methods for them. Svrg and related methods have recently surged into prominence for convex optimization given their edge over stochastic gradient descent (Sgd); but their theoretical analysis almost exclusively assumes convexity. In contrast, we prove non-asymptotic rates of convergence (to stationary...
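The Svrg update analyzed in this paper can be sketched as follows (a minimal convex-style illustration, not the paper's nonconvex variant); `grad_i(w, i)` is a hypothetical helper returning the gradient of the i-th component of the finite sum:

```python
import numpy as np

def svrg(grad_i, w0, n, step=0.1, epochs=20, m=None):
    """Minimal SVRG sketch for a finite sum (1/n) * sum_i f_i(w)."""
    m = m or n
    w = w0.copy()
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        w_snap = w.copy()
        # Full gradient at the snapshot, computed once per epoch.
        mu = sum(grad_i(w_snap, i) for i in range(n)) / n
        for _ in range(m):
            i = rng.integers(n)
            # Variance-reduced estimate: unbiased, and its variance
            # shrinks as w approaches the snapshot point.
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            w -= step * g
    return w
```

The correction term `- grad_i(w_snap, i) + mu` is what allows the larger constant step sizes and faster rates relative to plain Sgd.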

2017
Simon S. Du, Jianshu Chen, Lihong Li, Lin Xiao, Dengyong Zhou

Policy evaluation is a crucial step in many reinforcement-learning procedures, which estimates a value function that predicts states’ long-term value under a given policy. In this paper, we focus on policy evaluation with linear function approximation over a fixed dataset. We first transform the empirical policy evaluation problem into a (quadratic) convex-concave saddle point problem, and then ...

2007
Jean-Pierre Fouque, Chuan-Hsiang Han

Based on the dual formulation by Rogers (2002), Monte Carlo algorithms to estimate the high-biased and low-biased estimates for American option prices are proposed. Bounds for pricing errors and the variance of biased estimators are shown to be dependent on hedging martingales. These martingales are applied to (1) simultaneously reduce the error bound and the variance of the high-biased estimat...

Journal: CoRR 2015
Soham De, Gavin Taylor, Tom Goldstein

Variance reduction (VR) methods boost the performance of stochastic gradient descent (SGD) by enabling the use of larger, constant stepsizes and preserving linear convergence rates. However, current variance reduced SGD methods require either high memory usage or an exact gradient computation (using the entire dataset) at the end of each epoch. This limits the use of VR methods in practical dis...

Journal: JAMDS 2004
Virginia Wheway

Ensemble classification techniques such as bagging, (Breiman, 1996a), boosting (Freund & Schapire, 1997) and arcing algorithms (Breiman, 1997) have received much attention in recent literature. Such techniques have been shown to lead to reduced classification error on unseen cases. Even when the ensemble is trained well beyond zero training set error, the ensemble continues to exhibit improved ...

2016
A. B. Duncan, T. Lelièvre, G. A. Pavliotis

A standard approach to computing expectations with respect to a given target measure is to introduce an overdamped Langevin equation which is reversible with respect to the target distribution, and to approximate the expectation by a time-averaging estimator. As has been noted in recent papers [30, 37, 61, 72], introducing an appropriately chosen nonreversible component to the dynamics is benef...
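The reversible baseline described in the first sentence can be sketched with an Euler–Maruyama discretization of the overdamped Langevin equation and a time-averaging estimator (the nonreversible modification studied in the paper is omitted; all names here are illustrative):

```python
import numpy as np

def langevin_time_average(grad_U, f, x0, dt=1e-2, n_steps=100000,
                          burn=10000, seed=0):
    """Discretize dX_t = -grad U(X_t) dt + sqrt(2) dW_t, whose invariant
    measure is proportional to exp(-U), and approximate E[f(X)] by the
    time average of f along one trajectory (after a burn-in)."""
    rng = np.random.default_rng(seed)
    x = float(x0)
    total, count = 0.0, 0
    for k in range(n_steps):
        x += -grad_U(x) * dt + np.sqrt(2 * dt) * rng.standard_normal()
        if k >= burn:
            total += f(x)
            count += 1
    return total / count
```

The asymptotic variance of this time average is the quantity the nonreversible perturbation is designed to reduce.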

2004
Martin Haugh

Suppose as usual that we wish to estimate $\theta := E[h(X)]$. Then the standard simulation algorithm is: 1. Generate i.i.d. samples $X_1, \ldots, X_n$. 2. Estimate $\theta$ with $\hat{\theta}_n = \sum_{j=1}^n Y_j / n$ where $Y_j := h(X_j)$. 3. Approximate $100(1-\alpha)\%$ confidence intervals are then given by $\left[\hat{\theta}_n - z_{1-\alpha/2}\,\hat{\sigma}_n/\sqrt{n},\; \hat{\theta}_n + z_{1-\alpha/2}\,\hat{\sigma}_n/\sqrt{n}\right]$, where $\hat{\sigma}_n^2$ is the usual estimate of $\mathrm{Var}(Y)$ based on $Y_1, \ldots, Y_n$. One way to measure the quality of the estimator, $\hat{\theta}_n$, is b...
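The standard simulation algorithm described above can be written out directly (a minimal sketch with illustrative names; `sampler(rng)` is assumed to draw one realization of $X$):

```python
import numpy as np
from statistics import NormalDist

def mc_estimate(h, sampler, n, alpha=0.05, seed=0):
    """Plain Monte Carlo estimate of theta = E[h(X)] together with an
    approximate 100*(1 - alpha)% confidence interval."""
    rng = np.random.default_rng(seed)
    y = np.array([h(sampler(rng)) for _ in range(n)])   # Y_j = h(X_j)
    theta = y.mean()                                    # theta_hat_n
    sigma = y.std(ddof=1)                # usual estimate of sqrt(Var(Y))
    z = NormalDist().inv_cdf(1 - alpha / 2)             # z_{1-alpha/2}
    half = z * sigma / np.sqrt(n)
    return theta, (theta - half, theta + half)
```

The confidence-interval width $z_{1-\alpha/2}\,\hat{\sigma}_n/\sqrt{n}$ makes the role of variance reduction concrete: shrinking $\mathrm{Var}(Y)$ shrinks the interval at fixed $n$.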

Journal: CoRR 2017
Xiao-Bo Jin, Xu-Yao Zhang, Kaizhu Huang, Guanggang Geng

Conjugate gradient methods are a class of important methods for solving linear equations and nonlinear optimization. In our work, we propose a new stochastic conjugate gradient algorithm with variance reduction (CGVR) and prove its linear convergence with the Fletcher–Reeves method for strongly convex and smooth functions. We experimentally demonstrate that the CGVR algorithm converges fast...

[Chart: number of search results per year]