Search results for: variance reduction technique

Number of results: 1160769

2013
Chong Wang, Xi Chen, Alexander J. Smola, Eric P. Xing

Stochastic gradient optimization is a class of widely used algorithms for training machine learning models. To optimize an objective, it uses the noisy gradient computed from the random data samples instead of the true gradient computed from the entire dataset. However, when the variance of the noisy gradient is large, the algorithm might spend much time bouncing around, leading to slower conve...
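The noisy-gradient update this abstract describes can be sketched as follows (a minimal illustration, not the paper's method); `grad_i(w, i)` is a hypothetical helper returning the gradient of the loss on the i-th data sample:

```python
import numpy as np

def sgd(grad_i, w0, n, step=0.05, iters=500, seed=0):
    """Plain SGD sketch: at each step, follow the noisy gradient of one
    randomly chosen data sample instead of the full-data gradient."""
    rng = np.random.default_rng(seed)
    w = w0.copy()
    for _ in range(iters):
        i = rng.integers(n)        # random data sample
        w -= step * grad_i(w, i)   # noisy gradient step
    return w
```

With a constant step size, the iterates keep "bouncing around" the optimum with a spread governed by the gradient variance, which is exactly the behavior the abstract points at.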

Journal: Advances in neural information processing systems 2016
Kumar Avinava Dubey, Sashank J. Reddi, Sinead Williamson, Barnabás Póczos, Alexander J. Smola, Eric P. Xing

Stochastic gradient-based Monte Carlo methods such as stochastic gradient Langevin dynamics are useful tools for posterior inference on large scale datasets in many machine learning applications. These methods scale to large datasets by using noisy gradients calculated using a mini-batch or subset of the dataset. However, the high variance inherent in these noisy gradients degrades performance ...
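A minimal stochastic gradient Langevin dynamics update of the kind described above can be sketched as follows (an illustration under assumed names: `minibatch_grad_log_post(w)` is a hypothetical helper returning a noisy mini-batch estimate of the gradient of the log posterior):

```python
import numpy as np

def sgld_step(w, minibatch_grad_log_post, step, rng):
    """One SGLD step: a noisy gradient ascent step on the log posterior
    plus injected Gaussian noise with variance equal to the step size."""
    noise = rng.standard_normal(w.shape) * np.sqrt(step)
    return w + 0.5 * step * minibatch_grad_log_post(w) + noise
```

The injected noise turns the optimizer into a posterior sampler; the mini-batch gradient noise the abstract worries about is additional variance on top of it.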

2016
Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabás Póczos, Alexander J. Smola

We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (Svrg) methods for them. Svrg and related methods have recently surged into prominence for convex optimization given their edge over stochastic gradient descent (Sgd); but their theoretical analysis almost exclusively assumes convexity. In contrast, we prove non-asymptotic rates of convergence (to stationary...
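The Svrg update analyzed in this paper can be sketched as follows (a minimal convex-style illustration, not the paper's nonconvex variant); `grad_i(w, i)` is a hypothetical helper returning the gradient of the i-th component of the finite sum:

```python
import numpy as np

def svrg(grad_i, w0, n, step=0.1, epochs=20, m=None):
    """Minimal SVRG sketch for a finite sum (1/n) * sum_i f_i(w)."""
    m = m or n
    w = w0.copy()
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        w_snap = w.copy()
        # Full gradient at the snapshot, computed once per epoch.
        mu = sum(grad_i(w_snap, i) for i in range(n)) / n
        for _ in range(m):
            i = rng.integers(n)
            # Variance-reduced estimate: unbiased, and its variance
            # shrinks as w approaches the snapshot point.
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            w -= step * g
    return w
```

The correction term `- grad_i(w_snap, i) + mu` is what allows the larger constant step sizes and faster rates relative to plain Sgd.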

2017
Simon S. Du, Jianshu Chen, Lihong Li, Lin Xiao, Dengyong Zhou

Policy evaluation is a crucial step in many reinforcement-learning procedures, which estimates a value function that predicts states’ long-term value under a given policy. In this paper, we focus on policy evaluation with linear function approximation over a fixed dataset. We first transform the empirical policy evaluation problem into a (quadratic) convex-concave saddle point problem, and then ...

2007
Jean-Pierre Fouque, Chuan-Hsiang Han

Based on the dual formulation by Rogers (2002), Monte Carlo algorithms to estimate the high-biased and low-biased estimates for American option prices are proposed. Bounds for pricing errors and the variance of biased estimators are shown to be dependent on hedging martingales. These martingales are applied to (1) simultaneously reduce the error bound and the variance of the high-biased estimat...

Journal: CoRR 2015
Soham De, Gavin Taylor, Tom Goldstein

Variance reduction (VR) methods boost the performance of stochastic gradient descent (SGD) by enabling the use of larger, constant stepsizes and preserving linear convergence rates. However, current variance reduced SGD methods require either high memory usage or an exact gradient computation (using the entire dataset) at the end of each epoch. This limits the use of VR methods in practical dis...

Journal: JAMDS 2004
Virginia Wheway

Ensemble classification techniques such as bagging, (Breiman, 1996a), boosting (Freund & Schapire, 1997) and arcing algorithms (Breiman, 1997) have received much attention in recent literature. Such techniques have been shown to lead to reduced classification error on unseen cases. Even when the ensemble is trained well beyond zero training set error, the ensemble continues to exhibit improved ...

2016
A. B. Duncan, T. Lelièvre, G. A. Pavliotis

A standard approach to computing expectations with respect to a given target measure is to introduce an overdamped Langevin equation which is reversible with respect to the target distribution, and to approximate the expectation by a time-averaging estimator. As has been noted in recent papers [30, 37, 61, 72], introducing an appropriately chosen nonreversible component to the dynamics is benef...
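The reversible baseline described in the first sentence can be sketched with an Euler–Maruyama discretization of the overdamped Langevin equation and a time-averaging estimator (the nonreversible modification studied in the paper is omitted; all names here are illustrative):

```python
import numpy as np

def langevin_time_average(grad_U, f, x0, dt=1e-2, n_steps=100000,
                          burn=10000, seed=0):
    """Discretize dX_t = -grad U(X_t) dt + sqrt(2) dW_t, whose invariant
    measure is proportional to exp(-U), and approximate E[f(X)] by the
    time average of f along one trajectory (after a burn-in)."""
    rng = np.random.default_rng(seed)
    x = float(x0)
    total, count = 0.0, 0
    for k in range(n_steps):
        x += -grad_U(x) * dt + np.sqrt(2 * dt) * rng.standard_normal()
        if k >= burn:
            total += f(x)
            count += 1
    return total / count
```

The asymptotic variance of this time average is the quantity the nonreversible perturbation is designed to reduce.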

2004
Martin Haugh

Suppose as usual that we wish to estimate $\theta := E[h(X)]$. Then the standard simulation algorithm is: 1. Generate i.i.d. samples $X_1, \ldots, X_n$. 2. Estimate $\theta$ with $\hat{\theta}_n = \sum_{j=1}^n Y_j / n$ where $Y_j := h(X_j)$. 3. Approximate $100(1-\alpha)\%$ confidence intervals are then given by $\left[\hat{\theta}_n - z_{1-\alpha/2}\,\hat{\sigma}_n/\sqrt{n},\; \hat{\theta}_n + z_{1-\alpha/2}\,\hat{\sigma}_n/\sqrt{n}\right]$, where $\hat{\sigma}_n^2$ is the usual estimate of $\mathrm{Var}(Y)$ based on $Y_1, \ldots, Y_n$. One way to measure the quality of the estimator, $\hat{\theta}_n$, is b...
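The standard simulation algorithm described above can be written out directly (a minimal sketch with illustrative names; `sampler(rng)` is assumed to draw one realization of $X$):

```python
import numpy as np
from statistics import NormalDist

def mc_estimate(h, sampler, n, alpha=0.05, seed=0):
    """Plain Monte Carlo estimate of theta = E[h(X)] together with an
    approximate 100*(1 - alpha)% confidence interval."""
    rng = np.random.default_rng(seed)
    y = np.array([h(sampler(rng)) for _ in range(n)])   # Y_j = h(X_j)
    theta = y.mean()                                    # theta_hat_n
    sigma = y.std(ddof=1)                # usual estimate of sqrt(Var(Y))
    z = NormalDist().inv_cdf(1 - alpha / 2)             # z_{1-alpha/2}
    half = z * sigma / np.sqrt(n)
    return theta, (theta - half, theta + half)
```

The confidence-interval width $z_{1-\alpha/2}\,\hat{\sigma}_n/\sqrt{n}$ makes the role of variance reduction concrete: shrinking $\mathrm{Var}(Y)$ shrinks the interval at fixed $n$.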

Journal: CoRR 2017
Xiao-Bo Jin, Xu-Yao Zhang, Kaizhu Huang, Guanggang Geng

Conjugate gradient methods are a class of important methods for solving linear equations and nonlinear optimization. In our work, we propose a new stochastic conjugate gradient algorithm with variance reduction (CGVR) and prove its linear convergence with the Fletcher–Reeves method for strongly convex and smooth functions. We experimentally demonstrate that the CGVR algorithm converges fast...

[Chart: number of search results per year]