Averaged Least-Mean-Squares: Bias-Variance Trade-offs and Optimal Sampling Distributions
Authors
Abstract
We consider the least-squares regression problem and provide a detailed asymptotic analysis of the performance of averaged constant-step-size stochastic gradient descent. In the strongly-convex case, we provide an asymptotic expansion up to explicit exponentially decaying terms. Our analysis leads to new insights into stochastic approximation algorithms: (a) it gives a tighter bound on the allowed step-size; (b) the generalization error may be divided into a variance term which is decaying as O(1/n), independently of the step-size γ, and a bias term that decays as O(1/γn); (c) when allowing non-uniform sampling of examples over a dataset, the choice of a good sampling density depends on the trade-off between bias and variance: when the variance term dominates, optimal sampling densities do not lead to much gain, while when the bias term dominates, we can choose larger step-sizes that lead to significant improvements.
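As an illustrative sketch (not the authors' code), the algorithm analyzed in the abstract is the least-mean-squares recursion with a constant step-size γ, combined with Polyak-Ruppert averaging of the iterates. The function name and synthetic data below are assumptions for illustration only.

```python
import numpy as np

def averaged_lms(X, y, gamma):
    """Averaged constant-step-size SGD for least squares (a.k.a. LMS).

    Single pass over (x_k, y_k):
        theta_k = theta_{k-1} - gamma * (x_k' theta_{k-1} - y_k) * x_k
    and the returned estimate is the running average (1/n) * sum_k theta_k.
    """
    n, d = X.shape
    theta = np.zeros(d)        # current iterate
    theta_bar = np.zeros(d)    # Polyak-Ruppert average of iterates
    for k in range(n):
        x = X[k]
        theta -= gamma * (x @ theta - y[k]) * x       # stochastic gradient step
        theta_bar += (theta - theta_bar) / (k + 1)    # online running average
    return theta_bar
```

The constant step-size γ must be small enough for stability (on the order of the inverse of the largest squared feature norm); the averaging is what yields the O(1/n) variance term described above.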
Similar resources
Constant Step Size Least-Mean-Square: Bias-Variance Trade-offs and Optimal Sampling Distributions
We consider the least-squares regression problem and provide a detailed asymptotic analysis of the performance of averaged constant-step-size stochastic gradient descent (a.k.a. least-mean-squares). In the strongly-convex case, we provide an asymptotic expansion up to explicit exponentially decaying terms. Our analysis leads to new insights into stochastic approximation algorithms: (a) it gives...
Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression
We consider the optimization of a quadratic objective function whose gradients are only accessible through a stochastic oracle that returns the gradient at any given point plus a zero-mean finite variance random error. We present the first algorithm that achieves jointly the optimal prediction error rates for least-squares regression, both in terms of forgetting the initial conditions in O(1/n)...
Uniform CR Bound: Implementation Issues and Applications
We apply a uniform Cramer-Rao (CR) bound [1] to study the bias-variance trade-offs in single photon emission computed tomography (SPECT) image reconstruction. The uniform CR bound is used to specify achievable and unachievable regions in the bias-variance trade-off plane. The image reconstruction algorithms considered in this paper are: 1) space alternating generalized EM and 2) penalized weigh...
Generalized Spatial Two Stage Least Squares Estimation of Spatial Autoregressive Models with Autoregressive Disturbances in the Presence of Endogenous Regressors and Many Instruments
This paper studies the generalized spatial two stage least squares (GS2SLS) estimation of spatial autoregressive models with autoregressive disturbances when there are endogenous regressors with many valid instruments. Using many instruments may improve the efficiency of estimators asymptotically, but the bias might be large in finite samples, making the inference inaccurate. We consider the ca...
Averaged Least-Mean-Square: Bias-Variance Trade-offs and Optimal Sampling Distributions — Supplementary material
Throughout our results we will use the following notations and results, which are necessary to provide explicit expressions for the constants in the asymptotic expansions. For any real vector space V of finite dimension d, let M(V) be the space of linear operators over V, which is isomorphic to the space of d-by-d matrices, with the usual result that composition becomes matrix multiplication. ...
Publication date: 2015