Towards Faster Stochastic Gradient Search
Authors: Christian Darken and John Moody
Abstract
Stochastic gradient descent is a general algorithm which includes LMS, on-line backpropagation, and adaptive k-means clustering as special cases. The standard choices of the learning rate η (both adaptive and fixed functions of time) often perform quite poorly. In contrast, our recently proposed class of "search then converge" learning rate schedules (Darken and Moody, 1990) displays the theoretically optimal asymptotic convergence rate and a superior ability to escape from poor local minima. However, the user is responsible for setting a key parameter. We propose here a new methodology for creating the first completely automatic adaptive learning rates which achieve the optimal rate of convergence.
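To make the schedule concrete, the following is a minimal sketch of stochastic gradient descent under a "search then converge" learning rate of the form η(t) = η₀ / (1 + t/τ), the simple schedule described in Darken and Moody (1990); the function names, toy objective, and parameter values below are illustrative, not taken from the paper.

```python
import numpy as np

def sgd_search_then_converge(grad_fn, w0, eta0=0.1, tau=100.0, n_steps=1000, seed=0):
    """SGD with the 'search then converge' schedule eta(t) = eta0 / (1 + t/tau):
    roughly constant at eta0 for t << tau (search phase), decaying like
    eta0 * tau / t for t >> tau (converge phase).
    grad_fn(w, rng) must return a noisy gradient estimate at w.
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float).copy()
    for t in range(n_steps):
        eta_t = eta0 / (1.0 + t / tau)
        w -= eta_t * grad_fn(w, rng)
    return w

# Toy usage: noisy gradient of f(w) = 0.5 * ||w||^2.
noisy_grad = lambda w, rng: w + 0.1 * rng.standard_normal(w.shape)
w_final = sgd_search_then_converge(noisy_grad, w0=np.ones(3))
```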
Similar Resources
Distributed gradient for multi-robot motion planning
Distributed stochastic search is proposed for cooperative behavior in multi-robot systems, and the distributed gradient method is examined. This method consists of multiple stochastic search algorithms that start from different points in the solution space and interact with each other while moving towards the goal position. Distributed gradient is shown to be efficient when the motion of the robots towards the...
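The snippet above is truncated, so the following is only a rough illustration of the idea of several robots descending gradients toward a shared goal while interacting; the attractive/repulsive potentials, function names, and parameters are assumptions, not the cost used in the cited paper.

```python
import numpy as np

def distributed_gradient_step(positions, goal, step=0.05, repel=0.5):
    """One synchronous step for a team of robots: each robot descends an
    attractive potential 0.5*||p - goal||^2 toward the shared goal plus a
    repulsive potential -repel*log||p - q|| away from every other robot.
    positions: (n_robots, dim) array; returns the updated positions.
    """
    new_positions = positions.copy()
    for i, p in enumerate(positions):
        grad = p - goal                                       # attractive term
        for j, q in enumerate(positions):
            if j != i:
                diff = p - q
                grad -= repel * diff / (diff @ diff + 1e-9)   # repulsive term
        new_positions[i] = p - step * grad
    return new_positions

# Toy usage: three robots converge on the origin without collapsing onto each other.
pos = np.array([[1.0, 1.0], [-1.0, 1.0], [0.0, -1.5]])
for _ in range(200):
    pos = distributed_gradient_step(pos, goal=np.zeros(2))
```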
Stochastic Strictly Contractive Peaceman-Rachford Splitting Method
In this paper, we propose two new stochastic strictly contractive Peaceman-Rachford splitting methods (SCPRSM), called Stochastic SCPRSM (SS-PRSM) and Stochastic Conjugate Gradient SCPRSM (SCG-PRSM), for large-scale optimization problems. The two stochastic PRSM algorithms respectively incorporate the stochastic variance reduced gradient (SVRG) and the conjugate gradient method. Stochasti...
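For reference, the stochastic variance reduced gradient (SVRG) estimator that this snippet says is incorporated has a standard form; below is a minimal sketch of a plain SVRG loop (not of the splitting method itself), with all names and step sizes chosen for illustration.

```python
import numpy as np

def svrg(grad_i, full_grad, w0, n, eta=0.01, epochs=10, inner=100, seed=0):
    """Minimal SVRG loop. At each snapshot the full gradient is computed once;
    every inner step then uses the variance-reduced estimate
        v = grad_i(w, i) - grad_i(w_snap, i) + full_grad(w_snap),
    which is unbiased and has shrinking variance as w approaches w_snap.
    grad_i(w, i): gradient of the i-th component function; n: number of components.
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float).copy()
    for _ in range(epochs):
        w_snap = w.copy()
        mu = full_grad(w_snap)          # full gradient at the snapshot point
        for _ in range(inner):
            i = rng.integers(n)
            v = grad_i(w, i) - grad_i(w_snap, i) + mu
            w -= eta * v
    return w
```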
Towards Stochastic Conjugate Gradient Methods
The method of conjugate gradients provides a very effective way to optimize large, deterministic systems by gradient descent. In its standard form, however, it is not amenable to stochastic approximation of the gradient. Here we explore a number of ways to adopt ideas from conjugate gradient in the stochastic setting, using fast Hessian-vector products to obtain curvature information cheaply. I...
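The "fast Hessian-vector products" mentioned above can be obtained without ever forming the Hessian; one standard trick is a finite difference of gradients, sketched below. The function names and tolerance are illustrative, not taken from the cited paper.

```python
import numpy as np

def hessian_vector_product(grad_fn, w, v, eps=1e-5):
    """Fast Hessian-vector product via a finite difference of gradients:
        H v ≈ (grad(w + eps * v) - grad(w)) / eps,
    costing two gradient evaluations instead of forming the Hessian.
    With stochastic gradients, both evaluations should use the same minibatch
    so that the sampling noise largely cancels.
    """
    return (grad_fn(w + eps * v) - grad_fn(w)) / eps

# Toy usage: for f(w) = 0.5 * w^T A w the product should return A @ v.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
grad = lambda w: A @ w
print(hessian_vector_product(grad, np.zeros(2), np.array([1.0, 0.0])))  # ≈ A[:, 0]
```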
Accelerating Stochastic Gradient Descent
There is widespread sentiment that fast gradient methods (e.g. Nesterov’s acceleration, conjugate gradient, heavy ball) are not effective for the purposes of stochastic optimization due to their instability and error accumulation. Numerous works have attempted to quantify these instabilities in the face of either statistical or non-statistical errors (Paige, 1971; Proakis, 1974; Polyak, 1987; G...
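As a point of reference for the instability being discussed, here is a minimal sketch of one such fast gradient method, the stochastic heavy-ball (Polyak momentum) update; names and parameter values are illustrative.

```python
import numpy as np

def heavy_ball_sgd(grad_fn, w0, eta=0.01, beta=0.9, n_steps=1000, seed=0):
    """Stochastic heavy-ball (Polyak momentum) update:
        m_t = beta * m_{t-1} + g_t,    w_{t+1} = w_t - eta * m_t,
    where g_t is a noisy gradient. Because the noise is folded into the
    momentum buffer, errors can accumulate, which is one source of the
    instability discussed in the snippet above.
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float).copy()
    m = np.zeros_like(w)
    for _ in range(n_steps):
        m = beta * m + grad_fn(w, rng)
        w -= eta * m
    return w
```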
Online Limited-Memory BFGS for Click-Through Rate Prediction
We study the problem of click-through rate (CTR) prediction, where the goal is to predict the probability that a user will click on a search advertisement given information about his issued query and account. In this paper, we formulate a model for CTR prediction using logistic regression, then assess the performance of stochastic gradient descent (SGD) and online limited-memory BFGS (oLBFGS) f...
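A minimal sketch of the logistic-regression-plus-SGD baseline described above follows; the feature encoding, names, and hyperparameters are placeholders, not the setup of the cited paper.

```python
import numpy as np

def sgd_logistic_regression(X, y, eta=0.1, epochs=5, seed=0):
    """Logistic regression for click / no-click labels trained with plain SGD.
    X: (n_samples, n_features) feature matrix; y: 0/1 click labels.
    Each step applies the per-example gradient of the logistic (log) loss.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            p = 1.0 / (1.0 + np.exp(-X[i] @ w))   # predicted click probability
            w -= eta * (p - y[i]) * X[i]          # per-example log-loss gradient
    return w

def predict_ctr(X, w):
    """Predicted click-through probabilities for a batch of feature vectors."""
    return 1.0 / (1.0 + np.exp(-X @ w))
```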
Journal:
Volume / Issue:
Pages: -
Publication date: 1991