stochastic gradient descent

Search results for: stochastic gradient descent

Number of results: 258150

Averaging Stochastic Gradient Descent on Riemannian Manifolds

Journal: CoRR 2018

Nilesh Tripuraneni Nicolas Flammarion Francis Bach Michael I. Jordan

We consider the minimization of a function defined on a Riemannian manifold M accessible only through unbiased estimates of its gradients. We develop a geometric framework to transform a sequence of slowly converging iterates generated from stochastic gradient descent (SGD) on M to an averaged iterate sequence with a robust and fast O(1/n) convergence rate. We then present an application of our...
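As a rough illustration of iterate averaging, the sketch below runs plain SGD in the Euclidean case (a flat space, not a general Riemannian manifold M) and keeps a running mean of the iterates. It is not the geometric framework of the paper; the function name `averaged_sgd`, the step-size schedule `c / n**alpha`, and the toy objective are all assumptions made for this example.

```python
import random


def averaged_sgd(grad_sample, x0, steps=5000, c=0.5, alpha=0.75, seed=0):
    """Euclidean SGD with a running average of the iterates (scalar case).

    grad_sample(x, rng) must return an unbiased estimate of the gradient at x.
    The step size c / n**alpha is an assumed schedule, not taken from the paper.
    """
    rng = random.Random(seed)
    x = x0
    x_bar = x0
    for n in range(1, steps + 1):
        step = c / n ** alpha
        x -= step * grad_sample(x, rng)   # ordinary SGD step
        x_bar += (x - x_bar) / n          # running mean of x_1, ..., x_n
    return x, x_bar


# Hypothetical toy objective: E[(x - z)^2 / 2] with z ~ N(2, 1), minimized at x = 2.
def noisy_grad(x, rng):
    return x - rng.gauss(2.0, 1.0)


x_last, x_avg = averaged_sgd(noisy_grad, x0=0.0)
print(x_last, x_avg)
```

Both returned values should land near 2; the averaged value is the one that iterate averaging is meant to stabilize.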


Asynchronous Stochastic Gradient Descent with Delay Compensation

2017

Shuxin Zheng Qi Meng Taifeng Wang Wei Chen Nenghai Yu Zhiming Ma Tie-Yan Liu

With the fast development of deep learning, people have started to train very big neural networks using massive data. Asynchronous Stochastic Gradient Descent (ASGD) is widely used to fulfill this task, which, however, is known to suffer from the problem of delayed gradient. That is, when a local worker adds the gradient it calculates to the global model, the global model may have been updated ...


Variance Reduced Stochastic Gradient Descent with Neighbors

2015

Thomas Hofmann Aurélien Lucchi Simon Lacoste-Julien Brian McWilliams

Stochastic Gradient Descent (SGD) is a workhorse in machine learning, yet it is also known to be slow relative to steepest descent. The variance in the stochastic update directions only allows for sublinear or (with iterate averaging) linear convergence rates. Recently, variance reduction techniques such as SVRG and SAGA have been proposed to overcome this weakness. With asymptotically vanishin...
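For orientation, here is a minimal sketch of the standard SVRG update named in the abstract, applied to a one-dimensional least-squares problem. It is not the neighbor-based method of this paper; the function name `svrg`, the data, the learning rate, and the loop lengths are assumptions chosen for the example.

```python
import random


def svrg(a, b, w0=0.0, lr=0.01, epochs=100, inner_steps=50, seed=0):
    """Standard SVRG on f(w) = (1/n) * sum_i 0.5 * (a[i] * w - b[i])**2."""
    rng = random.Random(seed)
    n = len(a)

    def grad_i(w, i):
        # Gradient of the i-th term 0.5 * (a[i] * w - b[i])**2.
        return a[i] * (a[i] * w - b[i])

    w = w0
    for _ in range(epochs):
        w_snap = w  # snapshot point for this epoch
        full_grad = sum(grad_i(w_snap, i) for i in range(n)) / n
        for _ in range(inner_steps):
            i = rng.randrange(n)
            # Variance-reduced direction: unbiased, and its variance
            # shrinks as w and w_snap both approach the minimizer.
            w -= lr * (grad_i(w, i) - grad_i(w_snap, i) + full_grad)
    return w


# Hypothetical data for illustration only.
a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.0, 6.0, 8.5]
w_svrg = svrg(a, b)
w_exact = sum(ai * bi for ai, bi in zip(a, b)) / sum(ai * ai for ai in a)
print(w_svrg, w_exact)
```

The closed-form least-squares solution `w_exact` is computed only to check the result.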


Stochastic Gradient Descent on Highly-Parallel Architectures

Journal: CoRR 2018

Yujing Ma Florin Rusu Martin Torres

There is an increased interest in building data analytics frameworks with advanced algebraic capabilities both in industry and academia. Many of these frameworks, e.g., TensorFlow and BIDMach, implement their compute-intensive primitives in two flavors: as multi-thread routines for multi-core CPUs and as highly-parallel kernels executed on GPU. Stochastic gradient descent (SGD) is the most popula...


Batched Stochastic Gradient Descent with Weighted Sampling

Journal: CoRR 2016

Deanna Needell Rachel Ward

We analyze a batched variant of Stochastic Gradient Descent (SGD) with weighted sampling distribution for smooth and non-smooth objective functions. We show that by distributing the batches computationally, a significant speedup in the convergence rate is provably possible compared to either batched sampling or weighted sampling alone. We propose several computationally efficient schemes to app...


Momentum and Optimal Stochastic Search

1993

Genevieve B. Orr Todd K. Leen

The rate of convergence for gradient descent algorithms, both batch and stochastic, can be improved by including in the weight update a “momentum” term proportional to the previous weight update. Several authors [1, 2] give conditions for convergence of the mean and covariance of the weight vector for momentum LMS with constant learning rate. However, stochastic algorithms require that the learn...
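The momentum update described above, where each weight update includes a term proportional to the previous one, can be sketched as follows. This is a minimal scalar example with a constant learning rate, not the algorithm analyzed in the paper; the names `sgd_momentum` and `noisy_grad`, the parameter values, and the toy objective are assumptions.

```python
import random


def sgd_momentum(grad_sample, w0, lr=0.01, beta=0.9, steps=2000, seed=0):
    """SGD with a momentum term on a scalar weight.

    Update rule: delta_t = -lr * g_t + beta * delta_{t-1}; w_t = w_{t-1} + delta_t.
    The constant learning rate is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    w = w0
    delta = 0.0  # previous weight update
    for _ in range(steps):
        g = grad_sample(w, rng)
        delta = -lr * g + beta * delta  # momentum: reuse the previous update
        w += delta
    return w


# Hypothetical toy objective: E[(w - x)^2 / 2] with x ~ N(3, 1), minimized at w = 3.
def noisy_grad(w, rng):
    return w - rng.gauss(3.0, 1.0)


w_final = sgd_momentum(noisy_grad, w0=0.0)
print(w_final)
```

With a constant learning rate the weight keeps fluctuating around the minimizer rather than settling exactly on it.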


High Throughput Synchronous Distributed Stochastic Gradient Descent

2018

Michael Teng Frank Wood

We introduce a new, high-throughput, synchronous, distributed, data-parallel, stochastic-gradient-descent learning algorithm. This algorithm uses amortized inference in a compute-cluster-specific, deep, generative, dynamical model to perform joint posterior predictive inference of the mini-batch gradient computation times of all worker-nodes in a parallel computing cluster. We show that a synchro...


Privacy-preservation for Stochastic Gradient Descent Method

2012

Shuang Wu Jun Sakuma

The traditional paradigm in machine learning has been that given a data set, the goal is to learn a target function or decision model (such as a classifier) from it. Many techniques in data mining and machine learning follow a gradient descent paradigm in the iterative process of discovering this target function or decision model. For instance, linear regression can be resolved through a gradie...


Local Gain Adaptation in Stochastic Gradient Descent

1999

Nicol N. Schraudolph

Gain adaptation algorithms for neural networks typically adjust learning rates by monitoring the correlation between successive gradients. Here we discuss the limitations of this approach, and develop an alternative by extending Sutton’s work on linear systems to the general, nonlinear case. The resulting online algorithms are computationally little more expensive than other acceleration techni...


Adaptive Variance Reducing for Stochastic Gradient Descent

2016

Zebang Shen Hui Qian Tengfei Zhou Tongzhou Mu

Variance Reducing (VR) stochastic methods are fast-converging alternatives to the classical Stochastic Gradient Descent (SGD) for solving large-scale regularized finite sum problems, especially when a highly accurate solution is required. One critical step in VR is the function sampling. State-of-the-art VR algorithms such as SVRG and SAGA employ either Uniform Probability (UP) or Importance P...

