Search results for: Stochastic Gradient Descent
Number of results: 258150
Gradient-based optimization methods are popular in machine learning applications. In large-scale problems, stochastic methods are preferred due to their good scaling properties. In this project, we compare the performance of four gradient-based methods: gradient descent, stochastic gradient descent, semi-stochastic gradient descent, and stochastic average gradient. We consider logistic regressio...
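A minimal sketch (not the project's code) contrasting two of the methods compared above, full-batch gradient descent and stochastic gradient descent, on L2-regularized logistic regression; X, y, the step sizes, and the regularization strength are illustrative assumptions.

    import numpy as np

    def logistic_grad(w, X, y, lam=0.01):
        # Gradient of the regularized logistic loss over the rows of X.
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        return X.T @ (p - y) / len(y) + lam * w

    def gradient_descent(X, y, steps=1000, lr=0.5):
        w = np.zeros(X.shape[1])
        for _ in range(steps):
            w -= lr * logistic_grad(w, X, y)          # full pass over the data
        return w

    def sgd(X, y, steps=10000, lr=0.1, seed=0):
        w = np.zeros(X.shape[1])
        rng = np.random.default_rng(seed)
        for _ in range(steps):
            i = rng.integers(len(y))                  # one sampled example per step
            w -= lr * logistic_grad(w, X[i:i+1], y[i:i+1])
        return w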
In the Bayesian approach to probabilistic modeling of data, we select a model for the probabilities of the data that depends on a continuous vector of parameters. For a given data set, Bayes' theorem gives a probability distribution over the model parameters. The inference of outcomes and probabilities of new data can then be found by averaging over the parameter distribution of the model, which is an intr...
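A hedged illustration of the averaging step described above: given samples from the posterior over parameters, the predictive probability for a new point is approximated by a Monte Carlo average. The names posterior_samples and likelihood are hypothetical placeholders, not from the abstract.

    import numpy as np

    def posterior_predictive(x_new, posterior_samples, likelihood):
        # p(x_new | data) ~= (1/S) * sum_s p(x_new | theta_s), theta_s ~ p(theta | data)
        return np.mean([likelihood(x_new, theta) for theta in posterior_samples])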
This paper studies the problem of distributed stochastic optimization in an adversarial setting where, out of the m machines which allegedly compute stochastic gradients every iteration, an α-fraction are Byzantine, and can behave arbitrarily and adversarially. Our main result is a variant of stochastic gradient descent (SGD) which finds ε-approximate minimizers of convex functions in T = Õ(1...
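For intuition, a minimal sketch of one common defense in this setting: aggregate the m workers' gradients with a coordinate-wise median instead of a plain mean, so that an α-fraction of arbitrary (Byzantine) gradients cannot move the update unboundedly. This is a generic robust-aggregation illustration, not necessarily the aggregation rule analyzed in the paper.

    import numpy as np

    def robust_sgd_step(w, worker_grads, lr=0.01):
        # worker_grads: list of m gradient vectors, some possibly adversarial.
        g = np.median(np.stack(worker_grads), axis=0)  # coordinate-wise median
        return w - lr * g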
With the increase in available data, parallel machine learning has become an increasingly pressing problem. In this paper we present the first parallel stochastic gradient descent algorithm, including a detailed analysis and experimental evidence. Unlike prior work on parallel optimization algorithms [5, 7], our variant comes with parallel acceleration guarantees and it poses n...
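A hedged sketch of the general pattern such parallel SGD algorithms build on: run SGD independently on disjoint data shards, then average the resulting parameters. The shard count and the sgd routine passed in are illustrative assumptions, not the paper's algorithm.

    import numpy as np

    def parallel_sgd(X, y, sgd, n_workers=4):
        # Split the data into disjoint shards, run SGD on each shard
        # (in practice on separate machines/processes), then average.
        shards = zip(np.array_split(X, n_workers), np.array_split(y, n_workers))
        weights = [sgd(Xs, ys) for Xs, ys in shards]
        return np.mean(weights, axis=0)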
Stochastic gradient descent (SGD) is still the workhorse for many practical problems. However, it converges slowly and can be difficult to tune. It is possible to precondition SGD to accelerate its convergence remarkably, but many attempts in this direction either aim at solving specialized problems or result in significantly more complicated methods than SGD. This paper proposes a new method t...
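To illustrate what preconditioning SGD means in practice, here is a generic diagonal preconditioner (an Adagrad/RMSProp-style running estimate of squared gradients). This is a sketch of the idea, not the method proposed in the paper; grad_fn, the decay rate, and epsilon are assumptions.

    import numpy as np

    def preconditioned_sgd(w, grad_fn, steps=1000, lr=0.01, beta=0.99, eps=1e-8):
        v = np.zeros_like(w)                          # running second-moment estimate
        for _ in range(steps):
            g = grad_fn(w)                            # stochastic gradient
            v = beta * v + (1.0 - beta) * g * g
            w = w - lr * g / (np.sqrt(v) + eps)       # per-coordinate rescaling
        return w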
Chapter 1 strongly advocates the stochastic back-propagation method to train neural networks. This is in fact an instance of a more general technique called stochastic gradient descent (SGD). This chapter provides background material, explains why SGD is a good learning algorithm when the training set is large, and provides useful recommendations.
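A minimal sketch of the stochastic back-propagation / SGD recipe the chapter discusses: repeatedly sample a small batch, back-propagate the loss, and step against the gradient. The params layout, grad_fn, and data are hypothetical placeholders.

    import numpy as np

    def sgd_train(params, grad_fn, data, epochs=10, batch_size=32, lr=0.01, seed=0):
        rng = np.random.default_rng(seed)
        n = len(data)
        for _ in range(epochs):
            for idx in np.array_split(rng.permutation(n), max(1, n // batch_size)):
                batch = [data[i] for i in idx]
                grads = grad_fn(params, batch)        # back-propagation on the batch
                params = [p - lr * g for p, g in zip(params, grads)]
        return params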
There is widespread sentiment that fast gradient methods (e.g. Nesterov’s acceleration, conjugate gradient, heavy ball) are not effective for the purposes of stochastic optimization due to their instability and error accumulation. Numerous works have attempted to quantify these instabilities in the face of either statistical or non-statistical errors (Paige, 1971; Proakis, 1974; Polyak, 1987; G...
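For reference, a hedged sketch of two of the accelerated updates the abstract refers to, written in their stochastic form: heavy-ball (Polyak) momentum and Nesterov's accelerated gradient. grad_fn returns a stochastic gradient; the step size and momentum coefficient are illustrative.

    import numpy as np

    def heavy_ball_sgd(w, grad_fn, steps=1000, lr=0.01, beta=0.9):
        v = np.zeros_like(w)
        for _ in range(steps):
            v = beta * v - lr * grad_fn(w)            # accumulate a velocity term
            w = w + v
        return w

    def nesterov_sgd(w, grad_fn, steps=1000, lr=0.01, beta=0.9):
        v = np.zeros_like(w)
        for _ in range(steps):
            v = beta * v - lr * grad_fn(w + beta * v) # gradient at the look-ahead point
            w = w + v
        return w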
Prior work has demonstrated that exploiting sparsity can dramatically improve the energy efficiency and reduce the memory footprint of Convolutional Neural Networks (CNNs). However, these sparsity-centric optimization techniques might be less effective for Long Short-Term Memory (LSTM) based Recurrent Neural Networks (RNNs), especially for the training phase, because of the significant stru...