Search results for: gradient descent

Number of results: 137892

Journal: CoRR 2017
Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Aaron Sidford

There is widespread sentiment that fast gradient methods (e.g. Nesterov’s acceleration, conjugate gradient, heavy ball) are not effective for the purposes of stochastic optimization due to their instability and error accumulation. Numerous works have attempted to quantify these instabilities in the face of either statistical or non-statistical errors (Paige, 1971; Proakis, 1974; Polyak, 1987; G...
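As a rough illustration of the kind of update these fast gradient methods share (not this paper's algorithm or analysis), a stochastic heavy-ball iteration can be sketched as follows; the objective, step size, and momentum value are placeholders.

import numpy as np

def heavy_ball_sgd(grad_fn, w0, lr=0.01, momentum=0.9, n_steps=1000):
    # Stochastic heavy-ball update: w_{t+1} = w_t - lr*g_t + momentum*(w_t - w_{t-1}),
    # where grad_fn(w) returns a noisy (stochastic) gradient estimate.
    w_prev, w = w0.copy(), w0.copy()
    for _ in range(n_steps):
        g = grad_fn(w)
        w_next = w - lr * g + momentum * (w - w_prev)
        w_prev, w = w, w_next
    return w

# Toy usage: noisy gradients of the quadratic f(w) = 0.5 * ||w||^2.
rng = np.random.default_rng(0)
w_min = heavy_ball_sgd(lambda w: w + 0.1 * rng.standard_normal(w.shape), np.ones(5))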

2016
Michael Tetelman

In the Bayesian approach to probabilistic modeling of data, we select a model for the probabilities of the data that depends on a continuous vector of parameters. For a given data set, Bayes' theorem gives a probability distribution over the model parameters. The inference of outcomes and probabilities for new data can then be found by averaging over the parameter distribution of the model, which is an intr...
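The averaging step described here is ordinary Bayesian model averaging; a minimal Monte Carlo sketch, assuming posterior samples are already available (e.g. from MCMC), looks like the following. The function names are illustrative.

import numpy as np

def posterior_predictive(likelihood_fn, posterior_samples, x_new):
    # Approximate p(y | x_new, data) = E_{theta ~ posterior}[ p(y | x_new, theta) ]
    # by averaging the likelihood over sampled parameter vectors.
    return np.mean([likelihood_fn(x_new, theta) for theta in posterior_samples], axis=0)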

2005
Kim L. Blackmore, Robert C. Williamson, Iven M. Y. Mareels

Stepwise Gradient Descent (SGD) algorithms for online optimization converge to local minima of the relevant cost function. In this paper, a globally convergent modification of SGD is proposed, in which several solutions of SGD are run in parallel, together with online estimates of the cost function and its gradient. As each SGD estimate reaches a local minimum of the cost, the fitness of the mem...
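A toy stand-in for the parallel scheme described above (not the paper's exact algorithm, which relies on online cost and gradient estimates and a fitness-based rule) is to run several SGD iterates from different starting points and return the one with the lowest estimated cost:

import numpy as np

def parallel_sgd(cost_fn, grad_fn, inits, lr=0.01, n_steps=5000):
    # Run one SGD iterate per starting point and keep the best by final cost.
    iterates = [w.copy() for w in inits]
    for _ in range(n_steps):
        iterates = [w - lr * grad_fn(w) for w in iterates]
    return min(iterates, key=cost_fn)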

2007
Cun-Hui Zhang

This article derives characterizations and computational algorithms for continuous general gradient descent trajectories in high-dimensional parameter spaces for statistical model selection, prediction, and classification. Examples include proportional gradient shrinkage as an extension of LASSO and LARS, threshold gradient descent with right-continuous variable selectors, threshold ridge regre...
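For concreteness, one common formulation of threshold gradient descent (in the spirit of the trajectories studied here, though the article's own construction is continuous and more general) updates only the coordinates whose gradient magnitude is within a factor tau of the largest one; the parameter names below are illustrative.

import numpy as np

def threshold_gradient_descent(grad_fn, w0, tau=0.9, lr=0.01, n_steps=1000):
    # Coordinates with |g_j| >= tau * max_k |g_k| are updated; the rest stay fixed,
    # which acts as a right-continuous variable selector and yields sparse paths.
    w = w0.copy()
    for _ in range(n_steps):
        g = grad_fn(w)
        mask = np.abs(g) >= tau * np.abs(g).max()
        w = w - lr * g * mask
    return w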

2018
Dan Alistarh, Zeyuan Allen-Zhu, Jerry Li

This paper studies the problem of distributed stochastic optimization in an adversarial setting where, out of the m machines which allegedly compute stochastic gradients every iteration, an α-fraction are Byzantine, and can behave arbitrarily and adversarially. Our main result is a variant of stochastic gradient descent (SGD) which finds ε-approximate minimizers of convex functions in T = Õ ( 1...
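One standard defense in this setting (illustrative only; the paper's SGD variant is more involved and comes with the stated convergence guarantees) is to replace the average of the workers' gradients with a robust aggregate such as the coordinate-wise median:

import numpy as np

def robust_sgd_step(w, worker_grads, lr=0.01):
    # Aggregate the m reported gradients with a coordinate-wise median so that
    # an alpha-fraction of Byzantine workers cannot arbitrarily skew the step.
    g = np.median(np.stack(worker_grads), axis=0)
    return w - lr * g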

2016
Maohua Zhu, Yuan Xie, Minsoo Rhu, Jason Clemons, Stephen W. Keckler

Prior work has demonstrated that exploiting sparsity can dramatically improve the energy efficiency and reduce the memory footprint of Convolutional Neural Networks (CNNs). However, these sparsity-centric optimization techniques might be less effective for Long Short-Term Memory (LSTM) based Recurrent Neural Networks (RNNs), especially for the training phase, because of the significant stru...

2016
Philip S. Thomas, Bruno Castro da Silva, Christoph Dann, Emma Brunskill

We propose a new class of algorithms for minimizing or maximizing functions of parametric probabilistic models. These new algorithms are natural gradient algorithms that leverage more information than prior methods by using a new metric tensor in place of the commonly used Fisher information matrix. This new metric tensor is derived by computing directions of steepest ascent where the distance ...
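For reference, a natural-gradient step preconditions the ordinary gradient by the inverse of a metric tensor; classically that tensor is the Fisher information matrix, and the paper above substitutes a different metric. A minimal sketch, solving a linear system rather than forming an explicit inverse:

import numpy as np

def natural_gradient_step(theta, grad, metric, lr=0.1):
    # theta_{t+1} = theta_t - lr * G^{-1} grad, with G the chosen metric tensor.
    direction = np.linalg.solve(metric, grad)
    return theta - lr * direction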

2018
Samuel L. Smith, Quoc V. Le

We consider two questions at the heart of machine learning: how can we predict whether a minimum will generalize to the test set, and why does stochastic gradient descent find minima that generalize well? Our work responds to Zhang et al. (2016), who showed deep neural networks can easily memorize randomly labeled training data, despite generalizing well on real labels of the same inputs. We show th...

2014
George Papamakarios

Gradient-based optimization methods are popular in machine learning applications. In large-scale problems, stochastic methods are preferred due to their good scaling properties. In this project, we compare the performance of four gradient-based methods: gradient descent, stochastic gradient descent, semi-stochastic gradient descent, and stochastic average gradient. We consider logistic regressio...
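Two of the four methods compared here can be sketched directly for logistic regression; the step sizes and iteration counts below are placeholders, and the semi-stochastic and stochastic-average-gradient variants are omitted for brevity.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_grad(w, X, y):
    # Gradient of the mean logistic loss over the rows of X (labels y in {0, 1}).
    return X.T @ (sigmoid(X @ w) - y) / len(y)

def gradient_descent(X, y, lr=0.5, n_steps=200):
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):
        w -= lr * logistic_grad(w, X, y)                      # full-batch gradient
    return w

def stochastic_gradient_descent(X, y, lr=0.5, n_steps=200, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):
        i = rng.integers(len(y))
        w -= lr * logistic_grad(w, X[i:i + 1], y[i:i + 1])    # single-example gradient
    return w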

2004
Mengjie Zhang, William D. Smart

This paper describes an approach to the use of gradient descent search in genetic programming (GP) for object classification problems. Gradient descent search is introduced to the GP mechanism and is embedded into the genetic beam search, which allows the evolutionary learning process to globally follow the beam search and locally follow the gradient descent search. Two different methods, an on...

[Chart: number of search results per year; click on the chart to filter results by publication year]