stochastic gradient descent learning

نتایج جستجو برای: stochastic gradient descent learning

تعداد نتایج: 840759 فیلتر نتایج به سال:

A Fixed-Point of View on Gradient Methods for Big Data

2017

Alexander Jung

Interpreting gradient methods as fixed-point iterations, we provide a detailed analysis of those methods for minimizing convex objective functions. Due to their conceptual and algorithmic simplicity, gradient methods are widely used in machine learning for massive data sets (big data). In particular, stochastic gradient methods are considered the de-facto standard for training deep neural netwo...

متن کامل

On the Convergence of SGD Training of Neural Networks

Journal: :CoRR 2015

Thomas M. Breuel

Neural networks are usually trained by some form of stochastic gradient descent (SGD)). A number of strategies are in common use intended to improve SGD optimization, such as learning rate schedules, momentum, and batching. These are motivated by ideas about the occurrence of local minima at different scales, valleys, and other phenomena in the objective function. Empirical results presented he...

متن کامل

Reinforcement Learning for Learning Rate Control

Journal: :CoRR 2017

Chang Xu Tao Qin Gang Wang Tie-Yan Liu

Stochastic gradient descent (SGD), which updates the model parameters by adding a local gradient times a learning rate at each step, is widely used in model training of machine learning algorithms such as neural networks. It is observed that the models trained by SGD are sensitive to learning rates and good learning rates are problem specific. We propose an algorithm to automatically learn lear...

متن کامل

Global Learning of Neural Networks by Using Hybrid Optimization Algorithm

2007

Yong-Hyun Cho Seong-Jun Hong

This paper proposes a global learning of neural networks by hybrid optimization algorithm. The hybrid algorithm combines a stochastic approximation with a gradient descent. The stochastic approximation is first applied for estimating an approximation point inclined toward a global escaping from a local minimum, and then the backpropagation(BP) algorithm is applied for high-speed convergence as ...

متن کامل

Microscale Adaptive Optics: Wave-Front Control with a mu-Mirror Array and a VLSI Stochastic Gradient Descent Controller.

Journal: :Applied optics 2001

T Weyrauch M A Vorontsov T G Bifano J A Hammer M Cohen G Cauwenberghs

The performance of adaptive systems that consist of microscale on-chip elements [microelectromechanical mirror (mu-mirror) arrays and a VLSI stochastic gradient descent microelectronic control system] is analyzed. The mu-mirror arrays with 5 x 5 and 6 x 6 actuators were driven with a control system composed of two mixed-mode VLSI chips implementing model-free beam-quality metric optimization by...

متن کامل

Variational Stochastic Gradient Descent

2016

Michael Tetelman

In Bayesian approach to probabilistic modeling of data we select a model for probabilities of data that depends on a continuous vector of parameters. For a given data set Bayesian theorem gives a probability distribution of the model parameters. Then the inference of outcomes and probabilities of new data could be found by averaging over the parameter distribution of the model, which is an intr...

متن کامل

Byzantine Stochastic Gradient Descent

2018

Dan Alistarh Zeyuan Allen-Zhu Jerry Li

This paper studies the problem of distributed stochastic optimization in an adversarial setting where, out of the m machines which allegedly compute stochastic gradients every iteration, an α-fraction are Byzantine, and can behave arbitrarily and adversarially. Our main result is a variant of stochastic gradient descent (SGD) which finds ε-approximate minimizers of convex functions in T = Õ ( 1...

متن کامل

When is a Convolutional Filter Easy To Learn?

Journal: :CoRR 2017

Simon S. Du Jason D. Lee Yuandong Tian

We analyze the convergence of (stochastic) gradient descent algorithm for learning a convolutional filter with Rectified Linear Unit (ReLU) activation function. Our analysis does not rely on any specific form of the input distribution and our proofs only use the definition of ReLU, in contrast with previous works that are restricted to standard Gaussian input. We show that (stochastic) gradient...

متن کامل

Parallelized Stochastic Gradient Descent

2010

Martin Zinkevich Markus Weimer Alexander J. Smola Lihong Li

With the increase in available data parallel machine learning has become an in-creasingly pressing problem. In this paper we present the first parallel stochasticgradient descent algorithm including a detailed analysis and experimental evi-dence. Unlike prior work on parallel optimization algorithms [5, 7] our variantcomes with parallel acceleration guarantees and it poses n...

متن کامل

Preconditioned Stochastic Gradient Descent

Journal: :IEEE transactions on neural networks and learning systems 2017

Xi-Lin Li

Stochastic gradient descent (SGD) still is the workhorse for many practical problems. However, it converges slow, and can be difficult to tune. It is possible to precondition SGD to accelerate its convergence remarkably. But many attempts in this direction either aim at solving specialized problems, or result in significantly more complicated methods than SGD. This paper proposes a new method t...

متن کامل

نمودار تعداد نتایج جستجو در هر سال

با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید