stochastic gradient descent learning

نتایج جستجو برای: stochastic gradient descent learning

تعداد نتایج: 840759 فیلتر نتایج به سال:

On the insufficiency of existing momentum schemes for Stochastic Optimization

2018

Rahul Kidambi Praneeth Netrapalli Prateek Jain Sham M. Kakade

Momentum based stochastic gradient methods such as heavy ball (HB) and Nesterov’s accelerated gradient descent (NAG) method are widely used in practice for training deep networks and other supervised learning models, as they often provide significant improvements over stochastic gradient descent (SGD). Rigorously speaking, “fast gradient” methods have provable improvements over gradient descent...

متن کامل

On the Insufficiency of Existing Momentum Schemes for Stochastic Optimization

2018

Momentum based stochastic gradient methods such as heavy ball (HB) and Nesterov’s accelerated gradient descent (NAG) method are widely used in practice for training deep networks and other supervised learning models, as they often provide significant improvements over stochastic gradient descent (SGD). In general, “fast gradient” methods have provable improvements over gradient descent only for...

متن کامل

Projected Semi-Stochastic Gradient Descent Method with Mini-Batch Scheme under Weak Strong Convexity Assumption

Journal: :CoRR 2016

Jie Liu Martin Takác

We propose a projected semi-stochastic gradient descent method with mini-batch for improving both the theoretical complexity and practical performance of the general stochastic gradient descent method (SGD). We are able to prove linear convergence under weak strong convexity assumption. This requires no strong convexity assumption for minimizing the sum of smooth convex functions subject to a c...

متن کامل

Generalization Bounds for Randomized Learning with Application to Stochastic Gradient Descent

2016

Ben London

Randomized algorithms are central to modern machine learning. In the presence of massive datasets, researchers often turn to stochastic optimization to solve learning problems. Of particular interest is stochastic gradient descent (SGD), a first-order method that approximates the learning objective and gradient by a random point estimate. A classical question in learning theory is, if a randomi...

متن کامل

A Structural Representation Learning for Multi-relational Networks

2017

Lin Liu Xin Li William Kwok-Wai Cheung Chengcheng Xu

Most of the existing multi-relational network embedding methods, e.g., TransE, are formulated to preserve pair-wise connectivity structures in the networks. With the observations that significant triangular connectivity structures and parallelogram connectivity structures found in many real multi-relational networks are often ignored and that a hard-constraint commonly adopted by most of the ne...

متن کامل

Online Learning Rate Adaptation with Hypergradient Descent

Journal: :CoRR 2017

Atilim Gunes Baydin Robert Cornish David Martinez Rubio Mark Schmidt Frank D. Wood

We introduce a general method for improving the convergence rate of gradientbased optimizers that is easy to implement and works well in practice. We demonstrate the effectiveness of the method in a range of optimization problems by applying it to stochastic gradient descent, stochastic gradient descent with Nesterov momentum, and Adam, showing that it significantly reduces the need for the man...

متن کامل

Semi-Stochastic Gradient Descent Methods

Journal: :Frontiers in Applied Mathematics and Statistics 2017

متن کامل

Learning Curves for Stochastic Gradient Descent in Linear Feedforward Networks

Journal: :Neural Computation 2005

متن کامل

Normalized stochastic gradient descent learning of general complex‐valued models

Journal: :Electronics Letters 2021

متن کامل

Uniform Learning in a Deep Neural Network via "Oddball" Stochastic Gradient Descent

Journal: :CoRR 2015

Andrew J. R. Simpson

When training deep neural networks, it is typically assumed that the training examples are uniformly difficult to learn. Or, to restate, it is assumed that the training error will be uniformly distributed across the training examples. Based on these assumptions, each training example is used an equal number of times. However, this assumption may not be valid in many cases. “Oddball SGD” (novelt...

متن کامل

نمودار تعداد نتایج جستجو در هر سال

با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید