stochastic gradient descent learning

Improving Stochastic Gradient Descent with Feedback

2016

Hiroaki Hayashi

In this paper we propose a simple and efficient method for improving stochastic gradient descent methods by using feedback from the objective function. The method tracks the relative changes in the objective function with a running average, and uses it to adaptively tune the learning rate in stochastic gradient descent. We specifically apply this idea to modify Adam, a popular algorithm for tra...

Energy-entropy competition and the effectiveness of stochastic gradient descent in machine learning

2018

Yao Zhang Andrew M. Saxe Madhu S. Advani Alpha A. Lee

Finding parameters that minimise a loss function is at the core of many machine learning methods. The Stochastic Gradient Descent algorithm is widely used and delivers state of the art results for many problems. Nonetheless, Stochastic Gradient Descent typically cannot find the global minimum, thus its empirical effectiveness is hitherto mysterious. We derive a correspondence between parameter ...

Without-Replacement Sampling for Stochastic Gradient Methods: Convergence Results and Application to Distributed Optimization

Journal: CoRR 2016

Ohad Shamir

Stochastic gradient methods for machine learning and optimization problems are usually analyzed assuming data points are sampled with replacement. In practice, however, sampling without replacement is very common, easier to implement in many cases, and often performs better. In this paper, we provide competitive convergence guarantees for without-replacement sampling, under various scenarios, f...
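
The difference between the two sampling schemes can be illustrated with a small sketch. This is a toy scalar least-squares problem, not the setting analyzed in the paper; the function name `sgd_least_squares`, the data, and all hyperparameters are assumptions chosen for illustration only.

```python
import random

def sgd_least_squares(xs, ys, lr=0.05, epochs=50, replace=False, seed=0):
    """Fit y ~ w * x with SGD on squared error (hypothetical toy setup)."""
    rng = random.Random(seed)
    n = len(xs)
    w = 0.0
    for _ in range(epochs):
        if replace:
            # With replacement: n independent uniform draws per epoch,
            # so some points may repeat and others may be skipped.
            order = [rng.randrange(n) for _ in range(n)]
        else:
            # Without replacement: a fresh random permutation per epoch,
            # so every point is visited exactly once.
            order = list(range(n))
            rng.shuffle(order)
        for i in order:
            grad = (w * xs[i] - ys[i]) * xs[i]  # d/dw of 0.5 * (w*x - y)^2
            w -= lr * grad
    return w

xs = [0.5, 1.0, 1.5, 2.0]
ys = [2.0 * x for x in xs]  # noise-free data, true slope is 2
w_without = sgd_least_squares(xs, ys, replace=False)
w_with = sgd_least_squares(xs, ys, replace=True)
```

On this noise-free example both variants approach the true slope; the sketch only shows where the two schemes differ in code and says nothing about their convergence rates.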

Improving Stochastic Gradient Descent with Feedback

Journal: CoRR 2016

Jayanth Koushik Hiroaki Hayashi

In this paper we propose a simple and efficient method for improving stochastic gradient descent methods by using feedback from the objective function. The method tracks the relative changes in the objective function with a running average, and uses it to adaptively tune the learning rate in stochastic gradient descent. We specifically apply this idea to modify Adam, a popular algorithm for tra...
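
The general idea of feeding objective values back into the learning rate can be sketched as follows. Because the abstract is truncated, this is not the authors' update rule and does not modify Adam: it uses plain deterministic gradient descent, and the damping formula `base_lr / (1 + avg_change)`, the function name `gd_with_feedback`, and all constants are assumptions made for illustration.

```python
def gd_with_feedback(grad, loss, w0, base_lr=0.1, beta=0.9, steps=200):
    """Gradient descent whose step size reacts to relative loss changes.

    Hypothetical rule (not from the paper): keep a running average of the
    relative change in the objective and shrink the step when it is large.
    """
    w = w0
    prev = loss(w)
    avg_change = 0.0
    for _ in range(steps):
        lr = base_lr / (1.0 + avg_change)  # assumed damping rule
        w -= lr * grad(w)
        cur = loss(w)
        # Relative change of the objective between consecutive steps.
        rel = abs(cur - prev) / max(min(cur, prev), 1e-12)
        # Exponential running average of the relative change.
        avg_change = beta * avg_change + (1.0 - beta) * rel
        prev = cur
    return w

loss = lambda w: (w - 3.0) ** 2 + 1.0  # strictly positive toy objective
grad = lambda w: 2.0 * (w - 3.0)
w_final = gd_with_feedback(grad, loss, w0=0.0)
```

Here the step size never exceeds `base_lr`, so the feedback can only slow the iteration down while the objective is changing quickly; other choices of how the running average enters the step size are equally plausible.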

Stochastic modified equations for the asynchronous stochastic gradient descent

Journal: Information and Inference: A Journal of the IMA 2019

Toward a Noncommutative Arithmetic-geometric Mean Inequality: Conjectures, Case-studies, and Consequences

2012

Benjamin Recht Christopher Ré

Randomized algorithms that base iteration-level decisions on samples from some pool are ubiquitous in machine learning and optimization. Examples include stochastic gradient descent and randomized coordinate descent. This paper makes progress at theoretically evaluating the difference in performance between sampling with- and without-replacement in such algorithms. Focusing on least means squares...

Fast Stochastic Variance Reduced Gradient Method with Momentum Acceleration for Machine Learning

Journal: CoRR 2017

Fanhua Shang Yuanyuan Liu James Cheng Jiacheng Zhuo

Recently, research on accelerated stochastic gradient descent methods (e.g., SVRG) has made exciting progress (e.g., linear convergence for strongly convex problems). However, the best-known methods (e.g., Katyusha) require at least two auxiliary variables and two momentum parameters. In this paper, we propose a fast stochastic variance reduction gradient (FSVRG) method...

Continuous-Time Limit of Stochastic Gradient Descent Revisited

2015

Stephan Mandt Matthew D. Hoffman David M. Blei

Stochastic Gradient Descent (SGD) is an important algorithm in machine learning. With constant learning rates, it is a stochastic process that reaches a stationary distribution. We revisit an analysis of SGD in terms of stochastic differential equations in the limit of small constant gradient steps. This limit, which we feel is not appreciated in the machine learning community, allows us to app...
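
The claim that constant-rate SGD settles into a stationary distribution can be checked on a toy case. The sketch below is not the paper's analysis: it assumes a one-dimensional quadratic objective 0.5 * w^2 with additive Gaussian gradient noise, for which the iteration is a linear AR(1) recursion with stationary variance lr * sigma^2 / (2 - lr). The function name `simulate_constant_lr_sgd` and all parameter values are hypothetical.

```python
import random

def simulate_constant_lr_sgd(lr=0.1, sigma=1.0, steps=200_000,
                             burn_in=1_000, seed=0):
    """Run SGD on 0.5 * w^2 with noisy gradients; return sample mean and variance."""
    rng = random.Random(seed)
    w = 5.0  # arbitrary starting point, forgotten after burn-in
    samples = []
    for t in range(steps + burn_in):
        noisy_grad = w + rng.gauss(0.0, sigma)  # true gradient w plus noise
        w -= lr * noisy_grad
        if t >= burn_in:
            samples.append(w)
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return mean, var

lr, sigma = 0.1, 1.0
mean, var = simulate_constant_lr_sgd(lr=lr, sigma=sigma)
# Closed form for this linear recursion: V = (1 - lr)^2 * V + lr^2 * sigma^2.
predicted_var = lr * sigma ** 2 / (2.0 - lr)
```

The iterates do not converge to the minimizer; they fluctuate around it with a spread set by the learning rate and the noise level, and the empirical variance should land close to `predicted_var`.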

A Stochastic Gradient Descent Approach for Stochastic Optimal Control

Journal: East Asian Journal on Applied Mathematics 2020

Identification of Multiple Input-multiple Output Non-linear System Cement Rotary Kiln using Stochastic Gradient-based Rough-neural Network

Journal: Journal of Artificial Intelligence and Data Mining 2020

Gh. Ahmadi, M. Teshnelab

Because of the existing interactions among the variables of a multiple input-multiple output (MIMO) nonlinear system, its identification is a difficult task, particularly in the presence of uncertainties. Cement rotary kiln (CRK) is a MIMO nonlinear system in the cement factory with a complicated mechanism and uncertain disturbances. The identification of CRK is very important for different pur...
