Search results for: stochastic gradient descent

Number of results: 258,150

Journal: CoRR 2017
Xiangru Lian, Wei Zhang, Ce Zhang, Ji Liu

Recent work shows that decentralized parallel stochastic gradient descent (D-PSGD) can outperform its centralized counterpart both theoretically and practically. While asynchronous parallelism is a powerful technique for improving the efficiency of parallelism in distributed machine learning platforms and has been widely used in many popular machine learning software packages and solvers based on centrali...
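
For intuition, here is a minimal sketch of one D-PSGD iteration on a toy least-squares problem, assuming a fixed ring topology and a doubly stochastic mixing matrix; the worker count, step size, and mixing weights are illustrative choices, not values from the paper.

    # Minimal D-PSGD-style sketch: each worker averages with its neighbors,
    # then takes a local stochastic gradient step.
    import numpy as np

    def dpsgd_step(params, grads, mixing, lr):
        """params: (n_workers, dim) local models; grads: matching stochastic gradients;
        mixing: doubly stochastic matrix encoding the communication graph."""
        return mixing @ params - lr * grads

    rng = np.random.default_rng(0)
    n_workers, dim = 4, 3
    # Ring topology: each worker mixes equally with itself and its two neighbors.
    mixing = np.zeros((n_workers, n_workers))
    for i in range(n_workers):
        mixing[i, i] = mixing[i, (i - 1) % n_workers] = mixing[i, (i + 1) % n_workers] = 1 / 3

    A, b = rng.normal(size=(32, dim)), rng.normal(size=32)
    params = rng.normal(size=(n_workers, dim))
    for _ in range(200):
        idx = rng.integers(0, 32, size=n_workers)          # one random sample per worker
        grads = np.stack([(A[i] @ p - b[i]) * A[i] for i, p in zip(idx, params)])
        params = dpsgd_step(params, grads, mixing, lr=0.05)
    print(params.mean(axis=0))   # local models drift toward consensus near the least-squares solution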

Journal: Journal of Machine Learning Research 2017
Stephan Mandt, Matthew D. Hoffman, David M. Blei

Stochastic Gradient Descent with a constant learning rate (constant SGD) simulates a Markov chain with a stationary distribution. With this perspective, we derive several new results. (1) We show that constant SGD can be used as an approximate Bayesian posterior inference algorithm. Specifically, we show how to adjust the tuning parameters of constant SGD to best match the stationary distributi...
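
A small numerical sketch of this Markov-chain view, assuming a one-dimensional quadratic loss with Gaussian gradient noise (for which the stationary variance has a closed form), is given below; the step size and noise level are illustrative, not values from the paper.

    # Constant-learning-rate SGD on f(theta) = theta**2 / 2 with Gaussian gradient noise
    # behaves like a Markov chain; after burn-in the iterates follow a stationary distribution.
    import numpy as np

    rng = np.random.default_rng(1)
    eta, noise_sd, steps = 0.1, 1.0, 50_000
    theta, samples = 5.0, []
    for t in range(steps):
        grad = theta + noise_sd * rng.normal()   # noisy gradient of the quadratic loss
        theta -= eta * grad                      # constant-learning-rate SGD update
        if t > 1_000:                            # discard burn-in, keep the stationary phase
            samples.append(theta)

    # For this linear-Gaussian chain the stationary variance is eta * noise_sd**2 / (2 - eta).
    print(np.var(samples), eta * noise_sd**2 / (2 - eta))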

Journal: CoRR 2015
Panos Toulis, Dustin Tran, Edoardo M. Airoldi

Iterative procedures for parameter estimation based on stochastic gradient descent allow the estimation to scale to massive data sets. However, in both theory and practice, they suffer from numerical instability. Moreover, they are statistically inefficient as estimators of the true parameter value. To address these two issues, we propose a new iterative procedure termed AISGD. For statistical ...
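
As a rough illustration, the sketch below runs implicit SGD with Polyak-Ruppert averaging on a least-squares problem, where the implicit update has a closed form; the paper's AISGD procedure may differ in its step-size schedule and averaging details, so treat this only as a sketch of the idea.

    # Averaged implicit SGD for least squares: the implicit update
    # theta_t = theta_{t-1} - gamma * grad(theta_t) can be solved in closed form.
    import numpy as np

    rng = np.random.default_rng(2)
    n, dim = 5_000, 4
    theta_true = rng.normal(size=dim)
    X = rng.normal(size=(n, dim))
    y = X @ theta_true + 0.1 * rng.normal(size=n)

    theta = np.zeros(dim)
    theta_bar = np.zeros(dim)
    for t in range(n):
        x, target = X[t], y[t]
        gamma = 1.0 / (1 + t) ** 0.6                            # decaying step size
        # Closed-form implicit update for the squared loss on one example.
        theta = theta + gamma * (target - x @ theta) * x / (1 + gamma * (x @ x))
        theta_bar += (theta - theta_bar) / (t + 1)              # Polyak-Ruppert averaging
    print(np.linalg.norm(theta_bar - theta_true))               # small residual error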

Journal: CoRR 2017
Ilja Kuzborskij, Christoph H. Lampert

We establish a data-dependent notion of algorithmic stability for Stochastic Gradient Descent (SGD) and employ it to develop novel generalization bounds. This is in contrast to previous distribution-free algorithmic stability results for SGD which depend on the worst-case constants. By virtue of the data-dependent argument, our bounds provide new insights into learning with SGD on convex and non...
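
For context, the distribution-free notion the abstract contrasts with is uniform stability; a standard statement (not the paper's data-dependent variant) is that an algorithm A is epsilon-uniformly stable if, for all datasets S and S' differing in a single example,

    \sup_{z}\; \mathbb{E}_{A}\bigl[\ell(A(S); z) - \ell(A(S'); z)\bigr] \le \varepsilon ,

and uniform stability implies the in-expectation generalization bound

    \bigl|\, \mathbb{E}_{S,A}\bigl[ R(A(S)) - \hat{R}_S(A(S)) \bigr] \,\bigr| \le \varepsilon .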

1996
W. B. Gong

This paper proves convergence of a sample-path based stochastic gradient-descent algorithm for optimizing expected-value performance measures in discrete event systems. The algorithm uses increasing precision at successive iterations, and it moves against the direction of a generalized gradient of the computed sample performance function. Two convergence results are established: one, for the ca...
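
The "increasing precision" idea can be sketched on a toy noisy objective: at iteration k the gradient is averaged over a growing number of sample paths, so the estimate's noise shrinks as the iterates progress. The objective, schedule, and step size below are illustrative stand-ins for the discrete event simulation studied in the paper.

    # Sample-path gradient descent with precision that grows with the iteration count.
    import numpy as np

    rng = np.random.default_rng(3)

    def sample_gradient(theta, n_paths):
        # Average n_paths noisy evaluations of the gradient of E[f(theta)] = (theta - 2)**2.
        return np.mean(2 * (theta - 2) + rng.normal(size=n_paths))

    theta, lr = 10.0, 0.05
    for k in range(1, 201):
        n_paths = k                                  # precision increases with the iteration
        theta -= lr * sample_gradient(theta, n_paths)
    print(theta)                                     # approaches the minimizer theta = 2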

1998
Vivek K Goyal, Martin Vetterli

The problem of computing the eigendecomposition of an N × N symmetric matrix is cast as an unconstrained minimization of either of two performance measures. The K = N(N−1)/2 independent parameters represent angles of distinct Givens rotations. Gradient descent is applied to the minimization problem, step size bounds for local convergence are given, and similarities to LMS adaptive filtering are n...
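
A compact sketch of the parameterization is given below: an orthogonal matrix built from K = N(N−1)/2 Givens angles, with numerical gradient descent on the off-diagonal energy of the rotated matrix. The paper derives analytic gradients and step-size bounds; this sketch only illustrates the setup on a small example.

    # Eigendecomposition as unconstrained minimization over Givens rotation angles.
    import numpy as np
    from itertools import combinations

    def rotation(angles, n):
        Q = np.eye(n)
        for (i, j), a in zip(combinations(range(n), 2), angles):
            G = np.eye(n)
            G[i, i] = G[j, j] = np.cos(a)
            G[i, j], G[j, i] = -np.sin(a), np.sin(a)
            Q = Q @ G
        return Q

    def off_diag_energy(angles, A):
        B = rotation(angles, A.shape[0]).T @ A @ rotation(angles, A.shape[0])
        return np.sum(B**2) - np.sum(np.diag(B)**2)    # sum of squared off-diagonal entries

    rng = np.random.default_rng(4)
    n = 3
    M = rng.normal(size=(n, n))
    A = (M + M.T) / 2                                  # symmetric test matrix
    angles = 0.1 * rng.normal(size=n * (n - 1) // 2)
    eps, lr = 1e-6, 0.05
    for _ in range(1500):
        grad = np.array([(off_diag_energy(angles + eps * e, A) -
                          off_diag_energy(angles - eps * e, A)) / (2 * eps)
                         for e in np.eye(len(angles))])
        angles -= lr * grad
    print(np.sort(np.diag(rotation(angles, n).T @ A @ rotation(angles, n))))
    print(np.sort(np.linalg.eigvalsh(A)))   # should roughly agree if descent reached a diagonalizing minimum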

2005
Robert E. Mahony, Krzysztof A. Krakowski, Robert C. Williamson

We consider the intrinsic geometry of stochastic gradient descent (SG) algorithms. We show how to derive SG algorithms that fully respect an underlying geometry which can be induced by either prior knowledge in the form of a preferential structure or a generative model via the Fisher information metric. We show that using the geometrically motivated update and the “correct” loss function, the i...
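
As a concrete instance, the sketch below applies a Fisher-metric (natural-gradient) SGD update to logistic regression, where the Fisher information can be estimated on each mini-batch; the paper's framework is broader (covering preferential structures as well as the Fisher metric), so the model, damping, and step size here are illustrative assumptions.

    # Mini-batch natural-gradient SGD for logistic regression: the stochastic
    # gradient is preconditioned by a damped batch estimate of the Fisher information.
    import numpy as np

    rng = np.random.default_rng(5)
    n, dim = 2_000, 3
    X = rng.normal(size=(n, dim))
    w_true = np.array([1.5, -2.0, 0.5])
    y = (rng.random(n) < 1 / (1 + np.exp(-X @ w_true))).astype(float)

    w = np.zeros(dim)
    for _ in range(500):
        idx = rng.integers(0, n, size=32)                  # small mini-batch
        Xb, yb = X[idx], y[idx]
        p = 1 / (1 + np.exp(-Xb @ w))
        grad = Xb.T @ (p - yb) / len(idx)                  # stochastic gradient of the log loss
        # Fisher information estimated on the batch (equal to the expected Hessian for
        # logistic regression), damped so the solve stays well conditioned.
        fisher = (Xb * (p * (1 - p))[:, None]).T @ Xb / len(idx) + 1e-3 * np.eye(dim)
        w -= 0.5 * np.linalg.solve(fisher, grad)           # natural-gradient step
    print(w, w_true)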

Journal: CoRR 2011
Tomaso A. Poggio, Stephen Voinea, Lorenzo Rosasco

In batch learning, stability together with existence and uniqueness of the solution corresponds to well-posedness of Empirical Risk Minimization (ERM) methods; recently, it was proved that CV_loo stability is necessary and sufficient for generalization and consistency of ERM ([9]). In this note, we introduce CV_on stability, which plays a similar role in online learning. We show that stochastic g...

2011
Sangkyun Lee, Christian Bockermann

Stochastic gradient descent methods have been quite successful for solving large-scale and online learning problems. We provide a simple parallel framework to obtain solutions of high confidence, where the confidence can be easily controlled by the number of processes, independently of the length of learning processes. Our framework is implemented as a scalable open-source software which can be ...
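
The snippet does not describe the framework's aggregation rule or software in detail, so the sketch below only illustrates the general idea: several independent SGD processes run in parallel and their solutions are aggregated, with more processes yielding a more reliable aggregate.

    # Independent SGD runs executed in parallel processes, then aggregated.
    import numpy as np
    from concurrent.futures import ProcessPoolExecutor

    def sgd_run(seed, n=2_000, dim=5):
        rng = np.random.default_rng(seed)
        theta_true = np.arange(dim, dtype=float)
        w = np.zeros(dim)
        for _ in range(n):
            x = rng.normal(size=dim)
            y = x @ theta_true + 0.5 * rng.normal()
            w -= 0.01 * (x @ w - y) * x                # one stochastic gradient step
        return w

    if __name__ == "__main__":
        with ProcessPoolExecutor() as pool:
            runs = list(pool.map(sgd_run, range(8)))   # more processes -> tighter aggregate
        print(np.mean(runs, axis=0))                   # aggregate of the independent solutions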

Journal: JoWUA 2016
István Hegedüs, Árpád Berta, Márk Jelasity

Stochastic gradient descent (SGD) is one of the most widely applied machine learning algorithms in unreliable, large-scale decentralized environments. In this type of environment, data privacy is a fundamental concern. The most popular way to investigate this topic is based on the framework of differential privacy. However, many important implementation details and the performance of differentially priv...
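
The details of the decentralized protocol are not in the snippet; the sketch below shows only the standard differentially private SGD step that such work typically builds on (per-example gradient clipping plus Gaussian noise), with an illustrative clip norm and noise multiplier.

    # One differentially private SGD step in the usual central model:
    # clip each per-example gradient, sum, add Gaussian noise, and average.
    import numpy as np

    def dp_sgd_step(w, per_example_grads, lr, clip_norm=1.0, noise_multiplier=1.0, rng=None):
        rng = rng or np.random.default_rng()
        clipped = []
        for g in per_example_grads:
            scale = min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
            clipped.append(g * scale)                            # bound each example's influence
        noisy_sum = np.sum(clipped, axis=0) + rng.normal(
            scale=noise_multiplier * clip_norm, size=w.shape)    # Gaussian mechanism
        return w - lr * noisy_sum / len(per_example_grads)

    # Toy usage on random per-example gradients (purely illustrative).
    rng = np.random.default_rng(6)
    w = np.zeros(4)
    grads = rng.normal(size=(16, 4))
    w = dp_sgd_step(w, grads, lr=0.1, rng=rng)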
