Search results for: stochastic gradient descent

Number of results: 258,150

Journal: CoRR 2017
Xiangru Lian, Wei Zhang, Ce Zhang, Ji Liu

Recent work shows that decentralized parallel stochastic gradient descent (D-PSGD) can outperform its centralized counterpart both theoretically and practically. While asynchronous parallelism is a powerful technique for improving the efficiency of parallelism in distributed machine learning platforms and has been widely used in many popular machine learning software packages and solvers based on centrali...
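
For intuition, here is a minimal sketch of one D-PSGD iteration on a toy least-squares problem, assuming a fixed ring topology and a doubly stochastic mixing matrix; the worker count, step size, and mixing weights are illustrative choices, not values from the paper.

    # Minimal D-PSGD-style sketch: each worker averages with its neighbors,
    # then takes a local stochastic gradient step.
    import numpy as np

    def dpsgd_step(params, grads, mixing, lr):
        """params: (n_workers, dim) local models; grads: matching stochastic gradients;
        mixing: doubly stochastic matrix encoding the communication graph."""
        return mixing @ params - lr * grads

    rng = np.random.default_rng(0)
    n_workers, dim = 4, 3
    # Ring topology: each worker mixes equally with itself and its two neighbors.
    mixing = np.zeros((n_workers, n_workers))
    for i in range(n_workers):
        mixing[i, i] = mixing[i, (i - 1) % n_workers] = mixing[i, (i + 1) % n_workers] = 1 / 3

    A, b = rng.normal(size=(32, dim)), rng.normal(size=32)
    params = rng.normal(size=(n_workers, dim))
    for _ in range(200):
        idx = rng.integers(0, 32, size=n_workers)          # one random sample per worker
        grads = np.stack([(A[i] @ p - b[i]) * A[i] for i, p in zip(idx, params)])
        params = dpsgd_step(params, grads, mixing, lr=0.05)
    print(params.mean(axis=0))   # local models drift toward consensus near the least-squares solution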

Journal: Journal of Machine Learning Research 2017
Stephan Mandt, Matthew D. Hoffman, David M. Blei

Stochastic Gradient Descent with a constant learning rate (constant SGD) simulates a Markov chain with a stationary distribution. With this perspective, we derive several new results. (1) We show that constant SGD can be used as an approximate Bayesian posterior inference algorithm. Specifically, we show how to adjust the tuning parameters of constant SGD to best match the stationary distributi...
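
A small numerical sketch of this Markov-chain view, assuming a one-dimensional quadratic loss with Gaussian gradient noise (for which the stationary variance has a closed form), is given below; the step size and noise level are illustrative, not values from the paper.

    # Constant-learning-rate SGD on f(theta) = theta**2 / 2 with Gaussian gradient noise
    # behaves like a Markov chain; after burn-in the iterates follow a stationary distribution.
    import numpy as np

    rng = np.random.default_rng(1)
    eta, noise_sd, steps = 0.1, 1.0, 50_000
    theta, samples = 5.0, []
    for t in range(steps):
        grad = theta + noise_sd * rng.normal()   # noisy gradient of the quadratic loss
        theta -= eta * grad                      # constant-learning-rate SGD update
        if t > 1_000:                            # discard burn-in, keep the stationary phase
            samples.append(theta)

    # For this linear-Gaussian chain the stationary variance is eta * noise_sd**2 / (2 - eta).
    print(np.var(samples), eta * noise_sd**2 / (2 - eta))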

Journal: CoRR 2015
Panos Toulis, Dustin Tran, Edoardo M. Airoldi

Iterative procedures for parameter estimation based on stochastic gradient descent allow the estimation to scale to massive data sets. However, in both theory and practice, they suffer from numerical instability. Moreover, they are statistically inefficient as estimators of the true parameter value. To address these two issues, we propose a new iterative procedure termed AISGD. For statistical ...
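
As a rough illustration, the sketch below runs implicit SGD with Polyak-Ruppert averaging on a least-squares problem, where the implicit update has a closed form; the paper's AISGD procedure may differ in its step-size schedule and averaging details, so treat this only as a sketch of the idea.

    # Averaged implicit SGD for least squares: the implicit update
    # theta_t = theta_{t-1} - gamma * grad(theta_t) can be solved in closed form.
    import numpy as np

    rng = np.random.default_rng(2)
    n, dim = 5_000, 4
    theta_true = rng.normal(size=dim)
    X = rng.normal(size=(n, dim))
    y = X @ theta_true + 0.1 * rng.normal(size=n)

    theta = np.zeros(dim)
    theta_bar = np.zeros(dim)
    for t in range(n):
        x, target = X[t], y[t]
        gamma = 1.0 / (1 + t) ** 0.6                            # decaying step size
        # Closed-form implicit update for the squared loss on one example.
        theta = theta + gamma * (target - x @ theta) * x / (1 + gamma * (x @ x))
        theta_bar += (theta - theta_bar) / (t + 1)              # Polyak-Ruppert averaging
    print(np.linalg.norm(theta_bar - theta_true))               # small residual error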

Journal: CoRR 2017
Ilja Kuzborskij, Christoph H. Lampert

We establish a data-dependent notion of algorithmic stability for Stochastic Gradient Descent (SGD) and employ it to develop novel generalization bounds. This is in contrast to previous distribution-free algorithmic stability results for SGD which depend on the worst-case constants. By virtue of the data-dependent argument, our bounds provide new insights into learning with SGD on convex and non...
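
For context, the distribution-free notion the abstract contrasts with is uniform stability; a standard statement (not the paper's data-dependent variant) is that an algorithm A is epsilon-uniformly stable if, for all datasets S and S' differing in a single example,

    \sup_{z}\; \mathbb{E}_{A}\bigl[\ell(A(S); z) - \ell(A(S'); z)\bigr] \le \varepsilon ,

and uniform stability implies the in-expectation generalization bound

    \bigl|\, \mathbb{E}_{S,A}\bigl[ R(A(S)) - \hat{R}_S(A(S)) \bigr] \,\bigr| \le \varepsilon .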

1996
W. B. Gong

This paper proves convergence of a sample-path based stochastic gradient-descent algorithm for optimizing expected-value performance measures in discrete event systems. The algorithm uses increasing precision at successive iterations, and it moves against the direction of a generalized gradient of the computed sample performance function. Two convergence results are established: one, for the ca...
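
The "increasing precision" idea can be sketched on a toy noisy objective: at iteration k the gradient is averaged over a growing number of sample paths, so the estimate's noise shrinks as the iterates progress. The objective, schedule, and step size below are illustrative stand-ins for the discrete event simulation studied in the paper.

    # Sample-path gradient descent with precision that grows with the iteration count.
    import numpy as np

    rng = np.random.default_rng(3)

    def sample_gradient(theta, n_paths):
        # Average n_paths noisy evaluations of the gradient of E[f(theta)] = (theta - 2)**2.
        return np.mean(2 * (theta - 2) + rng.normal(size=n_paths))

    theta, lr = 10.0, 0.05
    for k in range(1, 201):
        n_paths = k                                  # precision increases with the iteration
        theta -= lr * sample_gradient(theta, n_paths)
    print(theta)                                     # approaches the minimizer theta = 2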

1998
Vivek K Goyal, Martin Vetterli

The problem of computing the eigendecomposition of an N × N symmetric matrix is cast as an unconstrained minimization of either of two performance measures. The K = N(N−1)/2 independent parameters represent angles of distinct Givens rotations. Gradient descent is applied to the minimization problem, step size bounds for local convergence are given, and similarities to LMS adaptive filtering are n...
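
A compact sketch of the parameterization is given below: an orthogonal matrix built from K = N(N−1)/2 Givens angles, with numerical gradient descent on the off-diagonal energy of the rotated matrix. The paper derives analytic gradients and step-size bounds; this sketch only illustrates the setup on a small example.

    # Eigendecomposition as unconstrained minimization over Givens rotation angles.
    import numpy as np
    from itertools import combinations

    def rotation(angles, n):
        Q = np.eye(n)
        for (i, j), a in zip(combinations(range(n), 2), angles):
            G = np.eye(n)
            G[i, i] = G[j, j] = np.cos(a)
            G[i, j], G[j, i] = -np.sin(a), np.sin(a)
            Q = Q @ G
        return Q

    def off_diag_energy(angles, A):
        B = rotation(angles, A.shape[0]).T @ A @ rotation(angles, A.shape[0])
        return np.sum(B**2) - np.sum(np.diag(B)**2)    # sum of squared off-diagonal entries

    rng = np.random.default_rng(4)
    n = 3
    M = rng.normal(size=(n, n))
    A = (M + M.T) / 2                                  # symmetric test matrix
    angles = 0.1 * rng.normal(size=n * (n - 1) // 2)
    eps, lr = 1e-6, 0.05
    for _ in range(1500):
        grad = np.array([(off_diag_energy(angles + eps * e, A) -
                          off_diag_energy(angles - eps * e, A)) / (2 * eps)
                         for e in np.eye(len(angles))])
        angles -= lr * grad
    print(np.sort(np.diag(rotation(angles, n).T @ A @ rotation(angles, n))))
    print(np.sort(np.linalg.eigvalsh(A)))   # should roughly agree if descent reached a diagonalizing minimum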

2005
Robert E. Mahony, Krzysztof A. Krakowski, Robert C. Williamson

We consider the intrinsic geometry of stochastic gradient descent (SG) algorithms. We show how to derive SG algorithms that fully respect an underlying geometry which can be induced by either prior knowledge in the form of a preferential structure or a generative model via the Fisher information metric. We show that using the geometrically motivated update and the “correct” loss function, the i...
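
As a concrete instance, the sketch below applies a Fisher-metric (natural-gradient) SGD update to logistic regression, where the Fisher information can be estimated on each mini-batch; the paper's framework is broader (covering preferential structures as well as the Fisher metric), so the model, damping, and step size here are illustrative assumptions.

    # Mini-batch natural-gradient SGD for logistic regression: the stochastic
    # gradient is preconditioned by a damped batch estimate of the Fisher information.
    import numpy as np

    rng = np.random.default_rng(5)
    n, dim = 2_000, 3
    X = rng.normal(size=(n, dim))
    w_true = np.array([1.5, -2.0, 0.5])
    y = (rng.random(n) < 1 / (1 + np.exp(-X @ w_true))).astype(float)

    w = np.zeros(dim)
    for _ in range(500):
        idx = rng.integers(0, n, size=32)                  # small mini-batch
        Xb, yb = X[idx], y[idx]
        p = 1 / (1 + np.exp(-Xb @ w))
        grad = Xb.T @ (p - yb) / len(idx)                  # stochastic gradient of the log loss
        # Fisher information estimated on the batch (equal to the expected Hessian for
        # logistic regression), damped so the solve stays well conditioned.
        fisher = (Xb * (p * (1 - p))[:, None]).T @ Xb / len(idx) + 1e-3 * np.eye(dim)
        w -= 0.5 * np.linalg.solve(fisher, grad)           # natural-gradient step
    print(w, w_true)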

Journal: CoRR 2011
Tomaso A. Poggio, Stephen Voinea, Lorenzo Rosasco

In batch learning, stability together with existence and uniqueness of the solution corresponds to well-posedness of Empirical Risk Minimization (ERM) methods; recently, it was proved that CV_loo stability is necessary and sufficient for generalization and consistency of ERM ([9]). In this note, we introduce CV_on stability, which plays a similar role in online learning. We show that stochastic g...

2011
Sangkyun Lee, Christian Bockermann

Stochastic gradient descent methods have been quite successful for solving large-scale and online learning problems. We provide a simple parallel framework to obtain solutions of high confidence, where the confidence can be easily controlled by the number of processes, independently of the length of learning processes. Our framework is implemented as a scalable open-source software which can be ...
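
The snippet does not describe the framework's aggregation rule or software in detail, so the sketch below only illustrates the general idea: several independent SGD processes run in parallel and their solutions are aggregated, with more processes yielding a more reliable aggregate.

    # Independent SGD runs executed in parallel processes, then aggregated.
    import numpy as np
    from concurrent.futures import ProcessPoolExecutor

    def sgd_run(seed, n=2_000, dim=5):
        rng = np.random.default_rng(seed)
        theta_true = np.arange(dim, dtype=float)
        w = np.zeros(dim)
        for _ in range(n):
            x = rng.normal(size=dim)
            y = x @ theta_true + 0.5 * rng.normal()
            w -= 0.01 * (x @ w - y) * x                # one stochastic gradient step
        return w

    if __name__ == "__main__":
        with ProcessPoolExecutor() as pool:
            runs = list(pool.map(sgd_run, range(8)))   # more processes -> tighter aggregate
        print(np.mean(runs, axis=0))                   # aggregate of the independent solutions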

Journal: JoWUA 2016
István Hegedüs, Árpád Berta, Márk Jelasity

Stochastic gradient descent (SGD) is one of the most widely applied machine learning algorithms in unreliable, large-scale decentralized environments. In this type of environment, data privacy is a fundamental concern. The most popular way to investigate this topic is based on the framework of differential privacy. However, many important implementation details and the performance of differentially priv...
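
The details of the decentralized protocol are not in the snippet; the sketch below shows only the standard differentially private SGD step that such work typically builds on (per-example gradient clipping plus Gaussian noise), with an illustrative clip norm and noise multiplier.

    # One differentially private SGD step in the usual central model:
    # clip each per-example gradient, sum, add Gaussian noise, and average.
    import numpy as np

    def dp_sgd_step(w, per_example_grads, lr, clip_norm=1.0, noise_multiplier=1.0, rng=None):
        rng = rng or np.random.default_rng()
        clipped = []
        for g in per_example_grads:
            scale = min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
            clipped.append(g * scale)                            # bound each example's influence
        noisy_sum = np.sum(clipped, axis=0) + rng.normal(
            scale=noise_multiplier * clip_norm, size=w.shape)    # Gaussian mechanism
        return w - lr * noisy_sum / len(per_example_grads)

    # Toy usage on random per-example gradients (purely illustrative).
    rng = np.random.default_rng(6)
    w = np.zeros(4)
    grads = rng.normal(size=(16, 4))
    w = dp_sgd_step(w, grads, lr=0.1, rng=rng)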
