stochastic gradient descent

Search results for: stochastic gradient descent

Number of results: 258150

Conditional Accelerated Lazy Stochastic Gradient Descent

2017

Guanghui Lan, Sebastian Pokutta, Yi Zhou, Daniel Zink

In this work we introduce a conditional accelerated lazy stochastic gradient descent algorithm with optimal number of calls to a stochastic first-order oracle and convergence rate O(1/ε^2), improving over the projection-free, Online Frank-Wolfe based stochastic gradient descent of Hazan and Kale [2012] with convergence rate O(1/ε^4).

AG-SGD: Angle-Based Stochastic Gradient Descent

Journal: IEEE Access 2021

Semi-Stochastic Gradient Descent Methods

Journal: CoRR 2017

Jakub Konecný, Peter Richtárik

In this paper we study the problem of minimizing the average of a large number (n) of smooth convex loss functions. We propose a new method, S2GD (Semi-Stochastic Gradient Descent), which runs for one or several epochs, in each of which a single full gradient and a random number of stochastic gradients are computed, following a geometric law. The total work needed for the method to output an ε-ac...
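
The epoch structure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `s2gd_sketch`, its parameters, the scalar toy problem, and the specific form of the geometric weights (proportional to `(1 - nu * step) ** (max_inner - t)`) are all assumptions made here for illustration.

```python
import random


def s2gd_sketch(grad_i, n, x0, step, epochs, max_inner, nu=0.0, seed=0):
    """Illustrative semi-stochastic gradient loop for a scalar variable.

    grad_i(i, x) returns the gradient of the i-th loss at x.
    All names and defaults here are hypothetical.
    """
    rng = random.Random(seed)
    lengths = range(1, max_inner + 1)
    # Assumed geometric law over the inner-loop length t = 1..max_inner.
    weights = [(1.0 - nu * step) ** (max_inner - t) for t in lengths]
    x = x0
    for _ in range(epochs):
        # One full gradient per epoch, evaluated at the epoch's anchor x.
        full_grad = sum(grad_i(i, x) for i in range(n)) / n
        # Random number of stochastic steps for this epoch.
        inner_steps = rng.choices(lengths, weights=weights)[0]
        y = x
        for _ in range(inner_steps):
            i = rng.randrange(n)
            # Full gradient corrected by a single sampled loss.
            direction = full_grad + grad_i(i, y) - grad_i(i, x)
            y -= step * direction
        x = y
    return x


# Toy problem (hypothetical): f_i(x) = 0.5 * (x - a_i)^2, minimized at mean(a).
targets = [1.0, 2.0, 3.0, 4.0]
estimate = s2gd_sketch(lambda i, x: x - targets[i], n=len(targets),
                       x0=0.0, step=0.1, epochs=50, max_inner=20, nu=1.0)
print(estimate)
```

In this toy problem every loss has the same curvature, so the correction term cancels the sampling noise exactly and the iterate moves steadily toward the mean of `targets`. Losses with differing curvature would leave some residual noise within each epoch.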

Asynchronous Accelerated Stochastic Gradient Descent

2016

Qi Meng, Wei Chen, Jingcheng Yu, Taifeng Wang, Zhiming Ma, Tie-Yan Liu

Stochastic gradient descent (SGD) is a widely used optimization algorithm in machine learning. In order to accelerate the convergence of SGD, a few advanced techniques have been developed in recent years, including variance reduction, stochastic coordinate sampling, and Nesterov’s acceleration method. Furthermore, in order to improve the training speed and/or leverage larger-scale training data...

On the insufficiency of existing momentum schemes for Stochastic Optimization

2018

Rahul Kidambi, Praneeth Netrapalli, Prateek Jain, Sham M. Kakade

Momentum based stochastic gradient methods such as heavy ball (HB) and Nesterov’s accelerated gradient descent (NAG) method are widely used in practice for training deep networks and other supervised learning models, as they often provide significant improvements over stochastic gradient descent (SGD). Rigorously speaking, “fast gradient” methods have provable improvements over gradient descent...

On the Insufficiency of Existing Momentum Schemes for Stochastic Optimization

2018

Momentum based stochastic gradient methods such as heavy ball (HB) and Nesterov’s accelerated gradient descent (NAG) method are widely used in practice for training deep networks and other supervised learning models, as they often provide significant improvements over stochastic gradient descent (SGD). In general, “fast gradient” methods have provable improvements over gradient descent only for...

Adaptativity of Stochastic Gradient Descent

2015

Aymeric Dieuleveut, Francis Bach

We consider the random-design least-squares regression problem within the reproducing kernel Hilbert space (RKHS) framework. Given a stream of independent and identically distributed input/output data, we aim to learn a regression function within an RKHS H, even if the optimal predictor (i.e., the conditional expectation) is not in H. In a stochastic approximation framework where the estimator ...

Parle: parallelizing stochastic gradient descent

Journal: CoRR 2017

Pratik Chaudhari, Carlo Baldassi, Riccardo Zecchina, Stefano Soatto, Ameet Talwalkar

We propose a new algorithm called Parle for parallel training of deep networks that converges 2-4× faster than a data-parallel implementation of SGD, while achieving significantly improved error rates that are nearly state-of-the-art on several benchmarks including CIFAR-10 and CIFAR-100, without introducing any additional hyper-parameters. We exploit the phenomenon of flat minima that has been...

Stochastic Gradient Descent with GPGPU

2012

David Zastrau, Stefan Edelkamp

We show how to optimize a Support Vector Machine and a predictor for Collaborative Filtering with Stochastic Gradient Descent on the GPU, achieving speedups of 1.66× to 6× compared to a CPU-based implementation. The reference implementations are the Support Vector Machine by Bottou and the BRISMF predictor from the Netflix Prize winning team. Our main idea is to create a hash function of ...

Nonparametric Budgeted Stochastic Gradient Descent

2016

Trung Le, Vu Nguyen, Tu Dinh Nguyen, Dinh Q. Phung

One of the most challenging problems in kernel online learning is to bound the model size. Budgeted kernel online learning addresses this issue by bounding the model size to a predefined budget. However, determining an appropriate value for such predefined budget is arduous. In this paper, we propose the Nonparametric Budgeted Stochastic Gradient Descent that allows the model size to automatica...
