Search results for: gradient descent

Number of results: 137892

Journal: CoRR 2017
Jakub Konecný, Peter Richtárik

In this paper we study the problem of minimizing the average of a large number (n) of smooth convex loss functions. We propose a new method, S2GD (Semi-Stochastic Gradient Descent), which runs for one or several epochs in each of which a single full gradient and a random number of stochastic gradients is computed, following a geometric law. The total work needed for the method to output an ε-ac...
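The epoch structure the abstract describes (one full gradient, then a geometrically distributed number of variance-reduced stochastic steps) can be sketched as follows. This is a minimal illustration of the idea, not the paper's implementation; the step size `h`, cap `max_inner`, and geometric parameter `p` are illustrative choices.

```python
import numpy as np

def s2gd(grad_i, full_grad, x0, n, epochs=5, h=0.02, max_inner=50, p=0.05):
    """Semi-stochastic gradient descent (S2GD) sketch.

    Per epoch: one full gradient at the anchor point y, then a random
    (geometrically distributed) number of variance-reduced stochastic steps.
    """
    x = x0.copy()
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        y = x.copy()
        g = full_grad(y)                      # one full gradient per epoch
        t = min(rng.geometric(p), max_inner)  # geometric number of inner steps
        for _ in range(t):
            i = rng.integers(n)
            # variance-reduced stochastic gradient at x, anchored at y
            v = grad_i(x, i) - grad_i(y, i) + g
            x -= h * v
    return x
```

On a smooth convex problem such as least squares, `grad_i` is the gradient of a single loss term and `full_grad` averages all of them; the anchor gradient `g` keeps the variance of the inner updates small near `y`.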

1995
Michael Biehl

We study online gradient-descent learning in multilayer networks analytically and numerically. The training is based on randomly drawn inputs and their corresponding outputs as defined by a target rule. In the thermodynamic limit we derive deterministic differential equations for the order parameters of the problem which allow an exact calculation of the evolution of the generalization error. Fi...

2016
Qi Meng, Wei Chen, Jingcheng Yu, Taifeng Wang, Zhiming Ma, Tie-Yan Liu

Stochastic gradient descent (SGD) is a widely used optimization algorithm in machine learning. In order to accelerate the convergence of SGD, a few advanced techniques have been developed in recent years, including variance reduction, stochastic coordinate sampling, and Nesterov’s acceleration method. Furthermore, in order to improve the training speed and/or leverage larger-scale training data...
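Of the acceleration techniques listed, Nesterov's method is the simplest to show alongside plain SGD. The sketch below is generic, not the authors' algorithm; the learning rate `lr` and momentum `mu` are illustrative defaults.

```python
import numpy as np

def sgd_nesterov(grad_i, x0, n, steps=3000, lr=0.005, mu=0.9, seed=0):
    """SGD with Nesterov momentum: evaluate the stochastic gradient at a
    look-ahead point x + mu*v before taking the momentum step."""
    x, v = x0.copy(), np.zeros_like(x0)
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        i = rng.integers(n)
        g = grad_i(x + mu * v, i)   # gradient at the look-ahead point
        v = mu * v - lr * g
        x = x + v
    return x
```

The only change from plain momentum SGD is where the gradient is evaluated: at the extrapolated point rather than at the current iterate.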

Journal: Information processing in medical imaging : proceedings of the ... conference 2003
Zhong Tao, C. Carl Jaffe, Hemant D. Tagare

The presence of speckle in ultrasound images makes it hard to segment them using active contours. Speckle causes the energy function of the active contours to have many local minima, and the gradient descent procedure used for evolving the contour gets trapped in these minima. A new algorithm, called tunnelling descent, is proposed in this paper for evolving active contours. Tunnelling descent ...

Hamed Memarian fard

The use of artificial neural networks has increased in many areas of engineering. In particular, this method has been applied to many geotechnical engineering problems and demonstrated some degree of success. A review of the literature reveals that it has been used successfully in modeling soil behavior, site characterization, earth retaining structures, settlement of structures, slope stabilit...

Journal: CoRR 2017
Pratik Chaudhari, Carlo Baldassi, Riccardo Zecchina, Stefano Soatto, Ameet Talwalkar

We propose a new algorithm called Parle for parallel training of deep networks that converges 2-4× faster than a data-parallel implementation of SGD, while achieving significantly improved error rates that are nearly state-of-the-art on several benchmarks including CIFAR-10 and CIFAR-100, without introducing any additional hyper-parameters. We exploit the phenomenon of flat minima that has been...

1999
Llew Mason, Jonathan Baxter, Peter L. Bartlett, Marcus R. Frean

Much recent attention, both experimental and theoretical, has been focussed on classification algorithms which produce voted combinations of classifiers. Recent theoretical work has shown that the impressive generalization performance of algorithms like AdaBoost can be attributed to the classifier having large margins on the training data. We present an abstract algorithm for finding linear combinati...
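The abstract algorithm alluded to views boosting as gradient descent in function space: at each round, fit a weak learner to the negative functional gradient of a margin cost and add it to the ensemble. A minimal sketch of that idea, where all concrete choices (decision stumps, the exponential cost, a fixed step `alpha`) are ours, not the paper's:

```python
import numpy as np

def boost_fgd(X, y, rounds=10, alpha=0.5):
    """Functional-gradient boosting sketch with decision stumps and the
    exponential cost C(m) = exp(-m) on the margin m = y*F(x)."""
    n = len(y)
    F = np.zeros(n)            # current ensemble output on the training set
    model = []
    for _ in range(rounds):
        w = np.exp(-y * F)     # -C'(margin): weight on each example
        best, best_score = None, -np.inf
        for j in range(X.shape[1]):
            for thr in np.unique(X[:, j]):
                for s in (1, -1):
                    h = s * np.sign(X[:, j] - thr + 1e-12)
                    score = np.sum(w * y * h)   # alignment with -gradient
                    if score > best_score:
                        best, best_score = (j, thr, s), score
        j, thr, s = best
        F += alpha * s * np.sign(X[:, j] - thr + 1e-12)
        model.append(best)
    return model, F
```

With the exponential cost and a line search instead of the fixed step, this family of updates recovers AdaBoost-style weighting of hard examples.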

2007
Yiming Ying, Massimiliano Pontil

This paper considers the least-square online gradient descent algorithm in a reproducing kernel Hilbert space (RKHS) without an explicit regularization term. We present a novel capacity independent approach to derive error bounds and convergence results for this algorithm. The essential element in our analysis is the interplay between the generalization error and a weighted cumulative error whi...
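The unregularized online update in an RKHS has a simple kernel-expansion form: after observing (x_t, y_t), set f ← f − η_t (f(x_t) − y_t) K(x_t, ·), which just appends one coefficient per sample. A sketch under our own illustrative choices (an RBF kernel and a 1/√t step-size decay), not the paper's analysis:

```python
import numpy as np

def online_kernel_gd(stream, kernel, eta=0.5):
    """Least-square online gradient descent in an RKHS, no regularization:
    f is stored as a growing kernel expansion sum_t c_t K(x_t, .)."""
    pts, coef = [], []
    def f(x):
        return sum(c * kernel(p, x) for p, c in zip(pts, coef))
    for t, (x, y) in enumerate(stream, start=1):
        step = eta / np.sqrt(t)          # decaying step size eta_t
        pts.append(x)
        coef.append(-step * (f(x) - y))  # f <- f - step*(f(x)-y) K(x, .)
    return f
```

Each sample contributes exactly one term, so the predictor's cost grows linearly with the number of observations; capacity-independent bounds control the error of exactly this kind of expansion.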

2012
David Zastrau, Stefan Edelkamp

We show how to optimize a Support Vector Machine and a predictor for Collaborative Filtering with Stochastic Gradient Descent on the GPU, achieving 1.66× to 6× speedups compared to a CPU-based implementation. The reference implementations are the Support Vector Machine by Bottou and the BRISMF predictor from the Netflix Prize winning team. Our main idea is to create a hash function of ...
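The SVM half of the reference setup, SGD on the L2-regularized hinge loss, fits in a few lines on the CPU. This is a generic Pegasos-style sketch in the spirit of Bottou's SGD reference implementation, not the authors' GPU port; `lam` and the 1/(λt) schedule are standard illustrative choices.

```python
import numpy as np

def svm_sgd(X, y, lam=0.01, epochs=20, seed=0):
    """Linear SVM trained by SGD on hinge loss + L2 regularization.
    Labels y must be in {-1, +1}."""
    n, d = X.shape
    w = np.zeros(d)
    rng = np.random.default_rng(seed)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)      # Pegasos-style 1/(lam*t) schedule
            w *= (1 - eta * lam)       # shrink from the L2 term
            if y[i] * (X[i] @ w) < 1:  # hinge-loss subgradient is active
                w += eta * y[i] * X[i]
    return w
```

Each update touches one example, which is exactly the access pattern that maps well onto many parallel GPU threads when examples are batched.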

2004
Kary Främling

Adaptive behaviour through machine learning is challenging in many real-world applications such as robotics. This is because learning has to be rapid enough to be performed in real time and to avoid damage to the robot. Models using linear function approximation are interesting in such tasks because they offer rapid learning and have small memory and processing requirements. Adalines are a simp...
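The Adaline the abstract introduces is a linear unit trained with the LMS (Widrow-Hoff, or "delta") rule, which is what makes it cheap enough for the real-time constraints described. A generic sketch, with `lr` and `epochs` as illustrative settings:

```python
import numpy as np

def adaline_lms(X, y, lr=0.05, epochs=100):
    """Adaline trained with the LMS / delta rule: the weight update is
    proportional to the error of the raw linear output."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = yi - xi @ w      # linear output, before any thresholding
            w += lr * err * xi     # delta rule update
    return w
```

The per-sample cost is one dot product and one vector update, so memory and processing requirements stay small, as the abstract notes.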

Chart of the number of search results per year

Click on the chart to filter the results by publication year