Search results for: gradient descent

Number of results: 137892

Journal: CoRR 2016
Sebastian Ruder

Gradient descent optimization algorithms, while increasingly popular, are often used as black-box optimizers, as practical explanations of their strengths and weaknesses are hard to come by. This article aims to provide the reader with intuitions with regard to the behaviour of different algorithms that will allow her to put them to use. In the course of this overview, we look at different vari...
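
As a point of reference for the variants such overviews discuss, here is a minimal sketch of two basic update rules (plain gradient descent and momentum). The toy quadratic objective and function names are illustrative assumptions, not taken from the article.

import numpy as np

def grad(w):
    # Toy objective f(w) = ||w||^2 / 2, so the gradient is simply w (illustrative assumption).
    return w

def sgd_step(w, g, lr=0.1):
    # Vanilla (stochastic) gradient descent: step against the gradient.
    return w - lr * g

def momentum_step(w, g, v, lr=0.1, beta=0.9):
    # Momentum: accumulate an exponentially decaying average of past gradients.
    v = beta * v + lr * g
    return w - v, v

w_sgd = np.ones(3)
w_mom, v = np.ones(3), np.zeros(3)
for _ in range(100):
    w_sgd = sgd_step(w_sgd, grad(w_sgd))
    w_mom, v = momentum_step(w_mom, grad(w_mom), v)
print(w_sgd, w_mom)  # both move toward the minimizer at the origin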

1999
Udo Seiffert

Backpropagation is the standard training procedure for Multiple Layer Perceptron networks. It is based on gradient descent to minimize the network error. However, using the gradient descent algorithm leads to problems with the convergence of training as well as to restrictions on the applicable transfer functions. This paper describes a complete substitution of the grad...
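
For context, a minimal NumPy sketch of the gradient descent weight update that backpropagation performs for a single-hidden-layer perceptron. The network size, sigmoid transfer function, squared-error loss, and learning rate are assumptions chosen for illustration, not the paper's setup.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))          # toy inputs (assumed)
y = rng.normal(size=(32, 1))          # toy targets (assumed)
W1, W2 = rng.normal(size=(4, 8)), rng.normal(size=(8, 1))
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(500):
    # Forward pass
    h = sigmoid(X @ W1)
    out = h @ W2
    err = (out - y) / len(X)          # gradient of the mean squared error w.r.t. out
    # Backward pass: the chain rule gives the gradient layer by layer
    dW2 = h.T @ err
    dW1 = X.T @ ((err @ W2.T) * h * (1 - h))
    # Gradient descent update
    W1 -= lr * dW1
    W2 -= lr * dW2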

2017
Chao Ma

Traditional learning algorithms based on gradient descent techniques, such as back-propagation (BP) and its variant Levenberg-Marquardt (LM), have been widely used in the training of multilayer feedforward neural networks. Gradient descent based algorithms often converge more slowly than desired during training, since such learning algorithms require many iterative learning steps, and...
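
As a rough illustration of the LM variant mentioned above, the Levenberg-Marquardt step blends gradient descent with a Gauss-Newton step. The sketch below, with a made-up linear residual function, shows the generic update rule only and is an assumption, not this paper's algorithm.

import numpy as np

def lm_step(w, residuals, jacobian, mu=1e-2):
    # Levenberg-Marquardt update: solve (J^T J + mu I) delta = -J^T r.
    # Large mu behaves like gradient descent; small mu approaches Gauss-Newton.
    r = residuals(w)
    J = jacobian(w)
    A = J.T @ J + mu * np.eye(len(w))
    delta = np.linalg.solve(A, -J.T @ r)
    return w + delta

# Toy least-squares problem (assumed): fit w so that A_toy @ w ~ b_toy
A_toy = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
b_toy = np.array([1.0, 2.0, 3.0])
w = np.zeros(2)
for _ in range(20):
    w = lm_step(w, lambda w: A_toy @ w - b_toy, lambda w: A_toy)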

2013
Tianbao Yang

We present and study a distributed optimization algorithm by employing a stochastic dual coordinate ascent method. Stochastic dual coordinate ascent methods enjoy strong theoretical guarantees and often perform better than stochastic gradient descent methods in optimizing regularized loss minimization problems. However, they have received little study in a distributed framework. We ...
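
For intuition about dual coordinate methods, here is a minimal single-machine sketch for an L2-regularized linear SVM, updating one dual variable at a time while keeping the primal vector in sync. This is the standard single-machine update and an illustrative assumption, not the distributed algorithm the paper proposes.

import numpy as np

def dual_coordinate_svm(X, y, C=1.0, epochs=10, seed=0):
    # Optimize the SVM dual one coordinate alpha_i at a time,
    # maintaining w = sum_i alpha_i * y_i * x_i.
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)
    rng = np.random.default_rng(seed)
    for _ in range(epochs):
        for i in rng.permutation(n):
            g = y[i] * (X[i] @ w) - 1.0        # partial derivative of the dual objective
            q = X[i] @ X[i]
            if q > 0:
                new_alpha = np.clip(alpha[i] - g / q, 0.0, C)
                w += (new_alpha - alpha[i]) * y[i] * X[i]
                alpha[i] = new_alpha
    return w

# Tiny linearly separable toy data (assumed)
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = dual_coordinate_svm(X, y)
print(np.sign(X @ w))  # should match y on this separable toy set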

Journal: CoRR 2014
H. Brendan McMahan

We present tools for the analysis of Follow-The-Regularized-Leader (FTRL), Dual Averaging, and Mirror Descent algorithms when the regularizer (equivalently, prox-function or learning rate schedule) is chosen adaptively based on the data. Adaptivity can be used to prove regret bounds that hold on every round, and also allows for data-dependent regret bounds as in AdaGrad-style algorithms (e.g., O...
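
As one concrete instance of a data-adaptive learning rate schedule, here is the standard diagonal AdaGrad-style update; this sketch is general background and not specific to the analysis tools in the paper.

import numpy as np

def adagrad_step(w, g, h, lr=0.5, eps=1e-8):
    # Accumulate squared gradients per coordinate and scale each step
    # by the inverse square root of that running sum (diagonal AdaGrad).
    h = h + g * g
    w = w - lr * g / (np.sqrt(h) + eps)
    return w, h

w = np.array([5.0, -3.0])
h = np.zeros_like(w)
for _ in range(500):
    g = w                 # gradient of the toy objective ||w||^2 / 2 (assumed)
    w, h = adagrad_step(w, g, h)
print(w)                  # moves toward the origin with automatically shrinking steps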

Journal: CoRR 2016
Yan Wang, Ge Ou

We propose a novel support vector regression approach called e-Distance Weighted Support Vector Regression (e-DWSVR). e-DWSVR specifically addresses two challenging issues in support vector regression: first, the processing of noisy data; second, how to deal with the situation where the distribution of boundary data differs from that of the overall data. The proposed e-DWSVR optimizes the mini...

Journal: CoRR 2015
Andrew J. R. Simpson

Despite the promise of brain-inspired machine learning, deep neural networks (DNN) have frustratingly failed to bridge the deceptively large gap between learning and memory. Here, we introduce a Perpetual Learning Machine; a new type of DNN that is capable of brain-like dynamic ‘on the fly’ learning because it exists in a self-supervised state of Perpetual Stochastic Gradient Descent. Thus, we ...

Journal: CoRR 2015
Andrew J. R. Simpson

When training deep neural networks, it is typically assumed that the training examples are uniformly difficult to learn. Or, to restate, it is assumed that the training error will be uniformly distributed across the training examples. Based on these assumptions, each training example is used an equal number of times. However, this assumption may not be valid in many cases. “Oddball SGD” (novelt...
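
A hedged sketch of the general idea of using training examples a non-equal number of times, e.g. sampling each example with probability proportional to its current error. The exact novelty measure used by Oddball SGD is not given in this snippet, so this particular weighting scheme is an assumption for illustration only.

import numpy as np

def error_weighted_indices(per_example_errors, batch_size, rng):
    # Sample example indices with probability proportional to their current error,
    # so "odd" (hard) examples are revisited more often than easy ones.
    p = per_example_errors + 1e-12            # avoid zero probabilities
    p = p / p.sum()
    return rng.choice(len(p), size=batch_size, p=p)

rng = np.random.default_rng(0)
errors = np.array([0.05, 0.05, 0.9, 0.05])    # example 2 is currently hardest (toy values)
idx = error_weighted_indices(errors, batch_size=8, rng=rng)
print(idx)                                    # index 2 appears most frequently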

2014
Wenbin Cai, Ya Zhang, Siyuan Zhou, Wenquan Wang, Chris H. Q. Ding, Xiao Gu

Margin-based strategies and model change based strategies represent two important types of strategies for active learning. While margin-based strategies have been dominant for Support Vector Machines (SVMs), most methods are based on heuristics and lack solid theoretical support. In this paper, we propose an active learning strategy for SVMs based on Maximum Model Change (MMC). The model chan...
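
To make the notion of "model change" concrete, one common way to score a candidate is by the expected norm of the loss gradient it would contribute to the current model if labeled. The sketch below, for a linear model with a hinge-loss subgradient and a simple logistic label estimate, is an illustrative assumption and not the MMC criterion derived in the paper.

import numpy as np

def expected_model_change(x, w, labels=(-1.0, 1.0)):
    # Score a candidate by the expected norm of the gradient it would add
    # to a linear model, averaging over possible labels weighted by a
    # logistic probability estimate (all modeling choices assumed).
    f = w @ x
    p_pos = 1.0 / (1.0 + np.exp(-f))
    probs = {1.0: p_pos, -1.0: 1.0 - p_pos}
    score = 0.0
    for y in labels:
        grad = -y * x if y * f < 1.0 else np.zeros_like(x)   # hinge-loss subgradient
        score += probs[y] * np.linalg.norm(grad)
    return score

w = np.array([1.0, -0.5])
pool = [np.array([0.1, 0.1]), np.array([2.0, 2.0]), np.array([0.0, 3.0])]
best = max(range(len(pool)), key=lambda i: expected_model_change(pool[i], w))
print(best)   # index of the candidate expected to change the model most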

Journal: Electr. J. Comb. 2014
Jason P. Smith

The set of all permutations, ordered by pattern containment, is a poset. We give a formula for the Möbius function of intervals [1, π] in this poset, for any permutation π with at most one descent. We compute the Möbius function as a function of the number and positions of pairs of consecutive letters in π that are consecutive in value. As a result of this we show that the Möbius function is un...
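
For readers less familiar with the terminology, the Möbius function of a poset referred to here is the standard one, defined recursively as follows; this is general background, not the formula derived in the paper.

\mu(\sigma,\sigma) = 1, \qquad
\mu(\sigma,\pi) = -\sum_{\sigma \le \tau < \pi} \mu(\sigma,\tau) \quad \text{for } \sigma < \pi, \qquad
\mu(\sigma,\pi) = 0 \quad \text{if } \sigma \not\le \pi.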

Chart: number of search results per year
