Search results for: training iteration

Number of results: 358,779

2009
Huisheng Zhang Chao Zhang Wei Wu

The batch split-complex backpropagation (BSCBP) algorithm for training complex-valued neural networks is considered. For a constant learning rate, it is proved that the error function of the BSCBP algorithm is monotone during the training iteration process and that the gradient of the error function tends to zero. By adding a moderate condition, the weight sequence itself is also proved to be convergent. ...
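
For orientation, here is a minimal sketch of a batch split-complex gradient step of the kind the abstract analyzes, assuming a single complex-valued neuron with the split activation f(z) = tanh(Re z) + i·tanh(Im z); the names (w, xs, eta), the batch, and the target-generation step are illustrative, not the paper's setup.

```python
# A sketch of batch split-complex backpropagation for one complex neuron.
# Assumption: split activation f(z) = tanh(Re z) + i*tanh(Im z), squared error.
import numpy as np

def split_tanh(z):
    """Split activation: tanh applied separately to real and imaginary parts."""
    return np.tanh(z.real) + 1j * np.tanh(z.imag)

def bscbp_step(w, xs, ds, eta):
    """One batch step with constant learning rate eta: accumulate the gradient
    of E = 0.5 * sum |f(w*x) - d|^2 w.r.t. Re(w) and Im(w) over the batch."""
    grad_re = grad_im = 0.0
    for x, d in zip(xs, ds):
        z = w * x
        e = split_tanh(z) - d                      # complex output error
        g_re, g_im = 1 - np.tanh(z.real) ** 2, 1 - np.tanh(z.imag) ** 2
        # Chain rule through Re(z) = Re(w)Re(x) - Im(w)Im(x),
        #                    Im(z) = Re(w)Im(x) + Im(w)Re(x).
        grad_re += e.real * g_re * x.real + e.imag * g_im * x.imag
        grad_im += -e.real * g_re * x.imag + e.imag * g_im * x.real
    return w - eta * (grad_re + 1j * grad_im)

xs = [0.5 + 0.2j, -0.3 + 0.8j]
ds = [split_tanh((0.7 - 0.4j) * x) for x in xs]    # targets from a "true" weight
w = 0.1 + 0.1j
for _ in range(200):
    w = bscbp_step(w, xs, ds, eta=0.5)
print(w)                                           # approaches 0.7 - 0.4j
```

With a small enough constant eta, the batch error decreases monotonically along these iterations, which is the property the abstract establishes.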

In this paper, we propose a new iterative method for finding the solution of first-order ordinary differential equations. In this method we extend the idea of the variational iteration method by changing the general Lagrange multiplier, which is defined in the context of the variational iteration method. This increases the convergence rate of the method compared with the var...
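
As context for the result above, here is a minimal symbolic sketch of the classical variational iteration method for u' = f(t, u), u(0) = u0, with the standard general Lagrange multiplier λ(s) = -1 for first-order problems; the paper's contribution is a modified multiplier, which is not reproduced here. The example problem u' = u is illustrative.

```python
# Classical VIM correction functional for u' = f(t, u) with multiplier -1:
#   u_{n+1}(t) = u_n(t) - \int_0^t [u_n'(s) - f(s, u_n(s))] ds
import sympy as sp

t, s = sp.symbols("t s")

def vim_step(u_n, f):
    """Apply one variational-iteration correction to the current approximant."""
    u_s = u_n.subs(t, s)
    residual = sp.diff(u_s, s) - f(s, u_s)
    return sp.expand(u_n - sp.integrate(residual, (s, 0, t)))

u = sp.Integer(1)                       # u_0(t) = u(0) = 1 for u' = u, u(0) = 1
for _ in range(4):
    u = vim_step(u, lambda s_, u_: u_)  # f(t, u) = u
print(u)                                # 1 + t + t**2/2 + t**3/6 + t**4/24 -> exp(t)
```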

Journal: :Neurocomputing 2008
C. Zhang W. Wu X. H. Chen Y. Xiong

Product unit neural networks with exponential weights (PUNNs) can provide a more powerful internal representation capability than traditional feed-forward neural networks. In this paper, a convergence result of the back-propagation (BP) algorithm for training PUNNs is presented. The monotonicity of the error function during the training iteration process is also guaranteed. A numerical example is giv...
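
To make the setting concrete, here is a minimal sketch of a single product unit and a BP gradient step on it: the unit computes y = prod_i x_i ** w_i = exp(sum_i w_i * ln x_i). It assumes strictly positive inputs so the logarithms are defined; the target, step size, and names are illustrative.

```python
# One product unit with exponential weights and a BP step on squared error.
import numpy as np

def product_unit(w, x):
    """Forward pass: y = prod_i x_i ** w_i = exp(w . log x), for x > 0."""
    return np.exp(w @ np.log(x))

def bp_step(w, x, d, eta):
    """For E = 0.5 * (y - d)^2, dE/dw_i = (y - d) * y * ln(x_i)."""
    y = product_unit(w, x)
    return w - eta * (y - d) * y * np.log(x)

w, x = np.array([0.5, -0.2]), np.array([2.0, 3.0])
for _ in range(100):
    w = bp_step(w, x, d=1.5, eta=0.05)
print(product_unit(w, x))   # approaches the target 1.5
```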

The present research aims to design a strategic management model for technical and vocational training. The research has a qualitative design, and the thematic analysis method (thematic network) was applied for data analysis and classification. The statistical population of the study consisted of experts of the Iran Technical and Vocational Training Organization. A total of 13 participants were selec...

2007
Bruno Scherrer

We consider the discrete-time infinite-horizon optimal control problem formalized by Markov Decision Processes (Puterman, 1994; Bertsekas and Tsitsiklis, 1996). We revisit the work of Bertsekas and Ioffe (1996), that introduced λ Policy Iteration, a family of algorithms parameterized by λ that generalizes the standard algorithms Value Iteration and Policy Iteration, and has some deep connection...
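
A minimal numerical sketch of λ policy iteration on a small random MDP, using the Bertsekas-Ioffe update v <- v + (I - λγP_π)^{-1}(T_π v - v) with π greedy with respect to v; λ = 0 recovers value iteration and λ = 1 recovers policy iteration. The MDP size, seed, and choice of λ here are arbitrary.

```python
# Lambda policy iteration on a random MDP (illustrative sizes and seed).
import numpy as np

rng = np.random.default_rng(0)
n_s, n_a, gamma, lam = 4, 3, 0.9, 0.5
P = rng.dirichlet(np.ones(n_s), size=(n_a, n_s))   # P[a, s, s']
R = rng.random((n_a, n_s))                          # R[a, s]

def lambda_pi_step(v):
    q = R + gamma * np.einsum("asx,x->as", P, v)    # Q-values induced by v
    pi = q.argmax(axis=0)                           # greedy policy w.r.t. v
    P_pi = P[pi, np.arange(n_s)]                    # transitions under pi
    t_pi_v = q[pi, np.arange(n_s)]                  # T_pi v
    # v <- v + (I - lam*gamma*P_pi)^{-1} (T_pi v - v)
    return v + np.linalg.solve(np.eye(n_s) - lam * gamma * P_pi, t_pi_v - v)

v = np.zeros(n_s)
for _ in range(50):
    v = lambda_pi_step(v)
print(v)   # converges to the optimal value function for any lam in [0, 1]
```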

Journal: :Journal of Machine Learning Research 2013
Bruno Scherrer

We consider the discrete-time infinite-horizon optimal control problem formalized by Markov decision processes (Puterman, 1994; Bertsekas and Tsitsiklis, 1996). We revisit the work of Bertsekas and Ioffe (1996), that introduced λ policy iteration—a family of algorithms parametrized by a parameter λ—that generalizes the standard algorithms value and policy iteration, and has some deep connection...

Journal: :CoRR 2015
Denis Steckelmacher Peter Vrancx

This paper explores the performance of fitted neural Q iteration for reinforcement learning in several partially observable environments, using three recurrent neural network architectures: Long Short-Term Memory [7], Gated Recurrent Unit [3], and MUT1, a recurrent neural architecture evolved from a pool of several thousand candidate architectures [8]. A variant of fitted Q iteration, based on A...
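
Here is a minimal sketch of the underlying fitted Q iteration loop on a batch of transitions. The function approximator is plain least squares over one-hot (state, action) features, a stand-in for the recurrent networks the paper evaluates; the toy problem and all names are illustrative.

```python
# Fitted Q iteration: repeatedly regress Q onto bootstrap targets r + gamma*max Q.
import numpy as np

def fitted_q_iteration(transitions, n_states, n_actions, gamma=0.95, n_iters=50):
    """transitions: list of (s, a, r, s_next) with integer states and actions."""
    s, a, r, s2 = map(np.array, zip(*transitions))
    X = np.zeros((len(s), n_states * n_actions))     # one-hot (s, a) features
    X[np.arange(len(s)), s * n_actions + a] = 1.0
    w = np.zeros(n_states * n_actions)
    for _ in range(n_iters):
        q = w.reshape(n_states, n_actions)
        y = r + gamma * q[s2].max(axis=1)            # bootstrap targets
        w, *_ = np.linalg.lstsq(X, y, rcond=None)    # refit the regressor
    return w.reshape(n_states, n_actions)

# Toy chain: action 1 leads to state 1, and transitions out of state 1 pay 1.
data = [(0, 0, 0.0, 0), (0, 1, 0.0, 1), (1, 0, 1.0, 0), (1, 1, 1.0, 1)]
print(fitted_q_iteration(data, n_states=2, n_actions=2))
```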

1999
V. P. Plagianakos

Adaptive learning rate algorithms try to decrease the error at each iteration by searching for a local minimum with small weight steps, which are usually constrained by highly problem-dependent heuristic learning parameters. Based on the idea of decreasing the error function at each iteration, we suggest monotone learning strategies that guarantee convergence to a minimizer of the error function...
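
A minimal sketch of one way to enforce the monotone decrease the abstract refers to: backtrack the step size until the error strictly drops at each iteration. The halving schedule and the quadratic objective are illustrative, not the paper's strategy.

```python
# Gradient descent with a monotone (backtracking) step-size rule.
import numpy as np

def monotone_gd(f, grad_f, w, eta0=1.0, n_iters=100):
    for _ in range(n_iters):
        g = grad_f(w)
        eta = eta0
        # Halve the step until the error strictly decreases (monotone rule).
        while f(w - eta * g) >= f(w) and eta > 1e-12:
            eta *= 0.5
        w = w - eta * g
    return w

A = np.array([[3.0, 0.5], [0.5, 1.0]])               # positive definite quadratic
f = lambda w: 0.5 * w @ A @ w
grad_f = lambda w: A @ w
print(monotone_gd(f, grad_f, np.array([2.0, -1.0])))  # near the minimizer [0, 0]
```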

2009
Shai Shalev-Shwartz Ambuj Tewari

We describe and analyze two stochastic methods for ℓ1-regularized loss minimization problems, such as the Lasso. The first method updates the weight of a single feature at each iteration, while the second method updates the entire weight vector but uses only a single training example at each iteration. In both methods, the choice of feature/example is uniformly at random. Our theoretical runtime...
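
The abstract's first method updates one feature per iteration; below is a minimal stochastic coordinate descent sketch in that spirit for the Lasso objective min_w (1/2m)||Xw - y||^2 + lam*||w||_1, where the chosen coordinate gets a soft-threshold update. The step size uses the exact per-coordinate curvature of the squared loss, a simplification relative to the paper's analysis; data and names are illustrative.

```python
# Stochastic coordinate descent for the Lasso: pick one feature uniformly at
# random, then soft-threshold that coordinate.
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def scd_lasso(X, y, lam, n_iters=5000, seed=0):
    m, d = X.shape
    rng = np.random.default_rng(seed)
    beta = (X ** 2).mean(axis=0)             # per-coordinate curvature ||X_j||^2 / m
    w, resid = np.zeros(d), -y.copy()        # resid = Xw - y, kept incrementally
    for _ in range(n_iters):
        j = rng.integers(d)                  # feature chosen uniformly at random
        g = X[:, j] @ resid / m              # partial gradient of the smooth part
        w_new = soft_threshold(w[j] - g / beta[j], lam / beta[j])
        resid += (w_new - w[j]) * X[:, j]    # O(m) residual update
        w[j] = w_new
    return w

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))
w_true = np.array([2.0, -1.0] + [0.0] * 8)     # sparse ground truth
y = X @ w_true + 0.1 * rng.standard_normal(200)
print(np.round(scd_lasso(X, y, lam=0.05), 2))  # close to w_true, zeros preserved
```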

2008
Bibhas Chakraborty Victor Strecher Susan Murphy

We consider finite-horizon fitted Q-iteration with linear function approximation to learn a policy from a training set of trajectories. We show that fitted Q-iteration can give biased estimates and invalid confidence intervals for the parameters that feature in the policy. We propose a regularized estimator called the soft-threshold estimator, derive it as an approximate empirical Bayes estimator, ...
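
A minimal sketch of the basic shrinkage operation behind a soft-threshold estimator, applied to the least-squares fit of one stage's Q-parameters: coefficients with weak support are pulled exactly to zero. The fixed threshold here is a placeholder assumption; the paper derives its threshold as an approximate empirical Bayes estimator, which is not reproduced.

```python
# Soft-thresholding the least-squares estimate of one stage's Q-parameters.
import numpy as np

def soft_threshold(b, t):
    return np.sign(b) * np.maximum(np.abs(b) - t, 0.0)

def thresholded_stage_fit(Phi, targets, t=0.1):
    """Phi: n x d feature matrix for the (state, action) pairs at this stage;
    targets: bootstrap values r + max_a' Q_{t+1}(s', a') from the later stage."""
    b, *_ = np.linalg.lstsq(Phi, targets, rcond=None)   # ordinary least squares
    return soft_threshold(b, t)                          # shrunken, sparser estimate

rng = np.random.default_rng(0)
Phi = rng.standard_normal((100, 5))
b_true = np.array([1.0, 0.0, 0.0, -0.5, 0.0])
targets = Phi @ b_true + 0.1 * rng.standard_normal(100)
print(np.round(thresholded_stage_fit(Phi, targets), 2))  # small coefficients zeroed
```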

[Chart: number of search results per publication year]