policy iterations

Search results for: policy iterations

Number of results: 276392

Finite-sample analysis of least-squares policy iteration

Journal: Journal of Machine Learning Research, 2012

Alessandro Lazaric Mohammad Ghavamzadeh Rémi Munos

In this paper, we report a performance bound for the widely used least-squares policy iteration (LSPI) algorithm. We first consider the problem of policy evaluation in reinforcement learning, that is, learning the value function of a fixed policy, using the least-squares temporal-difference (LSTD) learning method, and report finite-sample analysis for this algorithm. To do so, we first derive a...

Loop-Free Backpressure Routing Using Link-Reversal Algorithms

2014

The backpressure routing policy is known to be a throughput optimal policy that supports any feasible traffic demand in data networks, but may have poor delay performance when packets traverse loops in the network. In this paper, we study loop-free backpressure routing policies that forward packets along directed acyclic graphs (DAGs) to avoid the looping problem. These policies use link revers...

Dynamic Locomotion Skills for Obstacle Sequences Using Reinforcement Learning

2015

X. B. Peng G. Berseth M. van de Panne

Most locomotion control strategies are developed for flat terrain. We explore the use of reinforcement learning to develop motor skills for the highly dynamic traversal of terrains having sequences of gaps, walls, and steps. Results are demonstrated using simulations of a 21-link planar dog and a 7-link planar biped. Our approach is characterized by: non-parametric representation of the value f...

Convergence of the Iterative Methods

1995

Jeng Yen Linda R. Petzold

In a previous paper, we introduced a coordinate-splitting (CS) form of the equations of motion for multibody systems which, together with a modified nonlinear iteration (CM), is particularly effective in the solution of certain nonlinear highly oscillatory systems. In this paper, we examine the convergence of the CS and CM iterations and explain the improved convergence of the CM iteration. An exa...

Asynchronous iterations with flexible communication: contracting operators

Zolotarev Iterations for the Matrix Square Root

Solving Kepler’s equation with CORDIC double iterations

Cubically Convergent Iterations for Invariant Subspace Computation

Complex-Parameter Integral Iterations of Caratheodory Maps

Random iterations of homeomorphisms on the circle