action value function

Investigation of Pedological Criterion on Rangeland Desertification (Case Study: South of Rude-Shoor Watershed)

Journal: :Desert 2013

M. Karimpour Reihan, S. Feiznia, A. Salehpour Jam, M.K. Kianian

Investigating a desertification trend requires understanding the phenomena that create changes, either singly or through mutual action and reaction, such that these changes end in land degradation. To investigate the pedological criterion of land degradation in Quaternary rock units, a part of the Rude-Shoor watershed area was first selected. After distinguishing the target area, maps of slope class...

Decision Boundary Partitioning: Variable Resolution Model-Free Reinforcement Learning

1999

Stuart I. Reynolds

Reinforcement learning agents attempt to learn and construct a decision policy which maximises some reward signal. In turn, this policy is directly derived from long-term value estimates of state-action pairs. In environments with real-valued state-spaces, however, it is impossible to enumerate the value of every state-action pair, necessitating the use of a function approximator in order to in...
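
The statement that a policy is derived from value estimates of state-action pairs can be illustrated with a minimal sketch. It assumes a small, enumerable (tabular) setting, which is exactly the case the abstract says breaks down for real-valued state spaces; the function name `greedy_policy` and the toy values are hypothetical and not taken from the paper.

```python
from typing import Dict, Hashable, Tuple

State = Hashable
Action = Hashable


def greedy_policy(q: Dict[Tuple[State, Action], float]) -> Dict[State, Action]:
    """For each state, pick the action with the highest estimated value."""
    best: Dict[State, Tuple[Action, float]] = {}
    for (state, action), value in q.items():
        # Keep the best (action, value) pair seen so far for this state.
        if state not in best or value > best[state][1]:
            best[state] = (action, value)
    return {state: action for state, (action, _) in best.items()}


# Hypothetical action-value estimates for two states and two actions.
q_values = {
    ("s0", "left"): 0.2,
    ("s0", "right"): 0.7,
    ("s1", "left"): 0.5,
    ("s1", "right"): 0.1,
}
policy = greedy_policy(q_values)
```

With a continuous state space the dictionary above cannot be enumerated, which is why a function approximator is needed in place of the table.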

Optimal Controller Design Algorithm For Non-Affine in Input Discrete-Time Nonlinear System

2012

A. Al-Tamimi

Convergence of the value-iteration-based algorithm for finding the optimal controller is proven for general nonlinear systems that are non-affine in the input. That is, it is shown that the algorithm converges to the optimal control and the optimal value function. It is assumed that at each iteration the value and action update equations can be exactly solved. Then two standard neural networks (NN) are u...

Task-Driven Discretization of the Joint Space of Visual Percepts and Continuous Actions

2006

Sébastien Jodogne Justus H. Piater

We target the problem of closed-loop learning of control policies that map visual percepts to continuous actions. Our algorithm, called Reinforcement Learning of Joint Classes (RLJC), adaptively discretizes the joint space of visual percepts and continuous actions. In a sequence of attempts to remove perceptual aliasing, it incrementally builds a decision tree that applies tests either in the i...

Model-based Direct Policy Search (Extended Abstract)

2010

Jan Hendrik Metzen Frank Kirchner

Scaling Reinforcement Learning (RL) to real-world problems with continuous state and action spaces remains a challenge. This is partly because the optimal value function can become quite complex in continuous domains. In this paper, we propose to avoid learning the optimal value function altogether and instead to use direct policy search methods in combination with model-based RL.

Collaborative Multiagent Reinforcement Learning by Payoff Propagation

Journal: :Journal of Machine Learning Research 2006

Jelle R. Kok Nikos A. Vlassis

In this article we describe a set of scalable techniques for learning the behavior of a group of agents in a collaborative multiagent setting. As a basis we use the framework of coordination graphs of Guestrin, Koller, and Parr (2002a) which exploits the dependencies between agents to decompose the global payoff function into a sum of local terms. First, we deal with the single-state case and d...
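
The decomposition of a global payoff into a sum of local terms can be sketched for the single-state case as follows. This is only an illustration under stated assumptions: the three-agent chain, the payoff tables, and the names `local_payoffs` and `global_payoff` are hypothetical, and the brute-force search over joint actions is a stand-in that does not scale, not the technique the article proposes.

```python
from itertools import product

# Hypothetical local payoff tables for a 3-agent chain with edges (0, 1) and (1, 2).
# Each table maps the pair of actions chosen by the two connected agents to a payoff.
local_payoffs = {
    (0, 1): {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 2.0},
    (1, 2): {(0, 0): 3.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0},
}


def global_payoff(joint_action):
    """Global payoff as the sum of the local terms over all edges."""
    return sum(
        table[(joint_action[i], joint_action[j])]
        for (i, j), table in local_payoffs.items()
    )


# Exhaustive search over all joint actions (2^3 here); feasible only for tiny problems.
best_joint = max(product([0, 1], repeat=3), key=global_payoff)
```

The number of joint actions grows exponentially with the number of agents, which is the motivation for exploiting the graph structure rather than enumerating as done here.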

Action Selection and Action Value in Frontal-Striatal Circuits

Journal: :Neuron 2012

Moonsang Seo Eunjeong Lee Bruno B. Averbeck

The role that frontal-striatal circuits play in normal behavior remains unclear. Two of the leading hypotheses suggest that these circuits are important for action selection or reinforcement learning. To examine these hypotheses, we carried out an experiment in which monkeys had to select actions in two different task conditions. In the first (random) condition, actions were selected on the bas...

Application of the Sinc Approximation to the Solution of Bratu's Problem

Journal: :International Journal of Mathematical Modelling and Computations 0

J. Rashidinia (Department of Mathematics, Islamic Azad University, Central Tehran Branch, Iran), N. Taher (Iran)

In this work, we study the performance of the sinc-collocation method for solving Bratu's problem. For different choices of step size, we consider the maximum absolute errors in the solutions at the sinc grid points and report them in tables. The comparison of the obtained results verified that this method converges to the exact solution rapidly and with...

Existence of Triple Positive Solutions for Boundary Value Problem of Nonlinear Fractional Differential Equations

Journal: :Computational Methods for Differential Equations 0

Kamal Shah (University of Malakand), Salman Zeb (Department of Mathematics, University of Malakand), Rahmat Ali Khan (Dean of Science, University of Malakand)

This article is devoted to the study of existence and multiplicity of positive solutions to a class of nonlinear fractional order multi-point boundary value problems of the type $-D^{q}_{0+} u(t) = f(t, u(t))$, $1 < q \le 2$, $0 < t < 1$, $u(0) = 0$, $u(1) = \sum_{i=1}^{m-2} \delta_i u(\eta_i)$, where $D^{q}_{0+}$ represents the standard Riemann-Liouville fractional derivative, $\delta_i, \eta_i \in (0, 1)$ with $\sum_{i=1}^{m-2} \delta_i \eta_i^{q-1} < 1$, and $f : [0, 1] \times [0, \infty) \to$ [0, ∞...

Human Rights Education in Schools

Thesis: Ministry of Science, Research and Technology - Payame Noor University - Payame Noor University of Tehran Province - Faculty of Law, 1389 (Solar Hijri)

Mehdi Nekouei, Hossein Sharifi Tarazkouhi, Kamran Hashemi

Abstract: The third millennium has started, but the world is facing serious challenges in achieving international security and peace. Various human rights violations have led states to find means to protect human rights. In addition, Article 55 of the United Nations Charter presents respect for human rights and fundamental freedoms as the most suitable way to realize peace and security. ...