Search results for: action value function

Number of results: 2342819

Journal: Desert 2012
A. Salehpour Jam M. Karimpour Reihan M.K. Kianian S. Feiznia

Investigating desertification trends requires understanding the phenomena that create changes, singly or through joint action and reaction, in such a way that these changes end in land degradation. To investigate the pedological criterion of land degradation in Quaternary rock units, a part of the Rude-Shoor watershed area was first selected. After distinguishing the target area, maps of slope class...

Journal: SIAM J. Control and Optimization 2015
William M. McEneaney Peter M. Dower

Two-point boundary value problems for conservative systems are studied in the context of the least action principle. One obtains a fundamental solution, whereby two-point boundary value problems are converted to initial value problems via an idempotent convolution of the fundamental solution with a cost function related to the terminal data. The classical mass-spring problem is included as a si...

2001
Szilveszter Kovács

Reinforcement learning methods, which cope with the control difficulties of an unknown environment, have recently been gaining popularity in the autonomous robotics community. One possible difficulty of reinforcement learning applications in complex situations is the huge size of the state-value or action-value function representation [2]. The case of continuous environment (conti...

2003
Hamid Beigy Mohammad Reza Meybodi

In this paper, we study an adaptive random search method based on continuous action-set learning automaton for solving stochastic optimization problems in which only the noise-corrupted value of function at any chosen point in the parameter space is available. We first introduce a new continuous action-set learning automaton (CALA) and study its convergence properties. Then we give an algorithm ...

Journal: J. Economic Theory 2015
Jan Eeckhout Xi Weng

In many economic environments, agents often continue to learn about the same underlying state variable, even if they switch action. For example, a worker’s ability revealed in one job is informative about her productivity in another job. We analyze a general setup of experimentation with common values, and show that the value of experimentation must be equal whenever the agent switches action. ...

2013
Fiery Cushman

Dual-system approaches to psychology explain fundamental properties of human judgment, decision-making and behavior across diverse domains. Yet, the appropriate characterization of each system is a source of debate. For instance, a large body of research on moral psychology makes use of the contrast between “emotional” and “rational/cognitive” processes—a distinction that is widely used outsi...

2011
Farnaz Abtahi Ian R. Fasel

We describe a continuous state/action reinforcement learning method which uses deep belief networks (DBNs) in conjunction with a value function-based reinforcement learning algorithm to learn effective control policies. Our approach is to first learn a model of the state-action space from data in an unsupervised pretraining phase, and then use neural-fitted Q-iteration (NFQ) to learn an accurat...

Journal: Korean Journal of Mathematics 2015

Journal: Brain : a journal of neurology 2015
Khoi Vo Robb B Rutledge Anjan Chatterjee Joseph W Kable

Sir, In their letter concerning our recent report, Drs Zeighami and Moustafa discuss several previous studies investigating the functions of the ventral and dorsal striatum and the dissociation between action-value and stimulus-value learning. They note that in light of much of this previous work, our findings regarding Patient XG—who suffered bilateral lesions to dorsal striatum and is impaire...

2009
Scott Proper

1. THREE CURSES OF DIMENSIONALITY. Markov Decision Processes (MDPs) have proved to be useful and general models of optimal decision-making in uncertain domains. However, approaches to solving MDPs using reinforcement learning that depend on storing the optimal value function and action models as tables do not scale to large state-spaces. Three computational obstacles prevent the use of standard...
