Search results for: discrete action reinforcement learning automata darla

Number of results: 1357117

2010
Bikramjit Banerjee Landon Kraemer

Journal: Journal of Artificial Intelligence in Electrical Engineering 2014
Mohammad Esmaeil Akbari Noradin Ghadimi

In this paper, an adaptive PID controller for wind energy conversion systems (WECS) has been developed. The adaptation technique applied to this controller is based on reinforcement learning (RL) theory. Nonlinear characteristics of wind variations as plant input, wind turbine structure and generator operational behavior demand a high-quality adaptive controller to ensure both robust stability an...

2007
H. Montazeri

In this paper, a new model addressing the Associative Reinforcement Learning (ARL) problem, based on learning automata and a self-organizing map (SOM), is proposed. The model consists of two layers. The first layer comprises a SOM, which is used to quantize the state (context) space, and the second layer consists of a team of learning automata, which is used to select an optimal action in each stat...

2007
Yann-Michaël De Hauwere Katja Verbeeck Maarten Peeters Ann Nowé

In this paper we tested the practical use of an existing learning algorithm, called Exploring Selfish Reinforcement Learning (ESRL), on conflicting interest games. This algorithm is based on the principles of learning automata and, for games where agents have conflicting interests, on a Homo egualis society. Furthermore we propose some variations on the exploration heuristic of the algorithm an...

2007
R. Tobi Bram Bakker

This paper investigates the potential of flat and hierarchical reinforcement learning (HRL) for solving problems within strategy games. An HRL method, Max-Q, is applied to a unit transportation task modelled within a simplified, discrete real-time strategy game engine, and its performance is compared to that of flat Q-learning. It is shown that reinforcement learning approaches, and especially hier...

Journal: Psychological Bulletin 2014
Matthew M Walsh John R Anderson

To behave adaptively, we must learn from the consequences of our actions. Doing so is difficult when the consequences of an action follow a delay. This introduces the problem of temporal credit assignment. When feedback follows a sequence of decisions, how should the individual assign credit to the intermediate actions that comprise the sequence? Research in reinforcement learning provides 2 ge...

Journal: The Modares Journal of Electrical Engineering 2006
Mohammadreza Meybodi Farhad Mehdipour

In this paper, an application of cellular learning automata (CLA) to VLSI placement is presented. The CLA, which is introduced for the first time in this paper, differs from standard cellular learning automata in two respects: it has input, and the cell neighborhood varies during the operation of the CLA. The proposed CLA-based algorithm for VLSI placement is tested on a number of placement proble...

Supervisory control and fault diagnosis of hybrid systems require complete information about the discrete state transitions of the underlying system. From this point of view, the hybrid system should be abstracted to a Discrete Trace Transition System (DTTS) and represented by a discrete mode transition graph. In this paper an effective method is proposed for generating discrete mode trans...

1996
Rémi Munos

This paper presents a direct reinforcement learning algorithm, called Finite-Element Reinforcement Learning, in the continuous case, i.e. continuous state-space and time. The evaluation of the value function enables the generation of an optimal policy for reinforcement control problems, such as target or obstacle problems, viability problems or optimization problems. We propose a continuous for...

Multi-agent Markov decision processes (MMDPs), as the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning-automata-based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...
