Search results for: discrete action reinforcement learning automata darla

Number of results: 1357117

2017
Georgios C. Chasparis

This paper considers a class of discrete-time reinforcement-learning dynamics and provides a stochastic-stability analysis in repeatedly played positive-utility (strategic-form) games. For this class of dynamics, convergence to pure Nash equilibria has been demonstrated only for the fine class of potential games. Prior work primarily provides convergence properties through stochastic approximatio...

2007
Peter Vrancx Katja Verbeeck Ann Nowé

Learning Automata (LA) were recently shown to be valuable tools for designing Multi-Agent Reinforcement Learning algorithms. One of the principal contributions of LA theory is that a set of decentralized, independent learning automata is able to control a finite Markov Chain with unknown transition probabilities and rewards. We extend this result to the framework of Multi-Agent MDP’s, a straigh...

2012
António Gusmão Tapani Raiko

We consider the problem of effective and automated decision-making in modern real-time strategy (RTS) games through the use of reinforcement learning techniques. RTS games constitute environments with large, high-dimensional and continuous state and action spaces with temporally-extended actions. To operate in such environments we propose Exlos, a stable, model-based Monte Carlo method. Contra...

2016
Kohei Arai

Pursuit Reinforcement-guided Competitive Learning (PRCL) is proposed: a relatively fast online clustering method, based on reinforcement-guided competitive learning, that groups the data of concern into several clusters even when the number and the distribution of the data vary. One application of the proposed method is the retrieval of image portions from relatively large-scale images ...

2012
Mohamad Faizal Bin Samsudin Yoshito Sawatsubashi Katsunari Shibata

For developing a robot that learns long and complicated action sequences in the real world, autonomous learning of multi-step discrete state transitions is significant. It is generally thought to be difficult to achieve both holding and transition of states through learning in a recurrent neural network. In this paper, only through reinforcement learning using rewards and punishments in ...

Journal: :Journal of Computer and Robotics
Samaneh Assar (Faculty of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran); Behrooz Masoumi (Faculty of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran)

Multi-agent Markov decision processes (MMDPs), as the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning-automata-based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...

Journal: :Neural Computing and Applications 2021

Neural networks are effective function approximators, but are hard to train in the reinforcement learning (RL) context, mainly because samples are correlated. In complex problems, a neural RL approach is often able to learn a better solution than tabular RL, but generally takes longer. This paper proposes two methods, Discrete-to-Deep Supervised Policy Learning (D2D-SPL) and Discrete-to-Deep Supervised Q-value Learning (D2D-SQL), whose objective ac...

2007
Carlos H. C. Ribeiro

In the last few years, reinforcement learning algorithms have been proposed as a more natural way of modelling animal learning. Unlike supervised learning methods, reinforcement learning addresses the basic problem faced by an animal when trying to control a discrete stochastic dynamic system: discover by trial and error a policy of actions that maximises some criterion of optimality, usually e...

Journal: :CoRR 2017
Hangyu Mao Zhibo Gong Yan Ni Xiangyu Liu Quanbin Wang Weichen Ke Chao Ma Yiping Song Zhen Xiao

Communication is a critical factor for the big multi-agent world to stay organized and productive. Typically, most multi-agent “learning-to-communicate” studies try to predefine the communication protocols or use technologies such as tabular reinforcement learning and evolutionary algorithms, which cannot generalize to changing environments or large collections of agents. In this paper, we propos...

Journal: :International Journal of Advanced Research in Artificial Intelligence 2016
