Search results for: discrete action reinforcement learning automata darla
Number of results: 1357117
A stochastic automaton can perform a finite number of actions in a random environment. When a specific action is performed, the environment responds by producing an environment output that is stochastically related to the action. This response may be favorable or unfavorable. The aim is to design an automaton that can determine the best action guided by past actions and responses. The reinforce...
We propose a principle on how a computational agent can learn the structure of a classic discrete state space. The idea is to do a kind of principal component analysis on a matrix describing transitions from one state to another. This transforms the space of discrete, completely separate, states into a dimensional representation in a Euclidean space. The representation supports action selection...
A learning automaton is a learning entity that learns the optimal action from its set of possible actions. It does this by performing actions on an environment and analyzing the resulting response. The response, which may be either good or bad, changes the automaton's behaviour (the automaton learns from this response). This behaviour change is often called reinforcement al...
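The action-response loop described in this snippet can be illustrated with a minimal sketch. It assumes a linear reward-inaction update, which is one common learning-automaton scheme and not necessarily the one used in the paper. The function name `run_lri`, the `reward_probs` environment model, and the `rate` parameter are hypothetical, chosen only for illustration.

```python
import random


def run_lri(reward_probs, steps=5000, rate=0.05, seed=0):
    """Sketch of a linear reward-inaction learning automaton (assumed scheme).

    reward_probs[i] is the hypothetical probability that the environment
    gives a favorable response to action i.
    """
    rng = random.Random(seed)
    n = len(reward_probs)
    probs = [1.0 / n] * n  # start with no preference among actions

    for _ in range(steps):
        # Pick an action according to the current action probabilities.
        action = rng.choices(range(n), weights=probs)[0]
        # The environment responds stochastically to the chosen action.
        favorable = rng.random() < reward_probs[action]
        if favorable:
            # Reward: shift probability mass toward the chosen action.
            probs = [(1.0 - rate) * p for p in probs]
            probs[action] += rate
        # Unfavorable response: "inaction", probabilities stay unchanged.
    return probs


if __name__ == "__main__":
    print(run_lri([0.2, 0.8]))
```

The update keeps the probabilities summing to one: every entry is scaled by `1 - rate`, and the chosen action then receives the freed mass `rate`. A smaller `rate` makes learning slower but less likely to lock onto a suboptimal action.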
Population coding is widely regarded as a key mechanism for achieving reliable behavioral decisions. We previously introduced reinforcement learning for population-based decision making by spiking neurons. Here we generalize population reinforcement learning to spike-based plasticity rules that take account of the postsynaptic neural code. We consider spike/no-spike, spike count and spike laten...
In this paper we propose the surrogate agent-environment interface (SAEI) in reinforcement learning. We also state that learning based on the probability surrogate agent-environment interface gives the optimal policy of the task agent-environment interface. We introduce the surrogate probability action and develop the probability surrogate action deterministic policy gradient (PSADPG) algorithm based on SAEI. Thi...
In this paper, a new evolutionary computing model, called CLA-EC, is proposed. This model is a combination of a model called cellular learning automata (CLA) and the evolutionary model. In this model, every genome in the population is assigned to one cell of CLA and each cell in CLA is equipped with a set of learning automata. Actions selected by learning automata of a cell determine the genome...
Recent work has shown that deep neural networks are capable of approximating both value functions and policies in reinforcement learning domains featuring continuous state and action spaces. However, to the best of our knowledge no previous work has succeeded at using deep neural networks in structured (parameterized) continuous action spaces. To fill this gap, this paper focuses on learning wi...