Correction: Temporal-Difference Reinforcement Learning with Distributed Representations
                    
                        
                            نویسندگان
                            
                            
                        
                        
                    
                    
                    چکیده
منابع مشابه
Temporal-Difference Reinforcement Learning with Distributed Representations
Temporal-difference (TD) algorithms have been proposed as models of reinforcement learning (RL). We examine two issues of distributed representation in these TD algorithms: distributed representations of belief and distributed discounting factors. Distributed representation of belief allows the believed state of the world to distribute across sets of equivalent states. Distributed exponential d...
متن کاملDistributed relational temporal difference learning
Relational representations have great potential for rapidly generalizing learned knowledge in large Markov decision processes such as multi-agent problems. In this work, we introduce relational temporal difference learning for the distributed case where the communication links among agents are dynamic. Thus no critical components of the system should reside in any one agent. Relational generali...
متن کاملMultigrid Algorithms for Temporal Difference Reinforcement Learning
We introduce a class of Multigrid based temporal difference algorithms for reinforcement learning with linear function approximation. Multigrid methods are commonly used to accelerate convergence of iterative numerical computation algorithms. The proposed Multigrid-enhanced TD(λ) algorithms allows to accelerate the convergence of the basic TD(λ) algorithm while keeping essentially the same per-...
متن کاملBayesian Reinforcement Learning with Gaussian Process Temporal Difference Methods
Reinforcement Learning is a class of problems frequently encountered by both biological and artificial agents. An important algorithmic component of many Reinforcement Learning solution methods is the estimation of state or state-action values of a fixed policy controlling a Markov decision process (MDP), a task known as policy evaluation. We present a novel Bayesian approach to policy evaluati...
متن کاملBasis Function Adaptation in Temporal Difference Reinforcement Learning
We examine methods for on-line optimization of the basis function for temporal difference Reinforcement Learning algorithms. We concentrate on architectures with a linear parameterization of the value function. Our methods optimize the weights of the network while simultaneously adapting the parameters of the basis functions in order to decrease the Bellman approximation error. A gradient-based...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: PLoS ONE
سال: 2009
ISSN: 1932-6203
DOI: 10.1371/annotation/4a24a185-3eff-454f-9061-af0bf22c83eb