Search results for: Temporal Difference Learning

Number of results: 1,222,164

2006
David Stracuzzi, Nima Asgharbeygi

The ability to transfer knowledge from one domain to another is an important aspect of learning. Knowledge transfer increases learning efficiency by freeing the learner from duplicating past efforts. In this paper, we demonstrate how reinforcement learning agents can use relational representations to transfer knowledge across related domains.

1997
David J. Foster, Richard G. M. Morris, Peter Dayan

We provide a model of the standard watermaze task, and of a more challenging task involving novel platform locations, in which rats exhibit one-trial learning after a few days of training. The model uses hippocampal place cells to support reinforcement learning, and also, in an integrated manner, to build and use allocentric coordinates.
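
A toy sketch of the feature layer such a model implies: Gaussian place-cell tuning curves whose activations could feed a TD-trained critic. The cell count, field centers, and width below are illustrative choices, not the paper's parameters.

    import numpy as np

    # Toy sketch: Gaussian place-cell activations as features for a TD critic.
    # Centers and width are illustrative, not the paper's values.
    rng = np.random.default_rng(0)
    centers = rng.uniform(0.0, 1.0, size=(50, 2))  # place-field centers in a unit arena
    sigma = 0.1                                    # place-field width

    def place_cells(pos):
        """Firing rates of all place cells at 2-D position pos."""
        d2 = np.sum((centers - np.asarray(pos)) ** 2, axis=1)
        return np.exp(-d2 / (2.0 * sigma ** 2))

    # A critic would then estimate V(pos) = w @ place_cells(pos),
    # with w updated by the usual TD rule.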

2015
Mayank Daswani, Jan Leike

What is happiness for reinforcement learning agents? We seek a formal definition satisfying a list of desiderata. Our proposed definition of happiness is the temporal difference error, i.e. the difference between the value of the obtained reward and observation and the agent’s expectation of this value. This definition satisfies most of our desiderata and is compatible with empirical research o...
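
In code, the proposed definition is just the standard temporal difference error. A minimal sketch, assuming a scalar reward, a discount factor, and a state-value estimate (the names and numbers are illustrative, not from the paper):

    # Minimal sketch: "happiness" as the temporal difference error,
    # i.e. what the agent obtained minus what it expected.
    def happiness(reward, value_next, value_current, gamma=0.99):
        return reward + gamma * value_next - value_current

    # An agent that expected 1.0 but obtained 0.5 plus a next-state value of 0.6:
    print(happiness(0.5, 0.6, 1.0))  # 0.094 -> slightly happier than expected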

2013
Clement Gehring, Doina Precup

Exploration is still one of the crucial problems in reinforcement learning, especially for agents acting in safety-critical situations. We propose a new directed exploration method, based on a notion of state controllability. Intuitively, if an agent wants to stay safe, it should seek out states where the effects of its actions are easier to predict; we call such states more controllable. Our ma...
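
The abstract is cut off, but the idea admits a small sketch. Assuming controllability is tracked as a running estimate of absolute prediction error per state (this scoring rule is our assumption for illustration, not necessarily the authors' measure; lower error means more controllable), an exploration bonus could look like:

    import numpy as np

    # Hedged sketch: track how predictable each state's transitions are and
    # favor states with low running prediction error ("more controllable").
    class ControllabilityBonus:
        def __init__(self, n_states, alpha=0.1):
            self.err = np.zeros(n_states)  # running |prediction error| per state
            self.alpha = alpha

        def update(self, state, prediction_error):
            self.err[state] += self.alpha * (abs(prediction_error) - self.err[state])

        def bonus(self, state):
            return -self.err[state]  # add to value estimates when choosing actions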

Journal: IEEE Transactions on Automatic Control, 2021

Value functions derived from Markov decision processes arise as a central component of algorithms, as well as performance metrics, in many statistics and engineering applications of machine learning. Computation of the solution to the associated Bellman equations is challenging in most practical cases of interest. A popular class of approximation techniques, known as temporal difference (TD) learning algorithms, are an impo...
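
For context, the textbook member of this class is linear TD(0), which nudges a weight vector along the TD error of the Bellman equation. This is the standard formulation, not this article's specific algorithm:

    import numpy as np

    # Standard linear TD(0): approximate V(s) = w @ phi(s) and move w
    # along the TD error. phi_s / phi_next are feature vectors for the
    # current and next state.
    def td0_step(w, phi_s, phi_next, reward, gamma=0.99, alpha=0.01):
        td_error = reward + gamma * w @ phi_next - w @ phi_s
        return w + alpha * td_error * phi_s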

2002
Ari Shapiro, Gil Fuchs, Robert Levinson

This paper demonstrates the use of pattern-weights in order to develop a strategy for an automated player of a non-cooperative version of the game of Diplomacy. Diplomacy is a multi-player, zerosum and simultaneous move game with imperfect information. Patternweights represent stored knowledge of various aspects of a game that are learned through experience. An automated computer player is deve...
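
The truncated abstract suggests a weighted pattern evaluation trained from experience. A hedged sketch of that idea, where a position's score is the sum of weights of its matched patterns and the weights are adjusted by a TD-style update (the representation and update rule here are our assumptions, not the paper's implementation):

    # Hedged sketch: evaluate a position as the sum of weights of matched
    # patterns, and adjust those weights from experienced outcomes.
    def evaluate(position_patterns, weights):
        return sum(weights.get(p, 0.0) for p in position_patterns)

    def td_update(weights, patterns_now, patterns_next, reward,
                  gamma=0.95, alpha=0.05):
        error = (reward + gamma * evaluate(patterns_next, weights)
                 - evaluate(patterns_now, weights))
        for p in patterns_now:
            weights[p] = weights.get(p, 0.0) + alpha * error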

2017
Christopher Lockhart

Application of Temporal Difference Learning to the Game of Snake

2002
Eswar Sivaraman, Martin T. Hagan

Submitted in partial fulfillment of the course requirements for "Neural Networks" (ECEN 5733), May 2000.

Chart: number of search results per year
