Eligibility Traces

Expected Eligibility Traces

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence 2021

The question of how to determine which states and actions are responsible for a certain outcome is known as the credit assignment problem and remains a central research question in reinforcement learning and artificial intelligence. Eligibility traces enable efficient credit assignment to the recent sequence of states and actions experienced by the agent, but not to counterfactual sequences that could also have led to the current state. In this work, we introduce expected ...


Using Sliding Mode Controller and Eligibility Traces for Controlling the Blood Glucose in Diabetic Patients at the Presence of Fault

Journal: International Journal of Engineering 2020

A. Noori, M. A. Sadrnia, M. B. Naghibi Sistani

Some people with diabetes use insulin injection pumps to control their blood glucose level. Sometimes a fault occurs in the sensor or actuator of these pumps. The main objective of this paper is to hold the blood glucose at the desired level and to provide fault-tolerant control of these injection pumps. To this end, the eligibility traces algorithm is combined with the sliding mod...


Reinforcement learning with replacing eligibility traces

Journal: Machine Learning 1996


Bidding Strategy on Demand Side Using Eligibility Traces Algorithm

Journal: International Journal of Smart Electrical Engineering 2017

Amin Noori, Mahdi Besharatifar, Seyed Mohammad Ali Naseri Gavareshk, Somayeh Hasanpour Darban

Restructuring in the power industry is followed by the splitting of different parts and the creation of competition between the purchasing and selling sections. As a consequence, through active participation in the energy market, service provider companies and large consumers create a context for overcoming the problems resulting from a lack of demand-side participation in the market. The most prominent ch...


Replacing eligibility trace for action-value learning with function approximation

2007

Kary Främling

The eligibility trace is one of the most used mechanisms to speed up reinforcement learning. Earlier reported experiments seem to indicate that replacing eligibility traces would perform better than accumulating eligibility traces. However, replacing traces are currently not applicable when using function approximation methods where states are not represented uniquely by binary values. This pap...
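The difference between accumulating and replacing traces mentioned in this abstract can be illustrated for the simple tabular case. The following is a minimal sketch, not the method proposed in the paper: the function name `update_trace` and its parameters (`gamma` for the discount factor, `lam` for the trace-decay rate) are assumptions chosen for illustration.

```python
def update_trace(trace, state, gamma, lam, kind="accumulating"):
    """Return a new tabular eligibility trace after visiting `state`.

    Hypothetical helper for illustration only.
    trace: list of floats, one entry per state.
    """
    # Every entry decays by gamma * lam on each time step.
    decayed = [gamma * lam * e for e in trace]
    if kind == "accumulating":
        # Accumulating trace: repeated visits add up.
        decayed[state] += 1.0
    elif kind == "replacing":
        # Replacing trace: the visited state's entry is reset to 1.
        decayed[state] = 1.0
    else:
        raise ValueError(f"unknown trace kind: {kind}")
    return decayed


# Visit state 0 twice in a row and compare both variants.
acc = rep = [0.0, 0.0]
for _ in range(2):
    acc = update_trace(acc, 0, gamma=1.0, lam=0.5, kind="accumulating")
    rep = update_trace(rep, 0, gamma=1.0, lam=0.5, kind="replacing")
```

With these settings the accumulating trace for state 0 grows above 1 on the second visit, while the replacing trace stays capped at 1. The abstract's point is that this reset is only straightforward when each state has its own binary indicator, as in the table used here.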


Investigating Recurrence and Eligibility Traces in Deep Q-Networks

Journal: CoRR 2017

Jean Harb, Doina Precup

Eligibility traces in reinforcement learning are used as a bias-variance trade-off and can often speed up training time by propagating knowledge back over time-steps in a single update. We investigate the use of eligibility traces in combination with recurrent networks in the Atari domain. We illustrate the benefits of both recurrent nets and eligibility traces in some Atari games, and highligh...


Macro-Actions in Reinforcement Learning: An Empirical Analysis

1998

Amy McGovern, Richard S. Sutton

Several researchers have proposed reinforcement learning methods that obtain advantages in learning by using temporally extended actions, or macro-actions, but none has carefully analyzed what these advantages are. In this paper, we separate and analyze two advantages of using macro-actions in reinforcement learning: the effect on exploratory behavior, independent of learning, and the effect on t...


Distinct Eligibility Traces for LTP and LTD in Cortical Synapses

Journal: Neuron 2015

Kaiwen He, Marco Huertas, Su Z. Hong, XiaoXiu Tie, Johannes W. Hell, Harel Shouval, Alfredo Kirkwood

In reward-based learning, synaptic modifications depend on a brief stimulus and a temporally delayed reward, which poses the question of how synaptic activity patterns associate with a delayed reward. A theoretical solution to this so-called distal reward problem has been the notion of activity-generated "synaptic eligibility traces," silent and transient synaptic tags that can be converted int...


Eligibility Traces for Off-Policy Policy Evaluation

2000

Doina Precup, Richard S. Sutton, Satinder P. Singh

Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference methods. Here we generalize eligibility traces to off-policy learning, in which one learns about a policy different from the policy that generates the data. Off-policy methods can greatly multiply learning, as many policie...


Experimental analysis of eligibility traces strategies in temporal difference learning

Journal: IJKESDP 2009

Jinsong Leng, Lakhmi C. Jain, Colin Fyfe

Temporal difference (TD) learning is a model-free reinforcement learning technique, which adopts an infinite horizon discount model and uses an incremental learning technique for dynamic programming. The state value function is updated in terms of sample episodes. Utilising eligibility traces is a key mechanism in enhancing the rate of convergence. TD(λ) represents the use of eligibility traces...
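How TD(λ) uses eligibility traces to update state values can be sketched for a single episode in the tabular setting. This is a generic illustration with accumulating traces, not the experimental setup of the paper: the function `td_lambda_episode`, the transition tuple layout, and the toy three-state chain are all assumptions.

```python
def td_lambda_episode(values, transitions, alpha=0.1, gamma=1.0, lam=0.9):
    """Run tabular TD(lambda) with accumulating traces over one episode.

    Hypothetical helper for illustration only.
    values: list of state-value estimates, one per state.
    transitions: list of (state, reward, next_state, done) tuples.
    Returns a new list of updated value estimates.
    """
    values = list(values)
    trace = [0.0] * len(values)
    for state, reward, next_state, done in transitions:
        # One-step TD error; a terminal next state contributes no value.
        target = reward if done else reward + gamma * values[next_state]
        td_error = target - values[state]
        # Decay all traces, then mark the state just visited.
        trace = [gamma * lam * e for e in trace]
        trace[state] += 1.0
        # Every state is updated in proportion to its eligibility.
        values = [v + alpha * td_error * e for v, e in zip(values, trace)]
    return values


# Toy chain 0 -> 1 -> 2 (terminal), with reward 1 on the final step.
episode = [(0, 0.0, 1, False), (1, 1.0, 2, True)]
with_traces = td_lambda_episode([0.0, 0.0, 0.0], episode, alpha=0.5, lam=1.0)
one_step = td_lambda_episode([0.0, 0.0, 0.0], episode, alpha=0.5, lam=0.0)
```

With `lam=1.0` the final reward also updates state 0 within the same episode, whereas with `lam=0.0` only state 1 changes. This is the sense in which traces can speed up the propagation of value information.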
