نتایج جستجو برای: Eligibility Traces

تعداد نتایج: 37705  

Journal: :Proceedings of the ... AAAI Conference on Artificial Intelligence 2021

The question of how to determine which states and actions are responsible for a certain outcome is known as the credit assignment problem remains central research in reinforcement learning artificial intelligence. Eligibility traces enable efficient recent sequence experienced by agent, but not counterfactual sequences that could also have led current state. In this work, we introduce expected ...

Some people suffering from diabetes use insulin injection pumps to control the blood glucose level. Sometimes, the fault may occur in the sensor or actuator of these pumps. The main objective of this paper is controlling the blood glucose level at the desired level and fault-tolerant control of these injection pumps. To this end, the eligibility traces algorithm is combined with the sliding mod...

Restructuring in the power industry is followed by splitting different parts and creating a competition between purchasing and selling sections. As a consequence, through an active participation in the energy market, the service provider companies and large consumers create a context for overcoming the problems resulted from lack of demand side participation in the market. The most prominent ch...

2007
Kary Främling

The eligibility trace is one of the most used mechanisms to speed up reinforcement learning. Earlier reported experiments seem to indicate that replacing eligibility traces would perform better than accumulating eligibility traces. However, replacing traces are currently not applicable when using function approximation methods where states are not represented uniquely by binary values. This pap...

Journal: :CoRR 2017
Jean Harb Doina Precup

Eligibility traces in reinforcement learning are used as a bias-variance trade-off and can often speed up training time by propagating knowledge back over time-steps in a single update. We investigate the use of eligibility traces in combination with recurrent networks in the Atari domain. We illustrate the benefits of both recurrent nets and eligibility traces in some Atari games, and highligh...

1998
Amy McGovern Richard S. Sutton

Several researchers have proposed reinforcement learning methods that obtain advantages in learning by using temporally extended actions, or macro-actions, but none has carefully analyzed what these advantages are. In this paper, we separate and analyze two advantages of using macro-actions in reinforcement learning: the eeect on exploratory behavior, independent of learning, and the eeect on t...

Journal: :Neuron 2015
Kaiwen He Marco Huertas Su Z. Hong XiaoXiu Tie Johannes W. Hell Harel Shouval Alfredo Kirkwood

In reward-based learning, synaptic modifications depend on a brief stimulus and a temporally delayed reward, which poses the question of how synaptic activity patterns associate with a delayed reward. A theoretical solution to this so-called distal reward problem has been the notion of activity-generated "synaptic eligibility traces," silent and transient synaptic tags that can be converted int...

2000
Doina Precup Richard S. Sutton Satinder P. Singh

Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference methods. Here we generalize eligibility traces to off-policy learning, in which one learns about a policy different from the policy that generates the data. Off-policy methods can greatly multiply learning, as many policie...

Journal: :IJKESDP 2009
Jinsong Leng Lakhmi C. Jain Colin Fyfe

Temporal difference (TD) learning is a model-free reinforcement learning technique, which adopts an infinite horizon discount model and uses an incremental learning technique for dynamic programming. The state value function is updated in terms of sample episodes. Utilising eligibility traces is a key mechanism in enhancing the rate of convergence. TD(λ) represents the use of eligibility traces...

نمودار تعداد نتایج جستجو در هر سال

با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید