reinforcement learning

Hierarchical Reinforcement Learning Based Self-balancing Algorithm for Two-wheeled Robots

2016

Juan Yan Huibin Yang

Abstract: Self-balancing control is the basis for applications of two-wheeled robots. In order to improve the self-balancing of two-wheeled robots, we propose a hierarchical reinforcement learning algorithm for controlling the balance of two-wheeled robots. After describing the subgoals of hierarchical reinforcement learning, we extract features for subgoals, define a feature value vector and it...


Performance of distributed multi-agent multi-state reinforcement spectrum management using different exploration schemes

Journal: Expert Syst. Appl., 2013

Albert Hung-Ren Ko Robert Sabourin François Gagnon

DOI: http://dx.doi.org/10.1016/j.eswa.2013.01.035. This paper introduces a novel multi-agent multi-state reinforcement learning exploration scheme for dynamic spectrum access and dynamic spectrum sharing ...


Path-Tracking Control of a Non-Holonomic Car-Like Robot with Reinforcement Learning

1999

Jacky Baltes Yuming Lin

The problem investigated in this paper is that of driving a car-like robot along a race track and the use of reinforcement learning to find a good control function. The reinforcement learner uses a case-based function approximator to extend the reinforcement learning paradigm to handle continuous states. The learned controller performs similarly to the best control functions in both simulation an...


Learning Pessimism for Reinforcement Learning

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence, 2023

Off-policy deep reinforcement learning algorithms commonly compensate for overestimation bias during temporal-difference learning by utilizing pessimistic estimates of the expected target returns. In this work, we propose Generalized Pessimism Learning (GPL), a strategy employing a novel learnable penalty to enact such pessimism. In particular, we propose to learn this penalty alongside the critic with dual TD-learning, a new procedure esti...
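
The idea of a pessimistic target return can be illustrated with a minimal sketch. This is not GPL itself: GPL learns its penalty, whereas the function below uses a fixed, hypothetical coefficient `beta` and a simple spread between critic estimates as the penalty term. All names and default values here are assumptions made for illustration.

```python
def pessimistic_td_target(reward, next_q_estimates, gamma=0.99, beta=0.5, done=False):
    """Return a TD target that penalizes disagreement between critics.

    Assumption: `beta` is a fixed hand-picked weight, not a learned penalty.
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    mean_q = sum(next_q_estimates) / len(next_q_estimates)
    # Spread between the most and least optimistic critic.
    spread = max(next_q_estimates) - min(next_q_estimates)
    return reward + gamma * (mean_q - beta * spread)


# Two critics disagree about the next state's value.
target = pessimistic_td_target(1.0, [2.0, 4.0], gamma=0.5)
```

With two critics and `beta=0.5`, the penalized value equals the smaller of the two estimates, so this sketch reduces to the familiar minimum-of-two-critics target in that case.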


Schemes for learning and behaviour : a new expectancy model

1997

Christopher Mark Witkowski

This thesis presents a novel form of learning by reinforcement. Existing reinforcement learning algorithms rely on the provision of external reward signals to drive the learning algorithm. This new algorithm relies on reinforcing signals generated internally within the algorithm. The algorithm, SRS/E, described here generates expectancies (μ-hypotheses), each of which gives rise to a specific p...


An ART-based fuzzy adaptive learning control network

Journal: IEEE Transactions on Neural Networks, 1996

Cheng-Jian Lin Chin-Teng Lin

This paper proposes a reinforcement fuzzy adaptive learning control network (RFALCON), constructed by integrating two fuzzy adaptive learning control networks (FALCON), each of which has a feedforward multilayer network and is developed for the realization of a fuzzy controller. One FALCON performs as a critic network (fuzzy predictor), the other as an action network (fuzzy controller). Using t...


Reinforcement Learning with Policy Constraints

2007

Sebastian Thrun Jamieson E. Schulte

This paper addresses the problem of knowledge transfer in lifelong reinforcement learning. It proposes an algorithm which learns policy constraints, i.e., rules that characterize action selection in entire families of reinforcement learning tasks. Once learned, policy constraints are used to bias learning in future, similar reinforcement learning tasks. The appropriateness of the algorithm is d...


Learning to be a Bot: Reinforcement Learning in Shooter Games

2008

Michelle McPartland Marcus Gallagher

This paper demonstrates the applicability of reinforcement learning for first person shooter bot artificial intelligence. Reinforcement learning is a machine learning technique where an agent learns a problem through interaction with the environment. The Sarsa(λ) algorithm will be applied to a first person shooter bot controller to learn the tasks of (1) navigation and item collection, and (2) ...
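
For readers unfamiliar with Sarsa with eligibility traces, a single tabular update step might look like the sketch below. It is a generic textbook-style version with accumulating traces, not the controller from the paper; the state and action names, the step size `alpha`, the discount `gamma`, and the trace decay `lam` are all illustrative assumptions.

```python
from collections import defaultdict


def sarsa_lambda_update(q, traces, state, action, reward, next_state, next_action,
                        alpha=0.1, gamma=0.9, lam=0.8, done=False):
    """Apply one tabular Sarsa update with accumulating eligibility traces."""
    # Bootstrap from the action actually chosen next (on-policy).
    next_q = 0.0 if done else q[(next_state, next_action)]
    td_error = reward + gamma * next_q - q[(state, action)]
    # Mark the visited state-action pair as eligible for credit.
    traces[(state, action)] += 1.0
    for key in list(traces):
        # Every eligible pair shares the TD error, then its trace decays.
        q[key] += alpha * td_error * traces[key]
        traces[key] *= gamma * lam
    return td_error


# Hypothetical states and actions, for illustration only.
q = defaultdict(float)
traces = defaultdict(float)
delta = sarsa_lambda_update(q, traces, "s0", "forward", 1.0, "s1", "forward")
```

In an episodic task the traces would normally be reset to zero at the start of each episode.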


Dynamic Control Algorithm for Biped Walking Based on Policy Gradient Fuzzy Reinforcement Learning

2008

Duško M. Katić Aleksandar D. Rodić Mihailo Pupin

This paper presents a novel dynamic control approach to acquire biped walking of humanoid robots focussed on policy gradient reinforcement learning with fuzzy evaluative feedback. The proposed structure of controller involves two feedback loops: conventional computed torque controller including impact-force controller and reinforcement learning computed torque controller. Reinforcement learnin...


Reinforcement learning under circumstances beyond its control

2003

Chris Gaskett

Decision theory addresses the task of choosing an action; it provides robust decision-making criteria that support decision-making under conditions of uncertainty or risk. Decision theory has been applied to produce reinforcement learning algorithms that manage uncertainty in state-transitions. However, performance when there is uncertainty regarding the selection of future actions must also be...
