Apprenticeship learning with few examples

Authors

  • Abdeslam Boularias
  • Brahim Chaib-draa
Abstract

We consider the problem of imitation learning when the examples, provided by a human expert, are scarce. Apprenticeship Learning via Inverse Reinforcement Learning provides an efficient tool for generalizing the examples, based on the assumption that the expert’s policy maximizes a value function that is a linear combination of state and action features. Most apprenticeship learning algorithms use only simple empirical averages of the features in the demonstrations as statistics of the expert’s policy. However, this method is efficient only when the number of examples is large enough to cover most of the states, or when the dynamics of the system are nearly deterministic. In this article, we show that the quality of the learned policies is sensitive to the error in estimating the averages of the features when the dynamics of the system are stochastic. To reduce this error, we introduce two new approaches for bootstrapping the demonstrations by assuming that the expert is near-optimal and the dynamics of the system are known. In the first approach, the expert’s examples are used to learn a reward function and to generate further examples from the corresponding optimal policy. The second approach uses a transfer technique, known as graph homomorphism, to generalize the expert’s actions to unvisited regions of the state space. Empirical results on simulated robot navigation problems show that our approach is able to learn sufficiently good policies from only a small number of examples.
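The "simple empirical averages of the features" mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the feature map `phi`, the discount factor `gamma`, and the toy trajectories are all assumptions made for illustration, and discounting is the usual convention in apprenticeship learning rather than something the abstract states.

```python
import numpy as np

def empirical_feature_expectations(trajectories, phi, gamma=0.95):
    """Average discounted feature counts over demonstrated trajectories.

    trajectories: list of trajectories, each a list of (state, action) pairs.
    phi: hypothetical feature map, (state, action) -> 1-D numpy array.
    gamma: discount factor (an assumption; not given in the abstract).
    """
    if not trajectories:
        raise ValueError("need at least one trajectory")
    totals = None
    for traj in trajectories:
        # Discounted sum of feature vectors along one demonstration.
        counts = sum((gamma ** t) * phi(s, a) for t, (s, a) in enumerate(traj))
        totals = counts if totals is None else totals + counts
    # Averaging over few trajectories is the noisy estimate the abstract discusses.
    return totals / len(trajectories)

# Toy usage: three states, one-hot state features, two short demonstrations.
phi = lambda s, a: np.eye(3)[s]
demos = [[(0, 0), (1, 0)], [(0, 0), (2, 0)]]
mu_hat = empirical_feature_expectations(demos, phi, gamma=0.5)
```

With only two demonstrations and a stochastic transition out of state 0, the estimate splits its mass between states 1 and 2 according to whichever outcomes happened to be observed, which is the kind of estimation error the abstract says degrades the learned policy.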

Similar resources

Efficient Apprenticeship Learning with Smart Humans

This report describes a generalized apprenticeship learning protocol for reinforcement-learning agents with access to a teacher. The teacher interacts with the agent by providing policy traces (transition and reward observations). We characterize sufficient conditions of the underlying models for efficient apprenticeship learning and link these criteria to two established learnability classes (K...

Training an Agent Through Demonstration: A Plausible Version Space Approach

This paper presents an efficient approach to training an agent to perform a complex task through demonstration, explanation and supervision. This approach is based on an integration of techniques of multistrategy and apprenticeship learning, knowledge elicitation and programming by demonstration, in a plausible version space framework, and is implemented in Agent-Disciple. Agent-Disciple addres...

Knowledge Base Refinement Using Apprenticeship Learning Techniques

This paper describes how apprenticeship learning techniques can be used to refine the knowledge base of an expert system for heuristic classification problems. The described method is an alternative to the long-standing practice of creating such knowledge bases via induction from examples. The form of apprenticeship learning discussed in this paper is a form of learning by watching, in which le...

Cognitive Apprenticeship in Educational Practice: Research on Scaffolding, Modeling, Mentoring, and Coaching as Instructional Strategies

Apprenticeship is an inherently social learning method with a long history of helping novices become experts in fields as diverse as midwifery, construction, and law. At the center of apprenticeship is the concept of more experienced people assisting less experienced ones, providing structure and examples to support the attainment of goals. Traditionally apprenticeship has been associated with ...

The Use of Apprenticeship Learning Via Inverse Reinforcement Learning for Generating Melodies

The research presented in this paper uses apprenticeship learning via inverse reinforcement learning to ascertain a reward function in a musical context. The learning agent then used this reward function to generate new melodies using reinforcement learning. Reinforcement learning is a type of unsupervised machine learning where rewards are used to guide an agent’s learning. These rewards are u...

Journal:
  • Neurocomputing

Volume 104

Publication date: 2013