Expectation-Maximization for Inverse Reinforcement Learning with Hidden Data
Authors
Abstract
We consider the problem of performing inverse reinforcement learning when the trajectory of the agent being observed is partially occluded from view. Motivated by robotic scenarios in which limited sensor data is available to a learner, we treat the missing information as hidden variables and present an algorithm based on expectation-maximization to solve the non-linear, non-convex problem. Previous work in this area simply removed the occluded portions from consideration when computing feature expectations; in contrast, our technique takes expectations over the missing values, enabling learning even in the presence of dynamic occlusion. We evaluate our new algorithm in a simulated reconnaissance scenario in which the visible portion of the state space varies. Finally, we show that our approach enables apprenticeship learning by observing a human performing a sorting task in spite of key information missing from the observations.
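The core idea, taking expectations over occluded observations instead of discarding them, can be sketched in a small toy example. This is a hypothetical illustration only (one-hot state features, a Boltzmann distribution standing in for the true posterior over hidden states), not the paper's actual algorithm:

```python
import math

# Hypothetical toy: 3 states with one-hot features. A trajectory may
# contain occluded steps (None). Rather than dropping those steps, the
# E-step takes an expectation over the hidden state; the M-step nudges
# reward weights so model feature counts match the empirical target.

N_STATES = 3

def state_posterior(weights):
    """Boltzmann distribution over states under the current reward
    weights (a stand-in for the true posterior given the dynamics)."""
    exps = [math.exp(w) for w in weights]
    z = sum(exps)
    return [e / z for e in exps]

def expected_feature_counts(trajectory, weights):
    """E-step: expected one-hot feature counts, marginalizing over
    occluded observations instead of discarding them."""
    counts = [0.0] * N_STATES
    post = state_posterior(weights)
    for s in trajectory:
        if s is None:                      # occluded step
            for i in range(N_STATES):
                counts[i] += post[i]       # expectation over hidden state
        else:
            counts[s] += 1.0
    return counts

def em_irl(trajectory, iters=200, lr=0.1):
    """Alternate the E-step above with a gradient M-step that matches
    empirical feature counts (scaled up from the visible portion)."""
    weights = [0.0] * N_STATES
    visible = [s for s in trajectory if s is not None]
    target = [visible.count(i) * len(trajectory) / len(visible)
              for i in range(N_STATES)]
    for _ in range(iters):
        model = expected_feature_counts(trajectory, weights)
        for i in range(N_STATES):
            weights[i] += lr * (target[i] - model[i])
    return weights

traj = [0, None, 0, 1, None, 0]   # two occluded steps
w = em_irl(traj)
```

After training, the weight for the frequently visited state 0 exceeds that of the never-observed state 2, reflecting how the expectation over hidden steps fills in the occluded evidence.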
Similar Resources
Inverse Reinforcement Learning Under Noisy Observations (Extended Abstract)
We consider the problem of performing inverse reinforcement learning when the trajectory of the expert is not perfectly observed by the learner. Instead, noisy observations of the trajectory are available. We generalize the previous method of expectation-maximization for inverse reinforcement learning, which allows the trajectory of the expert to be partially hidden from the learner, to incorpo...
Expectation Maximization for Weakly Labeled Data
We call data weakly labeled if it has no exact label but rather a numerical indication of the correctness of the label "guessed" by the learning algorithm, a situation commonly encountered in problems of reinforcement learning. The term emphasizes similarities of our approach to the known techniques for solving unsupervised and transductive problems. In this paper we present an on-line algorithm that...
Inverse Reinforcement Learning Under Noisy Observations
We consider the problem of performing inverse reinforcement learning when the trajectory of the expert is not perfectly observed by the learner. Instead, a noisy continuoustime observation of the trajectory is provided to the learner. This problem exhibits wide-ranging applications and the specific application we consider here is the scenario in which the learner seeks to penetrate a perimeter ...
Scaling Expectation-Maximization for Inverse Reinforcement Learning to Multiple Robots under Occlusion
We consider inverse reinforcement learning (IRL) when portions of the expert’s trajectory are occluded from the learner. For example, two experts performing tasks in close proximity may block each other from the learner’s view or the learner is a robot observing mobile robots from a fixed position with limited sensor range. Previous methods mitigate this challenge by either focusing on the obse...
Inverse Reinforcement Learning with Locally Consistent Reward Functions
Existing inverse reinforcement learning (IRL) algorithms have assumed each expert’s demonstrated trajectory to be produced by only a single reward function. This paper presents a novel generalization of the IRL problem that allows each trajectory to be generated by multiple locally consistent reward functions, hence catering to more realistic and complex experts’ behaviors. Solving our generali...
Journal:
Volume Issue
Pages -
Publication date: 2016