A Monte Carlo AIXI Approximation

Authors

Joel Veness

Kee Siong Ng

Marcus Hutter

David Silver

Abstract

This paper introduces a principled approach for the design of a scalable general reinforcement learning agent. Our approach is based on a direct approximation of AIXI, a Bayesian optimality notion for general reinforcement learning agents. Previously, it has been unclear whether the theory of AIXI could motivate the design of practical algorithms. We answer this hitherto open question in the affirmative, by providing the first computationally feasible approximation to the AIXI agent. To develop our approximation, we introduce a new Monte-Carlo Tree Search algorithm along with an agent-specific extension to the Context Tree Weighting algorithm. Empirically, we present a set of encouraging results on a variety of stochastic and partially observable domains. We conclude by proposing a number of directions for future research.


Similar Resources

Reinforcement Learning via AIXI Approximation

This paper introduces a principled approach for the design of a scalable general reinforcement learning agent. This approach is based on a direct approximation of AIXI, a Bayesian optimality notion for general reinforcement learning agents. Previously, it has been unclear whether the theory of AIXI could motivate the design of practical algorithms. We answer this hitherto open question in the a...


A computational approximation to the AIXI model

Universal induction solves in principle the problem of choosing a prior to achieve optimal inductive inference. The AIXI theory, which combines control theory and universal induction, solves in principle the problem of optimal behavior of an intelligent agent. A practically most important and very challenging problem is to find a computationally efficient (if not optimal) approximation for the ...


A Monte Carlo AIXI Approximation

We implemented the algorithm for learning and planning in partially observable Markov decision processes described in A Monte Carlo AIXI Approximation. Because this paper is highly focused on the theoretical aspect of the AIXI approximation, some details were omitted for ease of presentation. We used the following test domains from the paper to assess the performance of our replication, • 1d-Ma...


A Monte Carlo Algorithm for Universally Optimal Bayesian Sequence Prediction and Planning

The aim of this work is to address the question of whether we can in principle design rational decision-making agents or artificial intelligences embedded in computable physics such that their decisions are optimal in reasonable mathematical senses. Recent developments in rare event probability estimation, recursive bayesian inference, neural networks, and probabilistic planning are sufficient ...


Sea Surfaces Scattering by Multi-Order Small-Slope Approximation: a Monte-Carlo and Analytical Comparison

L-band electromagnetic scattering from two-dimensional random rough sea surfaces is calculated by first- and second-order Small-Slope Approximation (SSA1, 2) methods. Both analytical and numerical computations are utilized to calculate incoherent normalized radar cross-section (NRCS) in mono- and bi-static cases. For evaluating inverse Fourier transform, inverse fast Fourier transform (IFFT) i...



Journal:

J. Artif. Intell. Res.

Volume 40

Publication date: 2011
