A Monte Carlo AIXI Approximation
Abstract
This paper introduces a principled approach for the design of a scalable general reinforcement learning agent. Our approach is based on a direct approximation of AIXI, a Bayesian optimality notion for general reinforcement learning agents. Previously, it has been unclear whether the theory of AIXI could motivate the design of practical algorithms. We answer this hitherto open question in the affirmative, by providing the first computationally feasible approximation to the AIXI agent. To develop our approximation, we introduce a new Monte-Carlo Tree Search algorithm along with an agent-specific extension to the Context Tree Weighting algorithm. Empirically, we present a set of encouraging results on a variety of stochastic and partially observable domains. We conclude by proposing a number of directions for future research.
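The abstract names Context Tree Weighting without describing it. As a minimal sketch only, the Krichevsky-Trofimov (KT) estimator below is the binary predictor that standard Context Tree Weighting is built on: after seeing `zeros` zeros and `ones` ones, it assigns the next bit probability (count + 1/2) / (zeros + ones + 1). This is not the agent-specific extension the paper introduces, and the function names `kt_next_prob` and `kt_sequence_prob` are hypothetical, chosen for illustration.

```python
from fractions import Fraction


def kt_next_prob(zeros: int, ones: int, bit: int) -> Fraction:
    """KT probability that the next bit equals `bit`, given the counts so far."""
    count = ones if bit == 1 else zeros
    return (Fraction(count) + Fraction(1, 2)) / (zeros + ones + 1)


def kt_sequence_prob(bits) -> Fraction:
    """Probability the KT estimator assigns to a whole binary sequence."""
    zeros = ones = 0
    prob = Fraction(1)
    for bit in bits:
        # Multiply in the conditional probability, then update the counts.
        prob *= kt_next_prob(zeros, ones, bit)
        if bit == 1:
            ones += 1
        else:
            zeros += 1
    return prob
```

Exact fractions are used so that the probabilities over all sequences of a given length can be checked to sum to one; a practical implementation would more likely work with floating-point log-probabilities.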
Similar resources
Reinforcement Learning via AIXI Approximation
This paper introduces a principled approach for the design of a scalable general reinforcement learning agent. This approach is based on a direct approximation of AIXI, a Bayesian optimality notion for general reinforcement learning agents. Previously, it has been unclear whether the theory of AIXI could motivate the design of practical algorithms. We answer this hitherto open question in the a...
A computational approximation to the AIXI model
Universal induction solves in principle the problem of choosing a prior to achieve optimal inductive inference. The AIXI theory, which combines control theory and universal induction, solves in principle the problem of optimal behavior of an intelligent agent. A practically most important and very challenging problem is to find a computationally efficient (if not optimal) approximation for the ...
A Monte Carlo AIXI Approximation
We implemented the algorithm for learning and planning in partially observable Markov decision processes described in A Monte Carlo AIXI Approximation. Because this paper is highly focused on the theoretical aspect of the AIXI approximation, some details were omitted for ease of presentation. We used the following test domains from the paper to assess the performance of our replication, • 1d-Ma...
A Monte Carlo Algorithm for Universally Optimal Bayesian Sequence Prediction and Planning
The aim of this work is to address the question of whether we can in principle design rational decision-making agents or artificial intelligences embedded in computable physics such that their decisions are optimal in reasonable mathematical senses. Recent developments in rare event probability estimation, recursive Bayesian inference, neural networks, and probabilistic planning are sufficient ...
Sea Surfaces Scattering by Multi-Order Small-Slope Approximation: a Monte-Carlo and Analytical Comparison
L-band electromagnetic scattering from two-dimensional random rough sea surfaces is calculated by first- and second-order Small-Slope Approximation (SSA1, 2) methods. Both analytical and numerical computations are utilized to calculate incoherent normalized radar cross-section (NRCS) in mono- and bi-static cases. For evaluating inverse Fourier transform, inverse fast Fourier transform (IFFT) i...
Journal:
- J. Artif. Intell. Res.
Volume: 40
Publication date: 2011