Loss-Calibrated Monte Carlo Action Selection

Authors

  • Ehsan Abbasnejad
  • Justin Domke
  • Scott Sanner

Abstract

Bayesian decision theory underpins robust decision-making in applications ranging from plant control to robotics, where hedging action selection against state uncertainty is critical for minimizing low-probability but potentially catastrophic outcomes (e.g., uncontrollable plant conditions or robots falling into stairwells). Unfortunately, belief state distributions in such settings are often complex and/or high-dimensional, thus prohibiting the efficient application of analytical techniques for expected utility computation when real-time control is required. This leaves Monte Carlo evaluation as one of the few viable (and hence frequently used) techniques for online action selection. However, loss-insensitive Monte Carlo methods may require large numbers of samples to identify optimal actions with high certainty, since they may sample from high-probability regions that do not disambiguate action utilities. In this paper we remedy this problem by deriving an optimal proposal distribution for a loss-calibrated Monte Carlo importance sampler that bounds the regret of using an estimated optimal action. Empirically, we show that our loss-calibrated Monte Carlo method yields high-accuracy optimal action selections in a fraction of the number of samples required by conventional loss-insensitive samplers.

Introduction

Bayesian decision theory (Gelman et al. 1995; Robert 2001; Berger 2010) provides a formalization of robust decision-making in uncertain settings by maximizing expected utility. Formally, a utility function $u(\theta, a)$ quantifies the return of performing an action $a \in A = \{a_1, \ldots, a_k\}$ in a given state $\theta$. When the true state is uncertain and only a belief state distribution $p(\theta)$ is known, Bayesian decision theory posits that an optimal control action $a$ should maximize the expected utility (EU)

$$U(a) = \mathbb{E}[u(\theta, a)] = \int u(\theta, a)\, p(\theta)\, d\theta, \tag{1}$$

where, by definition, the optimal action $a^*$ is

$$a^* = \arg\max_a U(a). \tag{2}$$

In real-world settings such as robotics (Thrun 2000), the ...

Copyright © 2015, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
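As a rough illustration of Equations (1) and (2), the sketch below estimates the expected utility of each action by Monte Carlo and then picks the arg max. Everything concrete in it is an assumption made for the example and does not come from the paper: the quadratic utility, the Gaussian belief, the three candidate actions, and the helper names `utility` and `normal_logpdf`. The importance-sampling variant uses an arbitrary wider Gaussian proposal. It is not the loss-calibrated optimal proposal derived in the paper, which this excerpt does not state.

```python
import numpy as np


def utility(theta, actions):
    # Hypothetical utility u(theta, a) = -(theta - a)^2 (an assumption for
    # illustration). Returns an array of shape (n_samples, n_actions).
    return -(theta[:, None] - actions[None, :]) ** 2


def normal_logpdf(x, mean, std):
    # Log-density of a univariate Normal(mean, std^2).
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std * np.sqrt(2.0 * np.pi))


rng = np.random.default_rng(0)
actions = np.array([0.0, 1.0, 2.0])  # assumed action set {a_1, a_2, a_3}
n = 20_000

# Assumed belief state p(theta) = Normal(1.0, 0.5^2).
p_mean, p_std = 1.0, 0.5

# Plain (loss-insensitive) Monte Carlo: sample theta ~ p and average u,
# which estimates U(a) in Equation (1) for every action at once.
theta_p = rng.normal(p_mean, p_std, size=n)
u_mc = utility(theta_p, actions).mean(axis=0)
best_mc = actions[np.argmax(u_mc)]  # Equation (2)

# Importance sampling: sample theta ~ q and reweight by p(theta) / q(theta).
# The proposal q = Normal(1.0, 1.0^2) is an arbitrary choice, NOT the
# paper's loss-calibrated proposal.
q_mean, q_std = 1.0, 1.0
theta_q = rng.normal(q_mean, q_std, size=n)
weights = np.exp(
    normal_logpdf(theta_q, p_mean, p_std) - normal_logpdf(theta_q, q_mean, q_std)
)
u_is = (weights[:, None] * utility(theta_q, actions)).mean(axis=0)
best_is = actions[np.argmax(u_is)]  # Equation (2)

print("plain MC estimate of U(a):", u_mc, "-> selected action", best_mc)
print("IS estimate of U(a):      ", u_is, "-> selected action", best_is)
```

For this toy setup the expected utility has the closed form $U(a) = -\left((\mu - a)^2 + \sigma^2\right)$ with $\mu = 1$ and $\sigma = 0.5$, so both estimators can be checked against it. The proposal affects only the variance of the estimate, not what it converges to, and that variance is what a loss-calibrated proposal is meant to reduce.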

Similar resources

Loss of Load Expectation Assessment in Deregulated Power Systems Using Monte Carlo Simulation and Intelligent Systems

Deregulation policy has caused some changes in the concepts of power system reliability assessment and enhancement. In this paper, generation reliability is considered, and a method for its assessment using intelligent systems is proposed. Also, because of the stochastic behavior of the power market and of generators' forced outages, Monte Carlo simulation is used for reliability evaluation. Generation r...

A Markov Chain Monte Carlo Cellular Automata Model to Simulate Urban Growth

This paper investigates the potential of a cellular automata (CA) model based on logistic regression (logit) and Markov Chain Monte Carlo (MCMC) to simulate the dynamics of urban growth. The model assesses urbanization likelihood based on (i) a set of urban development driving forces (calibrated based on logit) and (ii) the land-use of neighboring cells (calibrated based on MCMC). An innovative...

Bayesian Bandwidth Selection in Nonparametric Time-Varying Coefficient Models

Bandwidth plays an important role in determining the performance of local linear estimators. In this paper, we propose a Bayesian approach to bandwidth selection for local linear estimation of time-varying coefficient time series models, where the errors are assumed to follow the Gaussian kernel error density. A Markov chain Monte Carlo algorithm is presented to simultaneously estimate the band...

Evolutionary Monte Carlo Methods for Clustering

The problem of clustering a group of observations according to some objective function (e.g., K-means clustering, variable selection) or a density (e.g., posterior from a Dirichlet process mixture model prior) can be cast in the framework of Monte Carlo sampling for cluster indicators. We propose a new method called the evolutionary Monte Carlo clustering (EMCC) algorithm, in which three new “...

Dosimetric analysis for the selection of radionuclides in bone pain palliation targeted therapy: A Monte Carlo simulation

Introduction: The use of beta emitters is one of the effective methods for palliation of bone metastasis. The risk of normal tissue toxicity should be evaluated in bone pain palliation treatment. Methods: In this study, the Monte Carlo simulation code MCNPX was used to simulate a bone phantom model consisting of bone marrow, bone, and soft tissue. Spe...

Publication date: 2015