A Method for Speeding Up Value Iteration in Partially Observable Markov Decision Processes

Authors

  • Nevin Lianwen Zhang
  • Stephen S. Lee
  • Weihong Zhang
Abstract

We present a technique for speeding up the convergence of value iteration for partially observable Markov decision processes (POMDPs). The underlying idea is similar to that behind modified policy iteration for fully observable Markov decision processes (MDPs). The technique can be easily incorporated into any existing POMDP value iteration algorithm. Experiments have been conducted on several test problems with one POMDP value iteration algorithm called incremental pruning. We find that the technique can make incremental pruning run several orders of magnitude faster.
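The abstract draws an analogy to modified policy iteration for fully observable MDPs, where the expensive Bellman optimality backup is interleaved with several cheap policy-evaluation sweeps. A minimal sketch of that MDP-side idea (not the paper's POMDP algorithm; the tiny 2-state MDP below is invented for illustration):

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP used only for illustration.
# P[a, s, s'] = transition probability, R[a, s] = immediate reward.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.4, 0.6]]])
R = np.array([[1.0, 0.0],
              [0.5, 0.8]])
gamma = 0.95
n_states = 2

def modified_policy_iteration(k=10, tol=1e-8, max_iter=1000):
    """Value iteration with k extra evaluation sweeps per full backup.

    k = 0 recovers plain value iteration; k > 0 is the modified
    policy iteration scheme the abstract refers to.
    """
    V = np.zeros(n_states)
    idx = np.arange(n_states)
    for _ in range(max_iter):
        # Expensive step: one full Bellman optimality backup.
        Q = R + gamma * (P @ V)          # Q[a, s]
        policy = Q.argmax(axis=0)        # greedy policy
        V_new = Q.max(axis=0)
        if np.abs(V_new - V).max() < tol:
            return V_new, policy
        V = V_new
        # Cheap steps: k evaluation sweeps under the fixed greedy policy.
        for _ in range(k):
            V = R[policy, idx] + gamma * (P[policy, idx] @ V)
    return V, policy
```

The extra sweeps propagate value information many steps ahead at the cost of a fixed-policy backup, which is why fewer of the expensive optimality backups are needed before convergence.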


Similar Articles

Speeding Up the Convergence of Value Iteration in Partially Observable Markov Decision Processes

Partially observable Markov decision processes (POMDPs) have recently become popular among many AI researchers because they serve as a natural model for planning under uncertainty. Value iteration is a well-known algorithm for finding optimal policies for POMDPs. It typically takes a large number of iterations to converge. This paper proposes a method for accelerating the convergence of value ite...

Full text


Propagating Uncertainty in POMDP Value Iteration with Gaussian Processes

In this paper, we describe the general approach of trying to solve Partially Observable Markov Decision Processes with approximate value iteration. Methods based on this approach have shown promise for tackling larger problems where exact methods are doomed, but we explain how most of them suffer from the fundamental problem of ignoring information about the uncertainty of their estimates. We t...

Full text

Solving Informative Partially Observable Markov Decision Processes

Solving Partially Observable Markov Decision Processes (POMDPs) generally is computationally intractable. In this paper, we study a special POMDP class, namely informative POMDPs, where each observation provides good albeit incomplete information about world states. We propose two ways to accelerate the value iteration algorithm for such POMDPs. First, dynamic programming (DP) updates can be carrie...

Full text

Speeding up Online POMDP Planning - Unification of Observation Branches by Belief-state Compression Via Expected Feature Values

A novel algorithm to speed up online planning in partially observable Markov decision processes (POMDPs) is introduced. I propose a method for compressing nodes in belief-decision-trees while planning occurs. Whereas belief-decision-trees branch on actions and observations, with my method, they branch only on actions. This is achieved by unifying the branches required due to the nondeterminism o...

Full text



Publication date: 1999