On Partially Observed Stochastic Shortest Path Problems

Author

  • Stephen D. Patek
Abstract

We analyze a class of partially observed stochastic shortest path problems. These are terminating Markov decision processes with imperfect state information that evolve on an infinite time horizon and have a total cost criterion. For well-posedness, we make reasonable stochastic shortest path type assumptions: (1) the existence of a policy that guarantees termination with probability one and (2) the property that any policy that fails to guarantee termination has infinite expected cost from some initial state. We also assume that termination is perfectly recognized. We establish the existence of a stationary optimal policy along with the existence of a unique bounded solution to Bellman's equation. We also reveal the convergence properties of value and policy iteration. For the case where policies exist that do not guarantee termination, the dynamic programming operator fails to have an m-stage contraction property.
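
For orientation, Bellman's equation for a problem of this kind is commonly written over belief states. The sketch below uses generic textbook notation, not notation taken from the paper: the belief b, control set U, stage cost g, observation z, belief update \Phi, and operator T are all assumed symbols.

```latex
% Assumed notation (not from the paper):
%   b            belief (probability distribution) over the non-terminal states
%   g(i,u)       expected cost of applying control u in state i
%   z            an observation that does not signal termination
%   \Phi(b,u,z)  belief obtained from b after applying u and observing z
% Termination is perfectly recognized, so the cost-to-go after termination is 0.
\[
  (TJ)(b) \;=\; \min_{u \in U}
  \Big[ \sum_{i} b(i)\, g(i,u)
        \;+\; \sum_{z} P(z \mid b, u)\, J\big(\Phi(b,u,z)\big) \Big],
  \qquad J^{*} = T J^{*}.
\]
% Value iteration in this notation: J_{k+1} = T J_k, starting from a bounded J_0.
```

In this notation, the abstract's results say that the fixed-point equation has a unique bounded solution and that a stationary policy attaining the minimum is optimal.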

Similar resources

Spreading paths in partially observed social networks

Understanding how and how far information, behaviors, or pathogens spread in social networks is an important problem, with implications both for predicting the size of epidemics and for planning effective interventions. There are, however, two main challenges for inferring spreading paths in real-world networks. One is the practical difficulty of observing a dynamic process on a netwo...

New Grid-Based Algorithms for Partially Observable Markov Decision Processes: Theory and Practice

We present two new algorithms for Partially Observable Markov Decision Processes (POMDPs). The first algorithm is a general grid-based algorithm for POMDPs with theoretical optimality guarantees. The other algorithm is for the subclass of problems known as Stochastic Shortest-Path problems in belief space. Both algorithms are optimal and robust with respect to a novel robustness criterion that ...

Solving Stochastic Shortest-Path Problems with RTDP

We present a modification of the Real-Time Dynamic Programming (RTDP) algorithm that makes it a genuine off-line algorithm for solving Stochastic Shortest-Path problems. Also, a new domain-independent and admissible heuristic is presented for Stochastic Shortest-Path problems. The new algorithm and heuristic are compared with Value Iteration over benchmark problems with large state spaces. The r...

General Error Bounds in Heuristic Search Algorithms for Stochastic Shortest Path Problems

We consider recently-derived error bounds that can be used to bound the quality of solutions found by heuristic search algorithms for stochastic shortest path problems. In their original form, the bounds can only be used for problems with positive action costs. We show how to generalize the bounds so that they can be used in solving any stochastic shortest path problem, regardless of cost struc...

Using Stochastic-Dominance Relationships for Bounding Travel Times in Stochastic Networks

We consider stochastic networks in which link travel times are dependent, discrete random variables. We present methods for computing bounds on path travel times using stochastic dominance relationships among link travel times, and discuss techniques for controlling tightness of the bounds. We apply these methods to shortest-path problems, show that the proposed algorithm can provide bounds on ...

Publication date: 1999