
Learning policies for partially observable environments: Scaling up

1995

Michael L. Littman, Anthony R. Cassandra, Leslie Pack Kaelbling

Partially observable Markov decision processes (POMDPs) model decision problems in which an agent tries to maximize its reward in the face of limited and/or noisy sensor feedback. While the study of POMDPs is motivated by a need to address realistic problems, existing techniques for finding optimal behavior do not appear to scale well and have been unable to find satisfactory policies for proble...


History-Based Controller Design and Optimization for Partially Observable MDPs

2015

Akshat Kumar, Shlomo Zilberstein

Partially observable MDPs provide an elegant framework for sequential decision making. Finite-state controllers (FSCs) are often used to represent policies for infinite-horizon problems as they offer a compact representation, simple-to-execute plans, and adjustable tradeoff between computational complexity and policy size. We develop novel connections between optimizing FSCs for POMDPs and the d...


A Hierarchy of Equivalence Relations for Partially Observable Markov Decision Processes

2007

Monica Dinculescu

We discuss the problem of comparing the behavioural equivalence of partially observable systems with observations. We examine different types of equivalence relations on states, and show that branching equivalence relations are stronger than linear ones. Finally, we discuss how this hierarchy can be used in duality theory.


A reinforcement learning scheme for a multi-agent card game

2003

Hajime Fujita, Yoichiro Matsuno, Shin Ishii

We formulate an automatic strategy acquisition problem for the multi-agent card game “Hearts” as a reinforcement learning (RL) problem. Since there are often a lot of unobservable cards in this game, RL is approximately dealt with in the framework of a partially observable Markov decision process (POMDP). This article presents a POMDP-RL method based on estimation of unobservable state variable...


Region-Based Incremental Pruning for POMDPs

2004

Zhengzhu Feng, Shlomo Zilberstein

We present a major improvement to the incremental pruning algorithm for solving partially observable Markov decision processes. Our technique targets the cross-sum step of the dynamic programming (DP) update, a key source of complexity in POMDP algorithms. Instead of reasoning about the whole belief space when pruning the cross-sums, our algorithm divides the belief space into smaller regions a...


RAPID: A Reachable Anytime Planner for Imprecisely-sensed Domains

2010

Emma Brunskill, Stuart J. Russell

Despite the intractability of generic optimal partially observable Markov decision process planning, there exist important problems that have highly structured models. Previous researchers have used this insight to construct more efficient algorithms for factored domains, and for domains with topological structure in the flat state dynamics model. In our work, motivated by findings from the edu...


Speeding up Online POMDP Planning - Unification of Observation Branches by Belief-state Compression Via Expected Feature Values

2015

Gavin Rens

A novel algorithm to speed up online planning in partially observable Markov decision processes (POMDPs) is introduced. I propose a method for compressing nodes in belief-decision-trees while planning occurs. Whereas belief-decision-trees branch on actions and observations, with my method, they branch only on actions. This is achieved by unifying the branches required due to the nondeterminism o...


Dynamic Decision Making in Stochastic Partially Observable Medical Domains: Ischemic Heart Disease Example

2004

Milos Hauskrecht

The focus of this paper is the framework of partially observable Markov decision processes (POMDPs) and its role in modeling and solving complex dynamic decision problems in stochastic and partially observable medical domains. The paper summarizes some of the basic features of the POMDP framework and explores its potential in solving the problem of the management of the patient with chronic isc...


Scalable POMDPs for Diagnosis and Planning in Intelligent Tutoring Systems

2010

Jeremiah T. Folsom-Kovarik, Gita Reese Sukthankar, Sae Lynne Schatz, Denise M. Nicholson

A promising application area for proactive assistant agents is automated tutoring and training. Intelligent tutoring systems (ITSs) assist tutors and tutees by automating diagnosis and adaptive tutoring. These tasks are well modeled by a partially observable Markov decision process (POMDP) since it accounts for the uncertainty inherent in diagnosis. However, an important aspect of making POMDP ...


An Improved Grid-Based Approximation Algorithm for POMDPs

2001

Rong Zhou, Eric A. Hansen

Although a partially observable Markov decision process (POMDP) provides an appealing model for problems of planning under uncertainty, exact algorithms for POMDPs are intractable. This motivates work on approximation algorithms, and grid-based approximation is a widely-used approach. We describe a novel approach to grid-based approximation that uses a variable-resolution regular grid, and show...
