Search results for: limitedunlimited partially gated

Number of results: 159378

2013
Jilles Steeve Dibangoye, Christopher Amato, Arnaud Doniec, François Charpillet

There has been substantial progress on algorithms for single-agent sequential decision making using partially observable Markov decision processes (POMDPs). A number of efficient algorithms for solving POMDPs share two desirable properties: error-bounds and fast convergence rates. Despite significant efforts, no algorithms for solving decentralized POMDPs benefit from these properties, leading ...

2004
Eric A. Hansen, Daniel S. Bernstein, Shlomo Zilberstein

We develop an exact dynamic programming algorithm for partially observable stochastic games (POSGs). The algorithm is a synthesis of dynamic programming for partially observable Markov decision processes (POMDPs) and iterative elimination of dominated strategies in normal form games. We prove that it iteratively eliminates very weakly dominated strategies without first forming the normal form r...
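As a rough illustration of the pruning step this abstract refers to, the sketch below keeps only policy-tree value vectors that are not pointwise dominated. The full algorithm uses a linear-program test for very weak dominance; this is a simplified special case, and the function and array names are assumptions, not the paper's notation.

```python
import numpy as np

def prune_pointwise_dominated(value_vectors):
    """Keep indices of policies whose value vectors are not pointwise dominated.

    value_vectors: array of shape (num_policies, num_states), one expected-value
    vector per candidate policy tree. Pointwise dominance is a simplified special
    case of the LP-based very-weak-dominance test, so this prunes a subset of
    what the full test would prune.
    """
    n = len(value_vectors)
    kept = []
    for i in range(n):
        vi = value_vectors[i]
        dominated = False
        for j in range(n):
            if j == i:
                continue
            vj = value_vectors[j]
            # vj dominates vi if it is at least as good in every state; ties are
            # broken by index so identical duplicates are not all discarded.
            if np.all(vj >= vi) and (np.any(vj > vi) or j < i):
                dominated = True
                break
        if not dominated:
            kept.append(i)
    return kept
```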

2005
Jennifer Boger, Pascal Poupart, Jesse Hoey, Craig Boutilier, Geoff Fernie, Alex Mihailidis

Cognitive assistive technologies that aid people with dementia (such as Alzheimer’s disease) hold the promise to provide such people with an increased level of independence. However, to realize this promise, such systems must account for the specific needs and preferences of individuals. We argue that this form of customization requires a sequential, decision-theoretic model of interaction. We ...

2001
Emmanuel Fernández-Gaucherand, Aristotle Arapostathis, Steven I. Marcus

We consider partially observable Markov decision processes with finite or countably infinite (core) state and observation spaces and finite action set. Following a standard approach, an equivalent completely observed problem is formulated, with the same finite action set but with an uncountable state space, namely the space of probability distributions on the original core state space. By devel...
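The reformulation sketched in this abstract rests on the standard Bayes-filter update that turns a POMDP into a fully observed problem over probability distributions (belief states). A minimal version for the finite-core-state case is shown below; the array names T and Z are assumptions, not the paper's notation.

```python
import numpy as np

def belief_update(b, a, o, T, Z):
    """One Bayes-filter step of the belief (information state) over core states.

    b : current distribution over core states, shape (S,)
    a : action index
    o : observation index
    T : assumed transition array, T[a, s, s_next] = P(s_next | s, a), shape (A, S, S)
    Z : assumed observation array, Z[a, s_next, o] = P(o | s_next, a), shape (A, S, O)
    """
    predicted = b @ T[a]                  # predictive distribution over next core states
    unnormalized = predicted * Z[a, :, o] # weight by likelihood of the observation
    return unnormalized / unnormalized.sum()
```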

Journal: Journal of Machine Learning Research, 2011
Stéphane Ross, Joelle Pineau, Brahim Chaib-draa, Pierre Kreitmann

Bayesian learning methods have recently been shown to provide an elegant solution to the exploration-exploitation trade-off in reinforcement learning. However, most investigations of Bayesian reinforcement learning to date focus on standard Markov Decision Processes (MDPs). The primary focus of this paper is to extend these ideas to the case of partially observable domains, by introducing th...
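A common ingredient of Bayes-Adaptive approaches in this vein is to maintain Dirichlet pseudo-counts over the unknown model parameters alongside the belief over states. The sketch below shows that bookkeeping for transition probabilities only; it illustrates the general recipe, not the paper's exact model, and all names are assumptions.

```python
import numpy as np

class DirichletTransitionModel:
    """Posterior over unknown transition probabilities via Dirichlet counts.

    counts[a, s, s_next] starts at a prior pseudo-count and is incremented each
    time the transition (s, a, s_next) is observed; in a partially observable
    setting the increment would instead be weighted by the belief over (s, s_next).
    """
    def __init__(self, num_states, num_actions, prior=1.0):
        self.counts = np.full((num_actions, num_states, num_states), prior)

    def update(self, s, a, s_next, weight=1.0):
        self.counts[a, s, s_next] += weight

    def mean(self, a):
        """Posterior-mean transition matrix for action a."""
        c = self.counts[a]
        return c / c.sum(axis=1, keepdims=True)
```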

2011
Linus Gisslén, Matthew D. Luciw, Vincent Graziano, Jürgen Schmidhuber

Traditional Reinforcement Learning methods are insufficient for AGIs who must be able to learn to deal with Partially Observable Markov Decision Processes. We investigate a novel method for dealing with this problem: standard RL techniques using as input the hidden layer output of a Sequential Constant-Size Compressor (SCSC). The SCSC takes the form of a sequential Recurrent Auto-Associative Me...

Journal: Discrete Event Dynamic Systems, 2009
Edwin K. P. Chong, Christopher M. Kreucher, Alfred O. Hero

Adaptive sensing involves actively managing sensor resources to achieve a sensing task, such as object detection, classification, and tracking, and represents a promising direction for new applications of discrete event system methods. We describe an approach to adaptive sensing based on approximately solving a partially observable Markov decision process (POMDP) formulation of the problem. Suc...
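One standard way to approximately solve such a POMDP formulation is the QMDP heuristic, which scores each sensing action by averaging the underlying fully observed MDP's Q-values under the current belief. A minimal sketch is below; the paper's own approximation may differ (for example, rollout-based methods), and the names are assumptions.

```python
import numpy as np

def qmdp_action(belief, Q_mdp):
    """Select a sensing action by the QMDP approximation.

    belief : distribution over hidden states, shape (S,)
    Q_mdp  : Q-values of the fully observed underlying MDP, shape (S, A)
    Returns the index of the action maximizing the belief-averaged Q-value.
    """
    return int(np.argmax(belief @ Q_mdp))
```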

2017
James Marchant, Nathan Griffiths

In multi-agent systems it is often desirable for agents to adhere to standards of behaviour that minimise clashes and wasting of (limited) resources. In situations where it is not possible or desirable to dictate these standards globally or via centralised control, convention emergence offers a lightweight and rapid alternative. Placing fixed strategy agents within a population, whose interacti...
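For intuition only, the toy simulation below places a few fixed-strategy agents in a population playing a pairwise coordination game and reports how widely the seeded convention spreads. It is a hypothetical setup, not the interaction or learning model studied in the paper.

```python
import random

def simulate_convention(num_agents=100, num_fixed=5, strategies=("A", "B"),
                        rounds=5000, seed=0):
    """Toy pairwise coordination game with a seeded fixed-strategy minority.

    Non-fixed agents copy their partner's strategy after a miscoordination;
    fixed agents always play strategies[0]. Returns the final fraction of
    agents playing the seeded strategy.
    """
    rng = random.Random(seed)
    fixed = set(range(num_fixed))
    state = [strategies[0] if i in fixed else rng.choice(strategies)
             for i in range(num_agents)]
    for _ in range(rounds):
        i, j = rng.sample(range(num_agents), 2)
        if state[i] != state[j]:      # miscoordination: a non-fixed agent adapts
            if i not in fixed:
                state[i] = state[j]
            elif j not in fixed:
                state[j] = state[i]
    return state.count(strategies[0]) / num_agents
```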

Journal: Manufacturing & Service Operations Management, 2012
Jingyu Zhang, Brian T. Denton, Hari Balasubramanian, Nilay D. Shah, Brant A. Inman

Prostate cancer is the most common solid tumor in American men and is screened for using prostate-specific antigen (PSA) tests. We report on a non-stationary partially observable Markov decision process (POMDP) for prostate biopsy referral decisions. The core states are the patients’ prostate cancer related health states, and PSA test results are the observations. Transition probabilities and r...
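A minimal sketch of how such a non-stationary POMDP might be laid out in code is shown below. The state, observation, and action labels, the horizon, and all probabilities are placeholders for illustration, not the values estimated in the paper.

```python
import numpy as np

# Hypothetical labels; the paper's actual health states, PSA categories,
# horizon, and estimated probabilities are not reproduced here.
STATES = ["no_cancer", "organ_confined_cancer", "advanced_cancer"]
OBSERVATIONS = ["psa_low", "psa_elevated", "psa_high"]
ACTIONS = ["defer", "refer_for_biopsy"]

NUM_EPOCHS = 30  # e.g. one screening decision per year of age (assumed)

# Non-stationary model: one transition matrix per decision epoch and action.
# Uniform placeholder values stand in for age-specific estimates.
T = np.full((NUM_EPOCHS, len(ACTIONS), len(STATES), len(STATES)),
            1.0 / len(STATES))
# Observation model: P(PSA category | health state), placeholder values.
Z = np.full((len(STATES), len(OBSERVATIONS)), 1.0 / len(OBSERVATIONS))
```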

2008
Jason Williams

This is a demonstration of a voice dialer, implemented as a partially observable Markov decision process (POMDP). A real-time graphical display shows the POMDP’s probability distribution over different possible dialog states, and shows how system output is generated and selected. The system demonstrated here includes several recent advances, including an action selection mechanism which unifies ...

[Chart: number of search results per publication year]