partially non

Hybrid POMDP Algorithms

2006

Sébastien Paquet Brahim Chaib-draa Stéphane Ross

When an agent evolves in a partially observable environment, it has to deal with uncertainties when choosing its actions. An efficient model for such environments is to use partially observable Markov decision processes (POMDPs). Many algorithms have been developed for POMDPs. Some use an offline approach, learning a complete policy before the execution. Others use an online approach, construct...

متن کامل

A Survey of POMDP Applications

2003

Anthony R. Cassandra

An increasing number of researchers in many areas are becoming interested in the application of the partially observable Markov decision process (pomdp) model to problems with hidden state. This model can account for both state transition and observation uncertainty. The majority of recent research interest in the pomdp model has been in the artificial intelligence community and as such, has be...

متن کامل

Learning Policies for Partially Observable Environments: Scaling Up

1995

Michael L. Littman Anthony R. Cassandra Leslie Pack Kaelbling

Partially observable Markov decision processes (pomdp's) model decision problems in which an agent tries to maximize its reward in the face of limited and/or noisy sensor feedback. While the study of pomdp's is motivated by a need to address realistic problems, existing techniques for nding optimal behavior do not appear to scale well and have been unable to nd satisfactory policies for problem...

متن کامل

Planning under the Uncertainty of the Technical Analysis of Stock Markets

2010

Augusto Cesar Espíndola Baffa Angelo E. M. Ciarlini

The stock market can be considered a nondeterministic and partially observable domain, because investors never know all information that affects prices and the result of an investment is always uncertain. Technical Analysis methods demand only data that are easily available, i.e. the series of prices and trade volumes, and are then very useful to predict current price trends. Analysts have howe...

متن کامل

Regionalized Policy Representation for Reinforcement Learning in POMDPs

2007

Xuejun Liao Hui Li Ronald Parr Lawrence Carin

Many decision-making problems can be formulated in the framework of a partially observable Markov decision process (POMDP) [5]. The optimality of decisions relies on the accuracy of the POMDP model as well as the policy found for the model. In many applications the model is unknown and learned empirically based on experience, and building a model is just as difficult as finding the associated p...

متن کامل

Producing efficient error-bounded solutions for transition independent decentralized mdps

2013

Jilles Steeve Dibangoye Christopher Amato Arnaud Doniec François Charpillet

There has been substantial progress on algorithms for single-agent sequential decision making using partially observable Markov decision processes (POMDPs). A number of efficient algorithms for solving POMDPs share two desirable properties: error-bounds and fast convergence rates. Despite significant efforts, no algorithms for solving decentralized POMDPs benefit from these properties, leading ...

متن کامل

Dynamic Programming for Partially Observable Stochastic Games

2004

Eric A. Hansen Daniel S. Bernstein Shlomo Zilberstein

We develop an exact dynamic programming algorithm for partially observable stochastic games (POSGs). The algorithm is a synthesis of dynamic programming for partially observable Markov decision processes (POMDPs) and iterative elimination of dominated strategies in normal form games. We prove that it iteratively eliminates very weakly dominated strategies without first forming the normal form r...

متن کامل

A Decision-Theoretic Approach to Task Assistance for Persons with Dementia

2005

Jennifer Boger Pascal Poupart Jesse Hoey Craig Boutilier Geoff Fernie Alex Mihailidis

Cognitive assistive technologies that aid people with dementia (such as Alzheimer’s disease) hold the promise to provide such people with an increased level of independence. However, to realize this promise, such systems must account for the specific needs and preferences of individuals. We argue that this form of customization requires a sequential, decision-theoretic model of interaction. We ...

متن کامل

On the Average Cost Optimality Equation and the Structure of Optimal Policies for Partially Observable Markov Decision Processes*

2001

Emmanuel Fernández-Gaucherand Aristotle Arapostathis Steven I. Marcus

We consider partially observable Markov decision processes with finite or countably infinite (core) state and observation spaces and finite action set. Following a standard approach, an equivalent completely observed problem is formulated, with the same finite action set but with an uncountable state space, namely the space of probability distributions on the original core state space. By devel...

متن کامل

A Bayesian Approach for Learning and Planning in Partially Observable Markov Decision Processes

Journal: :Journal of Machine Learning Research 2011

Stéphane Ross Joelle Pineau Brahim Chaib-draa Pierre Kreitmann

Bayesian learning methods have recently been shown to provide an elegant solution to the exploration-exploitation trade-off in reinforcement learning. However most investigations of Bayesian reinforcement learning to date focus on the standard Markov Decision Processes (MDPs). The primary focus of this paper is to extend these ideas to the case of partially observable domains, by introducing th...

متن کامل