markov decision process graph theory

Search results for: markov decision process graph theory

Number of results: 2385831

Monotonicity in Markov Reward and Decision Chains: Theory and Applications

Journal: Foundations and Trends® in Stochastic Systems, 2006

Functional Equations and Markov Potential Theory in Stopped Decision Processes

Journal: Memoirs of the Faculty of Science, Kyushu University. Series A, Mathematics, 1975

The Empirical Bayes Envelope and Regret Minimization in Competitive Markov Decision Processes

Journal: Math. Oper. Res., 2003

Shie Mannor, Nahum Shimkin

A Multi-Armed Bandit Approach for Online Expert Selection in Markov Decision Processes

Journal: CoRR, 2017

Eric Mazumdar, Roy Dong, Vicenç Rúbies Royo, Claire J. Tomlin, S. Shankar Sastry

We formulate a multi-armed bandit (MAB) approach to choosing expert policies online in Markov decision processes (MDPs). Given a set of expert policies trained on a state and action space, the goal is to maximize the cumulative reward of our agent. The hope is to quickly find the best expert in our set. The MAB formulation allows us to quantify the performance of an algorithm in terms of the re...

Correlated Action Effects in Decision Theoretic Regression

1997

Craig Boutilier

Much recent research in decision theoretic planning has adopted Markov decision processes (MDPs) as the model of choice, and has attempted to make their solution more tractable by exploiting problem structure. One particular algorithm, structured policy construction, achieves this by means of a decision theoretic analog of goal regression, using action descriptions based on Bayesian networks wit...

Prioritized Goal Decomposition of Markov Decision Processes: Toward a Synthesis of Classical and Decision Theoretic Planning

1997

Craig Boutilier, Ronen I. Brafman, Christopher W. Geib

We describe an approach to goal decomposition for a certain class of Markov decision processes (MDPs). An abstraction mechanism is used to generate abstract MDPs associated with different objectives, and several methods for merging the policies for these different objectives are considered. In one technique, causal (least-commitment) structures are generated for abstract policies and plan mergi...

Regret Bounds for Opportunistic Channel Access

Journal: CoRR, 2009

Sarah Filippi, Olivier Cappé, Aurélien Garivier

We consider the task of opportunistic channel access in a primary system composed of independent Gilbert-Elliott channels where the secondary (or opportunistic) user has no a priori information regarding the statistical characteristics of the system. It is shown that this problem may be cast into the framework of model-based learning in a specific class of Partially Observed Markov ...

Gains and losses in the eyes of the beholder: a comparative study of foreign policy decision making under risk

2004

Yi Yang

Gains and Losses in the Eyes of the Beholder: A Comparative Study of Foreign Policy Decision Making Under Risk (December 2004). Yi Yang, B.A., Foreign Affairs College. Chair of Advisory Committee: Dr. Alex Mintz. Prospect theory is a descriptive model of individual decision-making under risk (Kahneman and Tversky 1979). The central tenet of prospect theory posits that the risk orientation of deci...

Emergence of Gricean Maxims from Multi-Agent Decision Theory

2013

Adam Vogel, Max Bodoia, Christopher Potts, Daniel Jurafsky

Grice characterized communication in terms of the cooperative principle, which enjoins speakers to make only contributions that serve the evolving conversational goals. We show that the cooperative principle and the associated maxims of relevance, quality, and quantity emerge from multi-agent decision theory. We utilize the Decentralized Partially Observable Markov Decision Process (Dec-POMDP) ...

A Decision-Theoretic Model of Assistance

Journal: J. Artif. Intell. Res., 2007

Alan Fern, Sriraam Natarajan, Kshitij Judah, Prasad Tadepalli

There is a growing interest in intelligent assistants for a variety of applications from organizing tasks for knowledge workers to helping people with dementia. In this paper, we present and evaluate a decision-theoretic framework that captures the general notion of assistance. The objective is to observe a goal-directed agent and to select assistive actions in order to minimize the overall cos...
