Search results for: markov decision process graph theory

Number of results: 2385831

Journal: :Foundations and Trends® in Stochastic Systems 2006

Journal: :Memoirs of the Faculty of Science, Kyushu University. Series A, Mathematics 1975

Journal: :CoRR 2017
Eric Mazumdar Roy Dong Vicenç Rúbies Royo Claire J. Tomlin S. Shankar Sastry

We formulate a multi-armed bandit (MAB) approach to choosing expert policies online in Markov decision processes (MDPs). Given a set of expert policies trained on a state and action space, the goal is to maximize the cumulative reward of our agent. The hope is to quickly find the best expert in our set. The MAB formulation allows us to quantify the performance of an algorithm in terms of the re...

1997
Craig Boutilier

Much recent research in decision theoretic planning has adopted Markov decision processes (MDPs) as the model of choice, and has attempted to make their solution more tractable by exploiting problem structure. One particular algorithm, structured policy construction, achieves this by means of a decision theoretic analog of goal regression, using action descriptions based on Bayesian networks wit...

1997
Craig Boutilier Ronen I. Brafman Christopher W. Geib

We describe an approach to goal decomposition for a certain class of Markov decision processes (MDPs). An abstraction mechanism is used to generate abstract MDPs associated with different objectives, and several methods for merging the policies for these different objectives are considered. In one technique, causal (least-commitment) structures are generated for abstract policies and plan mergi...

Journal: :CoRR 2009
Sarah Filippi Olivier Cappé Aurélien Garivier

We consider the task of opportunistic channel access in a primary system composed of independent Gilbert-Elliot channels where the secondary (or opportunistic) user does not dispose of a priori information regarding the statistical characteristics of the system. It is shown that this problem may be cast into the framework of model-based learning in a specific class of Partially Observed Markov ...

2004
Yi Yang

Gains and Losses in the Eyes of the Beholder: A Comparative Study of Foreign Policy Decision Making Under Risk. (December 2004) Yi Yang, B.A., Foreign Affairs College. Chair of Advisory Committee: Dr. Alex Mintz. Prospect theory is a descriptive model of individual decision-making under risk (Kahneman and Tversky 1979). The central tenet of prospect theory posits that the risk orientation of deci...

2013
Adam Vogel Max Bodoia Christopher Potts Daniel Jurafsky

Grice characterized communication in terms of the cooperative principle, which enjoins speakers to make only contributions that serve the evolving conversational goals. We show that the cooperative principle and the associated maxims of relevance, quality, and quantity emerge from multi-agent decision theory. We utilize the Decentralized Partially Observable Markov Decision Process (Dec-POMDP) ...

Journal: :J. Artif. Intell. Res. 2007
Alan Fern Sriraam Natarajan Kshitij Judah Prasad Tadepalli

There is a growing interest in intelligent assistants for a variety of applications from organizing tasks for knowledge workers to helping people with dementia. In this paper, we present and evaluate a decision-theoretic framework that captures the general notion of assistance. The objective is to observe a goal-directed agent and to select assistive actions in order to minimize the overall cos...
