Search results for: markov decision process graph theory
Number of results: 2,385,831
This paper introduces and investigates the notion of qualitative equilibria, or stable social laws, in the context of qualitative decision making. Previous work in qualitative decision theory has used the maximin decision criterion for modelling qualitative decision making. When several decision-makers share a common environment, a corresponding notion of equilibrium can be defined. This notion ...
Semi-Markov decision processes on Borel spaces with deterministic kernels have many practical applications, particularly in inventory theory. Most of the results from general semi-Markov decision processes do not carry over to a deterministic kernel since such a kernel does not provide “smoothness.” We develop infinite dimensional linear programming theory for a general stochastic semi-Markov d...
In this paper, we consider the problem of optimizing the quantiles of the cumulative rewards of Markov Decision Processes (MDPs), which we refer to as Quantile Markov Decision Processes (QMDPs). Traditionally, the goal of a Markov Decision Process (MDP) is to maximize the expected cumulative reward over a defined (possibly infinite) horizon. In many applications, however, a decision maker may ...
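The distinction between the expected-reward and quantile objectives can be illustrated with a toy simulation. The sketch below (a hypothetical two-outcome reward model, not the paper's formulation) rolls out a fixed policy many times and compares the mean cumulative reward with its median; a QMDP-style method would optimize the quantile rather than the mean.

```python
import random
import statistics

# Hypothetical toy model: each of 5 steps yields reward 10 with
# probability 0.1 and 0 otherwise, so returns are heavy on zero.
def rollout(horizon=5, rng=random):
    total = 0.0
    for _ in range(horizon):
        total += 10.0 if rng.random() < 0.1 else 0.0
    return total

rng = random.Random(0)
returns = sorted(rollout(rng=rng) for _ in range(10_000))

mean_return = statistics.fmean(returns)     # expected-reward objective
tau = 0.5
median_return = returns[int(tau * len(returns))]  # tau-quantile objective
```

For this skewed reward distribution the mean is pulled up by rare large returns while the median stays at zero, which is exactly the kind of gap that motivates a quantile criterion.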
Finding accurate approximations to state and action value functions is essential in reinforcement learning tasks on continuous Markov Decision Processes. Using Gaussian processes as function approximators, we can simultaneously represent model confidence and generalize to unvisited states. To improve the accuracy of the value function approximation, in this article I present a new method of const...
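The core idea of Gaussian-process value approximation, generalizing from visited states to unvisited ones, can be sketched with a standard GP posterior mean under a squared-exponential kernel. Everything below (the states, the value estimates, the length-scale) is a hypothetical illustration, not the article's method.

```python
import math

def rbf(x, y, ls=1.0):
    # Squared-exponential kernel over 1-D states.
    return math.exp(-((x - y) ** 2) / (2 * ls * ls))

def solve(A, b):
    # Naive Gaussian elimination with partial pivoting (fine for tiny systems).
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def gp_posterior_mean(train_x, train_v, query_x, noise=1e-6):
    # Standard GP regression mean: k(x*)^T (K + noise*I)^{-1} v.
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(train_x)] for i, a in enumerate(train_x)]
    alpha = solve(K, train_v)
    return sum(a * rbf(x, query_x) for a, x in zip(alpha, train_x))

# Visited states and hypothetical value estimates:
xs = [0.0, 1.0, 2.0, 3.0]
vs = [0.0, 0.8, 1.5, 1.9]
# Generalize to an unvisited state between two visited ones:
v_hat = gp_posterior_mean(xs, vs, 1.5)
```

The posterior mean interpolates the visited states' values and smoothly fills in between them; the same machinery also yields a posterior variance, which is what lets a GP "represent model confidence."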
This study considers the increase in customer satisfaction achieved by meeting varying customer requirements subject to organizational restrictions. In this regard, the ANP, QFD, and BGP techniques are used in a fuzzy setting, and a model is proposed to help the organization optimize the multi-objective decision-making process. The prioritization of technical attributes is the result ...
Given Markov chains and Markov decision processes (MDPs) whose transitions are labelled with non-negative integer costs, we study the computational complexity of deciding whether the probability of paths whose accumulated cost satisfies a Boolean combination of inequalities exceeds a given threshold. For acyclic Markov chains, we show that this problem is PP-complete, whereas it is hard for the...
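For an acyclic Markov chain, the probability in question can be computed exactly by a forward pass that tracks the distribution over (state, accumulated cost) pairs. The sketch below uses a small hypothetical chain, not an instance from the paper.

```python
from collections import defaultdict

# Hypothetical acyclic Markov chain with non-negative integer costs:
# transitions[s] = [(probability, cost, successor), ...]
transitions = {
    "s0": [(0.5, 1, "s1"), (0.5, 2, "s2")],
    "s1": [(1.0, 3, "t")],
    "s2": [(1.0, 1, "t")],
    "t":  [],  # absorbing target
}

def cost_distribution(start, target):
    # Forward DP over (state, accumulated cost); terminates because
    # the chain is acyclic.
    frontier = {(start, 0): 1.0}
    result = defaultdict(float)
    while frontier:
        nxt = defaultdict(float)
        for (s, c), p in frontier.items():
            if s == target:
                result[c] += p
                continue
            for q, w, succ in transitions[s]:
                nxt[(succ, c + w)] += p * q
        frontier = nxt
    return dict(result)

d = cost_distribution("s0", "t")
# Probability that the accumulated cost satisfies, e.g., cost <= 3:
p_le_3 = sum(p for c, p in d.items() if c <= 3)
```

In this instance the two paths reach the target with costs 4 and 3, each with probability 0.5, so the threshold query `cost <= 3` evaluates to 0.5. The hardness results in the snippet concern exactly such threshold questions at scale.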
In this article we propose an algorithm, based on Markov chain techniques, to generate random automata that are deterministic, accessible, and acyclic. The distribution of the output approaches the uniform distribution on such n-state automata. We then show how to adapt this algorithm in order to generate minimal acyclic automata with n states almost uniformly.
In this thesis we introduce and analyze the influence model, a particular but tractable mathematical representation of random, dynamical interactions on networks. Specifically, an influence model consists of a network of nodes, each with a status that evolves over time. The evolution of the status at a node is according to an internal Markov chain, but with transition probabilities that depend ...
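The described dynamics, each node's status evolving by an internal Markov chain whose transitions are modulated by neighbors, can be simulated in a few lines. The sketch below is a deliberately minimal variant with binary statuses and hypothetical parameters (`internal_p`, `influence`), not the thesis's exact model.

```python
import random

def step(statuses, neighbors, internal_p, influence, rng):
    """One synchronous update of the network.

    With probability `influence`, a node copies a random neighbor's
    current status; otherwise it follows its own internal Markov chain
    (here: flip the binary status with probability `internal_p`).
    """
    new = {}
    for node, s in statuses.items():
        if neighbors[node] and rng.random() < influence:
            new[node] = statuses[rng.choice(neighbors[node])]
        else:
            new[node] = 1 - s if rng.random() < internal_p else s
    return new

rng = random.Random(42)
neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
statuses = {"a": 1, "b": 0, "c": 0}
for _ in range(20):
    statuses = step(statuses, neighbors, internal_p=0.1, influence=0.8, rng=rng)
```

The key structural point survives even in this toy version: each node's transition probabilities are not fixed but depend on the current statuses of its network neighbors.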