Search results for: markov decision process graph theory

Number of results: 2385831

1996
Moshe Tennenholtz

This paper introduces and investigates the notion of qualitative equilibria, or stable social laws, in the context of qualitative decision making. Previous work in qualitative decision theory has used the maximin decision criterion for modelling qualitative decision making. When several decision-makers share a common environment, a corresponding notion of equilibrium can be defined. This notion ...

Journal: SIAM J. Control and Optimization 2006
Diego Klabjan Daniel Adelman

Semi-Markov decision processes on Borel spaces with deterministic kernels have many practical applications, particularly in inventory theory. Most of the results from general semi-Markov decision processes do not carry over to a deterministic kernel since such a kernel does not provide “smoothness.” We develop infinite dimensional linear programming theory for a general stochastic semi-Markov d...

Journal: Journal of Korean Institute of Industrial Engineers 2016

Journal: CoRR 2017
Xiaocheng Li Huaiyang Zhong Margaret L. Brandeau

In this paper, we consider the problem of optimizing the quantiles of the cumulative rewards of Markov Decision Processes (MDP), which we refer to as Quantile Markov Decision Processes (QMDP). Traditionally, the goal of a Markov Decision Process (MDP) is to maximize expected cumulative reward over a defined horizon (possibly infinite). In many applications, however, a decision maker may ...

2011
Hunor Jakab

Finding accurate approximations to state and action value functions is essential in Reinforcement learning tasks on continuous Markov Decision Processes. Using Gaussian processes as function approximators we can simultaneously represent model confidence and generalize to unvisited states. To improve the accuracy of the value function approximation in this article I present a new method of const...

Thesis: Ministry of Science, Research and Technology - Yazd University 1388

This study considers the level of increase in customer satisfaction achieved by meeting varied customer requirements subject to organizational restrictions. In this regard, ANP, QFD and BGP techniques are used in a fuzzy setting, and a model is proposed to help the organization optimize the multi-objective decision-making process. The prioritization of technical attributes is the result ...

2015
Christoph Haase Stefan Kiefer

Given Markov chains and Markov decision processes (MDPs) whose transitions are labelled with non-negative integer costs, we study the computational complexity of deciding whether the probability of paths whose accumulated cost satisfies a Boolean combination of inequalities exceeds a given threshold. For acyclic Markov chains, we show that this problem is PP-complete, whereas it is hard for the...

2011
Vincent Carnino Sven De Felice

In this article we propose an algorithm, based on Markov chain techniques, to generate random automata that are deterministic, accessible and acyclic. The distribution of the output approaches the uniform distribution on n-state such automata. We then show how to adapt this algorithm in order to generate minimal acyclic automata with n states almost uniformly.

2001
Chalee Asavathiratham

In this thesis we introduce and analyze the influence model, a particular but tractable mathematical representation of random, dynamical interactions on networks. Specifically, an influence model consists of a network of nodes, each with a status that evolves over time. The evolution of the status at a node is according to an internal Markov chain, but with transition probabilities that depend ...
