Search results for: markov reward models

Number of results: 981365

2013
Yang Gao Ernst Moritz Hahn Naijun Zhan Lijun Zhang

We present CCMC (Conditional CSL Model Checker), a model checker for continuous-time Markov chains (CTMCs) with respect to formulas specified in continuous-time stochastic logic (CSL). Existing CTMC model checkers such as PRISM or MRMC handle only binary CSL until path formulas. CCMC is the first tool that supports algorithms for analyzing multiple until path formulas. Moreover, CCMC supports a ...
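CSL time-bounded until formulas ultimately reduce to transient analysis of the CTMC, typically via uniformization. The following is a minimal, illustrative sketch of that underlying technique, not CCMC's actual implementation; the generator matrix and rates are hypothetical.

```python
import numpy as np

def transient_dist(Q, p0, t, tol=1e-10):
    """Transient distribution p(t) of a CTMC via uniformization.

    Q: generator matrix (rows sum to 0); p0: initial distribution; t: time.
    Uses the Poisson-weighted sum p(t) = sum_k e^{-lt}(lt)^k/k! * p0 P^k,
    where P = I + Q/l is the uniformized DTMC kernel.
    """
    n = Q.shape[0]
    lam = max(-Q[i, i] for i in range(n)) * 1.05  # uniformization rate
    P = np.eye(n) + Q / lam                        # uniformized DTMC kernel
    weight = np.exp(-lam * t)                      # Poisson weight for k = 0
    term = p0.copy()                               # p0 P^k, currently k = 0
    result = weight * term
    k = 0
    # Keep summing until past the Poisson mode and the weights are negligible
    while weight > tol or k < lam * t:
        k += 1
        term = term @ P
        weight *= lam * t / k
        result += weight * term
    return result

# Hypothetical two-state CTMC: 0 -> 1 at rate 2, 1 -> 0 at rate 1
Q = np.array([[-2.0, 2.0], [1.0, -1.0]])
p = transient_dist(Q, np.array([1.0, 0.0]), t=10.0)
# By t = 10 this chain is essentially stationary, with pi = (1/3, 2/3)
```

A model checker then compares such transient probabilities against the bound in the CSL formula.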

Journal: Math. Meth. of OR 1997
Apostolos Burnetas Michael N. Katehakis

Consider a finite state irreducible Markov reward chain. It is shown that there exist simulation estimates and confidence intervals for the expected first passage times and rewards as well as the expected average reward, with 100% coverage probability. The length of the confidence intervals converges to zero with probability one as the sample size increases; it also satisfies a large deviations...
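The 100%-coverage intervals of the paper require a specific construction; a plain Monte Carlo estimate of an expected first passage time, with a standard CLT-based confidence interval for contrast, can be sketched as follows. The three-state chain is hypothetical.

```python
import random
import statistics

def first_passage_time(P, start, target, rng):
    """Simulate the number of steps until a DTMC started at `start`
    first hits `target`. P is a row-stochastic transition matrix."""
    state, steps = start, 0
    while state != target:
        state = rng.choices(range(len(P)), weights=P[state])[0]
        steps += 1
    return steps

# Hypothetical irreducible 3-state chain
P = [[0.5, 0.5, 0.0],
     [0.2, 0.5, 0.3],
     [0.1, 0.4, 0.5]]
rng = random.Random(0)
samples = [first_passage_time(P, 0, 2, rng) for _ in range(20000)]
mean = statistics.fmean(samples)
half = 1.96 * statistics.stdev(samples) / len(samples) ** 0.5
# mean estimates E[T_{0->2}]; (mean - half, mean + half) is a 95% CI.
# Solving the linear hitting-time equations gives E[T_{0->2}] = 20/3.
```

Unlike the interval above, whose coverage is only asymptotic, the paper's intervals cover the true value with probability one.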

2007
M. Baykal-Gürsoy

Considered are infinite horizon semi-Markov decision processes (SMDPs) with finite state and action spaces. Total expected discounted reward and long-run average expected reward optimality criteria are reviewed. Solution methodology for each criterion is given, constraints and variance sensitivity are also discussed.

Journal: :Systems & Control Letters 2021

We consider a prospect theoretic version of the classical Q-learning algorithm for discounted reward Markov decision processes, wherein the controller perceives a distorted and noisy future reward, modeled by a nonlinearity that accentuates gains and under-represents losses relative to a reference point. We analyze the asymptotic behavior of the scheme by analyzing its limiting differential equation using the theory of monotone dyn...
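A minimal sketch of the setting: tabular Q-learning where each observed reward first passes through a prospect-theory-style value function (concave in gains, convex and steeper in losses). The distortion parameters below are the classical Kahneman-Tversky values, and the one-state MDP is hypothetical; this is not the paper's analysis, only an illustration of the learning rule.

```python
import random

def distort(r, ref=0.0, alpha=0.88, lam=2.25):
    """Prospect-theory-style value function relative to `ref`:
    gains are compressed (x^alpha), losses amplified by factor lam."""
    x = r - ref
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha

def q_learning(step, n_states, n_actions, episodes=2000, gamma=0.9,
               lr=0.1, eps=0.1, seed=0):
    """Tabular epsilon-greedy Q-learning; the agent sees distort(reward)."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    s = 0
    for _ in range(episodes):
        if rng.random() < eps:
            a = rng.randrange(n_actions)             # explore
        else:
            a = max(range(n_actions), key=lambda a: Q[s][a])  # exploit
        s2, r = step(s, a, rng)
        target = distort(r) + gamma * max(Q[s2])     # distorted TD target
        Q[s][a] += lr * (target - Q[s][a])
        s = s2
    return Q

# Hypothetical 1-state MDP: action 0 pays 2.0, action 1 pays 0.5
def step(s, a, rng):
    return 0, (2.0 if a == 0 else 0.5)

Q = q_learning(step, n_states=1, n_actions=2)
```

The ranking of actions can differ from the undistorted problem when rewards straddle the reference point, which is what makes the asymptotics nontrivial.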

Journal: Journal of Mathematical Analysis and Applications 2000

2002
Mogens Bladt Beatrice Meini Marcel F. Neuts

We develop algorithms for the computation of the distribution of the total reward accrued during [0, t) in a finite continuous-parameter Markov chain. During sojourns, the reward grows linearly at a rate depending on the state visited. At transitions, there can be instantaneous rewards whose values depend on the states involved in the transition. For moderate values of t, the reward distributio...
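The paper's algorithms are numerical, but the reward model itself (linear accrual during sojourns plus instantaneous rewards at transitions) is easy to see in a Monte Carlo sketch; the generator, rates, and rewards below are hypothetical.

```python
import random

def sample_total_reward(Q, rate_reward, jump_reward, p0, t, rng):
    """One sample of the total reward accrued in [0, t) for a CTMC.

    Q: generator matrix; rate_reward[i]: reward rate while in state i;
    jump_reward[i][j]: instantaneous reward on an i -> j transition;
    p0: initial distribution.
    """
    n = len(Q)
    state = rng.choices(range(n), weights=p0)[0]
    clock, total = 0.0, 0.0
    while True:
        exit_rate = -Q[state][state]
        hold = rng.expovariate(exit_rate) if exit_rate > 0 else float("inf")
        if clock + hold >= t:
            total += rate_reward[state] * (t - clock)   # truncated sojourn
            return total
        total += rate_reward[state] * hold              # full sojourn
        clock += hold
        others = [j for j in range(n) if j != state]
        nxt = rng.choices(others, weights=[Q[state][j] for j in others])[0]
        total += jump_reward[state][nxt]                # transition reward
        state = nxt

rng = random.Random(0)
# Absorbing single state: reward accrues linearly, so total is exactly rate * t
one = sample_total_reward([[0.0]], [2.0], [[0.0]], [1.0], 3.0, rng)
```

Averaging many such samples recovers the mean reward; the paper instead computes the full distribution of the total reward directly.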

2001
Shie Mannor Nahum Shimkin

We consider the problem of learning to attain multiple goals in a dynamic environment, which is initially unknown. In addition, the environment may contain arbitrarily varying elements related to actions of other agents or to non-stationary moves of Nature. This problem is modelled as a stochastic (Markov) game between the learning agent and an arbitrary player, with a vector-valued reward func...

2013
Krishnendu Chatterjee Vojtech Forejt Dominik Wojtczak

We study the problem of achieving a given value in Markov decision processes (MDPs) with several independent discounted reward objectives. We consider a generalised version of discounted reward objectives, in which the amount of discounting depends on the states visited and on the objective. This definition extends the usual definition of discounted reward, and allows one to capture the systems in ...

2006
Nico M. van Dijk

As an extension of the discrete-time case, this note investigates the variance of the total cumulative reward for continuous-time Markov reward chains with finite state spaces. The results correspond to discrete-time results. In particular, the variance growth rate is shown to be asymptotically linear in time. Expressions are provided to compute this growth rate.
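The asymptotically linear variance growth is easy to observe empirically. The sketch below simulates cumulative reward for a hypothetical two-state CTMC with unit switching rates (reward rate 1 in state 1, 0 in state 0) and checks that doubling the horizon roughly doubles the variance; it illustrates the phenomenon, not the paper's closed-form expressions for the growth rate.

```python
import random
import statistics

def occupation_reward(t_end, rng):
    """Cumulative reward over [0, t_end) for a two-state CTMC with unit
    switching rates: reward accrues at rate 1 in state 1, rate 0 in state 0."""
    state, clock, total = 0, 0.0, 0.0
    while clock < t_end:
        hold = rng.expovariate(1.0)
        if state == 1:
            total += min(hold, t_end - clock)  # truncate the last sojourn
        clock += hold
        state = 1 - state
    return total

rng = random.Random(7)
var_t = {t: statistics.variance([occupation_reward(t, rng)
                                 for _ in range(4000)])
         for t in (20.0, 40.0)}
# Variance grows asymptotically linearly in t, so this ratio is close to 2
ratio = var_t[40.0] / var_t[20.0]
```

For this chain the stationary covariance of the reward process decays as e^{-2s}/4, giving an asymptotic variance growth rate of 1/4 per unit time.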

1999
Tapas K. Das Abhijit Gosavi Sridhar Mahadevan Nicholas Marchalleck

A large class of problems of sequential decision making under uncertainty, whose underlying probability structure is a Markov process, can be modeled as stochastic dynamic programs (referred to, in general, as Markov decision problems or MDPs). However, the computational complexity of the classical MDP algorithms, such as value iteration and policy iteration, is prohibitive and can grow ...
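Classical value iteration, one of the algorithms whose cost motivates the paper, can be sketched in a few lines; the two-state toy MDP below is hypothetical and chosen so the optimal value is known in closed form.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Classical value iteration for a finite MDP.

    P[a][s][s2]: transition probability under action a;
    R[a][s]: expected one-step reward for taking a in s.
    Returns the (near-)optimal value function and a greedy policy.
    """
    n_states, n_actions = len(P[0]), len(P)
    V = [0.0] * n_states
    while True:
        # One Bellman optimality backup: cost O(|A| * |S|^2) per sweep,
        # which is the complexity the abstract calls prohibitive at scale
        Q = [[R[a][s] + gamma * sum(P[a][s][s2] * V[s2]
                                    for s2 in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
        V_new = [max(Q[s]) for s in range(n_states)]
        if max(abs(V_new[s] - V[s]) for s in range(n_states)) < tol:
            policy = [max(range(n_actions), key=lambda a: Q[s][a])
                      for s in range(n_states)]
            return V_new, policy
        V = V_new

# Hypothetical 2-state toy: both actions stay put; action 1 pays 1, action 0 pays 0
P = [[[1.0, 0.0], [0.0, 1.0]],   # action 0
     [[1.0, 0.0], [0.0, 1.0]]]   # action 1
R = [[0.0, 0.0], [1.0, 1.0]]
V, policy = value_iteration(P, R)   # V converges to 1/(1-0.9) = 10 in each state
```

Each sweep touches every (state, action, successor) triple, which is what makes the classical algorithms infeasible for large state spaces and motivates the simulation-based methods the abstract goes on to discuss.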
