Markov reward models

Fuzzy stopping problems in continuous-time fuzzy stochastic systems

Journal: Fuzzy Sets and Systems 2003

Yuji Yoshida Masami Yasuda Jun-ichi Nakagami Masami Kurano

A stopping model with fuzzy stopping times is presented for a continuous-time fuzzy stochastic system. The optimal fuzzy stopping times are given under a regularity assumption on the stopping rules. The optimal fuzzy reward is also characterized as the unique solution of an optimality equation under a differentiability condition. An example with Markov models is discussed.

Bayesian Policy Learning with Trans-Dimensional MCMC

2007

Matthew D. Hoffman Arnaud Doucet Nando de Freitas Ajay Jasra

A recently proposed formulation of the stochastic planning and control problem as one of parameter estimation for suitable artificial statistical models has led to the adoption of inference algorithms for this notoriously hard problem. At the algorithmic level, the focus has been on developing Expectation-Maximization (EM) algorithms. In this paper, we begin by making the crucial observation th...

Trans-dimensional MCMC for Bayesian Policy Learning

2007

Matt Hoffman Arnaud Doucet Ajay Jasra

A recently proposed formulation of the stochastic planning and control problem as one of parameter estimation for suitable artificial statistical models has led to the adoption of inference algorithms for this notoriously hard problem. At the algorithmic level, the focus has been on developing Expectation-Maximization (EM) algorithms. In this paper, we begin by making the crucial observation th...

Analysing reward measures of LARES performability models by discontinuous Markov chains

Journal: IJCCBS 2017

Alexander Gouberman Martin Riedl Markus Siegle

This paper presents a new method for specifying and analysing Markovian performability models. An extension of the LARES modelling language is considered which offers both delayed and immediate transitions, as well as rate and impulse rewards on whose basis different types of reward measures can be defined. The paper describes the evaluation path, starting from the modular and hierarchical LARE...

Exponential Lower Bounds for Policy Iteration

2010

John Fearnley

We study policy iteration for infinite-horizon Markov decision processes. It has recently been shown that policy-iteration-style algorithms have exponential lower bounds in a two-player game setting. We extend these lower bounds to Markov decision processes with the total-reward and average-reward optimality criteria.

Transition Entropy in Partially Observable Markov Decision Processes

2006

Francisco S. Melo M. Isabel Ribeiro

This paper proposes a new heuristic algorithm suitable for real-time applications using partially observable Markov decision processes (POMDPs). The algorithm is based on a reward-shaping strategy that includes entropy information in the reward structure of a fully observable Markov decision process (MDP). This strategy, as illustrated by the presented results, exhibits near-optimal performance...

Compositional Reduction of Performability Models based on Stochastic Process Algebras

2007

Ulrich Klehmet Markus Siegle

Stochastic Process Algebras (SPA) have been proposed as compositional specification formalisms for quantitative models. Here we apply these compositional features to SPAs extended by rewards. State space reduction of performability models can be achieved based on the behaviour-preserving notion of Markov Reward Bisimulation. For a framework extended by immediate actions we develop a new equival...

Forecasting time and place of earthquakes using a Semi-Markov model (with case study in Tehran province)

Journal: Journal of Industrial Engineering International 2012

Ramin Sadeghian

The paper examines the application of semi-Markov models to the phenomenon of earthquakes in Tehran province. Generally, earthquakes are not independent of each other, and time and place of earthquakes are related to previous earthquakes; moreover, the time between earthquakes affects the pattern of their occurrence; thus, this occurrence can be likened to semi-Markov models. ...

An effective numerical method to compute the moments of the completion time of Markov reward models

Journal: Computers & Mathematics with Applications 1998

Channel Allocation with Recovery Strategy in Wireless Networks

Journal: European Transactions on Telecommunications 2000

Yue Ma James J. Han Kishor S. Trivedi

With the increasing penetration of wireless communication systems, customers expect the same level of service, reliability and performance from them as from traditional wire-line networks. Due to the dynamic environment, such as the roaming of mobile subscribers, maintaining a high radio frequency (RF) availability is one of the most challenging aspects in w...
