markov reward models

Search results for: markov reward models

Number of results: 981365

Limiting Covariance in Markov-renewal Processes

2015

William S. Jewell

General additive functions called rewards are defined on a "regular" finite-state Markov-renewal process. The asymptotic form of the mean total reward in [0,t] has previously been obtained, and it is known that the total rewards are joint-normally distributed as t → ∞. This paper finds the dominant asymptotic term in the covariance of the total rewards as a simple function of the moments of ...


The complexity of unobservable finite-horizon Markov decision processes

1996

Martin Mundhenk

Markov Decision Processes (MDPs) model controlled stochastic systems. Like Markov chains, an MDP consists of states and probabilistic transitions; unlike Markov chains, there is assumed to be an outside controller who chooses an action (with its associated transition matrix) at each step of the process, according to some strategy or policy. In addition, each state and action pair has an associa...


Markov Decision Processes with Slow Scale Periodic Decisions

Journal: Math. Oper. Res. 2003

Matthew W. Jacobson Nahum Shimkin Adam Shwartz

We consider a class of discrete time, dynamic decision-making models which we refer to as Periodically Time-Inhomogeneous Markov Decision Processes (PTMDPs). In these models, the decision-making horizon can be partitioned into intervals, called slow scale cycles, of N + 1 epochs. The transition law and reward function are time-homogeneous over the first N epochs of each slow scale cycle, but dis...


Context Dependent Modeling in Continuous Speech Recognition Based on a Persian Phonetic Decision Tree

Journal: The Modares Journal of Electrical Engineering 2003

Seyed Hosein Shams Seyed Mohammad Ahadi

Context-dependent modeling is a well-known approach to increasing modeling accuracy in continuous speech recognition. The most common way to implement this approach is via triphone modeling. Nevertheless, the large number of such models causes several problems in model training, and robust training of such models is often hard to achieve. One approach to solve this problem is via param...


Recurrence-Relation-Based Reward Model for Performability Evaluation of Embedded Systems

2007

Ann T. Tai Kam S. Tso William H. Sanders

Many embedded systems behave as discrete-time semi-Markov processes (DTSMPs). For those systems, performability measures, especially when specified as an accumulated reward, are often difficult to evaluate analytically. In this article, we informally describe an approach that uses a recurrence-relation-based (RRB) reward model for performability evaluation of systems exhibiting DTSMP behavior. ...


Hedging Bets in Markov Decision Processes

2016

Rajeev Alur Marco Faella Sampath Kannan Nimit Singhania

The classical model of Markov decision processes with costs or rewards, while widely used to formalize optimal decision making, cannot capture scenarios where there are multiple objectives for the agent during the system evolution, but only one of these objectives gets actualized upon termination. We introduce the model of Markov decision processes with alternative objectives (MDPAO) for formal...


Communication Networks: CAC and routing for multi-service networks with blocked wide-band calls delayed, part I: exact link MDP framework

Journal: European Transactions on Telecommunications 2006

Ernst Nordström Zbigniew Dziong

In this paper, we study the call admission control (CAC) and routing issue in multi-service networks. Two categories of calls are considered: a narrow-band (NB) with blocked calls cleared and a wide-band (WB) with blocked calls delayed. The objective function is formulated as reward maximisation with penalty for delay. The optimisation is subject to quality of service (QoS) constraints and, pos...


Correlation Decay in Random Decision Networks

Journal: Math. Oper. Res. 2014

David Gamarnik David A. Goldberg Theophane Weber

We consider a decision network on an undirected graph in which each node corresponds to a decision variable, and each node and edge of the graph is associated with a reward function whose value depends only on the variables of the corresponding nodes. The goal is to construct a decision vector which maximizes the total reward. This decision problem encompasses a variety of models, incl...


Discrete Time Markov Reward Processes: A Motor Car Insurance Example

Journal: Technology and Investment 2010


Hierarchical composition and aggregation of state-based availability and performability models

Journal: IEEE Trans. Reliability 2003

Mark Lanus Liang Yin Kishor S. Trivedi

Telecommunication systems are large and complex, consisting of multiple intelligent modules in shelves, multiple shelves in frames, and multiple frames to compose a single network element. In the availability and performability analysis of such a complex system, combinatorial models are computationally efficient but have limited expressive power. State-based models are expressive but computatio...

