Search results for: markovian decision process

Number of results: 1587387

1999
Fahiem Bacchus Craig Boutilier Adam Grove

Markov decision processes (MDPs) are a very popular tool for decision-theoretic planning (DTP), partly because of the well-developed, expressive theory that includes effective solution techniques. But the Markov assumption (that dynamics and rewards depend on the current state only, and not on history) is often inappropriate. This is especially true of rewards: we frequently wish to associate rewar...

The redundancy technique is known as a way to enhance the reliability and availability of non-repairable systems, but for repairable systems another factor becomes prominent: the number of maintenance resources. In this study, availability optimization of series-parallel systems is modelled using a Markovian process by which the number of maintenance resources is located into the obje...

2003
Eiji Mizutani Stuart E. Dreyfus

We describe how multi-stage non-Markovian decision problems can be solved using actor-critic reinforcement learning by assuming that a discrete version of Cohen-Grossberg node dynamics describes the node-activation computations of a neural network (NN). Our NN (i.e., agent) is capable of rendering the process Markovian implicitly and automatically in a totally model-free fashion without learning...

2002
Matthias Kuntz Markus Siegle

A new denotational semantics for a variant of the stochastic process algebra TIPP is presented, which maps process terms to Multiterminal binary decision diagrams. It is shown that the new semantics is Markovian bisimulation equivalent to the standard SOS semantics. The paper also addresses the difficult question of keeping the underlying state space minimal at every construction step.

Akhavan Niaki, Fallah Nezhad

We develop an optimization model based on a Markovian approach to determine the optimum threshold values in a proposed acceptance sampling design. Consider an acceptance sampling plan where items are inspected and, when the number of conforming items between successive defective items falls below a lower control threshold value, the batch is rejected, and if it falls above a control thresh...

Journal: Journal of Mathematical Analysis and Applications, 1975

Journal: Journal of Mathematical Analysis and Applications, 1985

Journal: The Annals of Mathematical Statistics, 1968

2004
Takeshi Yoshikawa Yuki Kanazawa Masahito Kurihara

Reinforcement learning is a framework for enabling agents to learn from interaction with environments. It has focused generally on Markov decision process (MDP) domains, but a domain may be non-Markovian in the real world. In this paper, we develop a new description of macro-actions for non-Markov decision process (NMDP) domains in reinforcement learning. A macro-action is an action control struct...
