Search results for: markov decision process graph theory

Number of results: 2,385,831

2017
Gary J. Koehler

— In this paper we present a generalized Markov decision process that subsumes the traditional discounted, infinite horizon, finite state and action Markov decision process, Veinott's discounted decision processes, and Koehler's generalization of these two problem classes.
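
To make the classical special case named above concrete, here is a minimal value-iteration sketch for a discounted, infinite-horizon, finite state and action MDP. The two-state transition and reward tables are invented for illustration and are not from the paper.

```python
import numpy as np

# Hypothetical 2-state, 2-action discounted MDP.
# P[a][s, s'] = transition probability, R[a][s] = expected reward.
P = np.array([[[0.9, 0.1],
               [0.2, 0.8]],
              [[0.5, 0.5],
               [0.7, 0.3]]])
R = np.array([[1.0, 0.0],
              [0.5, 2.0]])
gamma = 0.95  # discount factor

# Standard value iteration: V <- max_a (R_a + gamma * P_a V).
V = np.zeros(2)
for _ in range(1000):
    Q = R + gamma * P @ V          # Q[a, s]
    V_new = Q.max(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new

print("optimal values:", V)
print("greedy policy:", Q.argmax(axis=0))
```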

Journal: Transactions of the Society of Instrument and Control Engineers, 1967

Journal: Journal of Machine Learning Research, 2015
Yangbo He Jinzhu Jia Bin Yu

When learning a directed acyclic graph (DAG) model via observational data, one generally cannot identify the underlying DAG, but can potentially obtain a Markov equivalence class. The size (the number of DAGs) of a Markov equivalence class is crucial to infer causal effects or to learn the exact causal DAG via further interventions. Given a set of Markov equivalence classes, the distribution of...
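
A sketch of the quantity being discussed: by the Verma–Pearl characterization, two DAGs are Markov equivalent iff they share the same skeleton and the same v-structures, so the size of each class can be counted by brute force on small graphs. The three-node enumeration below is illustrative and is not the paper's method.

```python
from itertools import permutations, combinations
from collections import defaultdict

nodes = (0, 1, 2)
all_arcs = [(i, j) for i in nodes for j in nodes if i != j]

def is_dag(edges):
    # Acyclic iff some ordering of the nodes makes every arc point forward.
    return any(all(order.index(i) < order.index(j) for i, j in edges)
               for order in permutations(nodes))

def signature(edges):
    # Verma-Pearl: same skeleton plus same v-structures
    # (colliders i -> k <- j with i and j non-adjacent).
    skel = frozenset(frozenset(e) for e in edges)
    vs = frozenset((frozenset({i, j}), k)
                   for i, k in edges for j, k2 in edges
                   if k == k2 and i != j
                   and frozenset({i, j}) not in skel)
    return skel, vs

classes = defaultdict(int)
for r in range(len(all_arcs) + 1):
    for edges in combinations(all_arcs, r):
        if is_dag(edges):
            classes[signature(edges)] += 1

print("DAGs on 3 nodes:", sum(classes.values()))         # 25
print("Markov equivalence classes:", len(classes))        # 11
print("class sizes:", sorted(classes.values(), reverse=True))
```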

2010
Itamar Arel Andrew S. Davis

This paper presents a formalism for determining the episode duration distribution in fixed-policy Markov decision processes (MDP). To achieve this goal, we borrow the notion of obtaining the n-step first visit probability from queuing theory, apply it to a Markov chain derived from the MDP, and arrive at the distribution of the episode durations between any two arbitrary states. We illustrate t...
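
One way to make the n-step first-visit idea concrete is to make the target state absorbing in the policy-induced chain, so the first-visit (episode-duration) distribution falls out as the increment of the absorption probability. The 4-state chain below is made up for illustration; it is not the paper's construction.

```python
import numpy as np

# Hypothetical 4-state Markov chain induced by a fixed policy.
P = np.array([[0.1, 0.6, 0.3, 0.0],
              [0.0, 0.2, 0.5, 0.3],
              [0.4, 0.0, 0.1, 0.5],
              [0.3, 0.3, 0.2, 0.2]])
start, target = 0, 3

# Make the target absorbing; P(T = n) is then the increment
# of the absorption probability at step n.
A = P.copy()
A[target] = 0.0
A[target, target] = 1.0

dist = []
v = np.zeros(len(P)); v[start] = 1.0
prev = 0.0
for n in range(1, 31):
    v = v @ A
    dist.append(v[target] - prev)   # P(first visit at exactly step n)
    prev = v[target]

print("P(T = n) for n = 1..5:", np.round(dist[:5], 4))
print("P(T <= 30):", round(prev, 4))
```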

Journal: Universität Trier, Mathematik/Informatik, Forschungsbericht, 2000
Lothar Breuer

In 1995, Pacheco and Prabhu introduced the class of so–called Markov–additive processes of arrivals in order to provide a general class of arrival processes for queueing theory. In this paper, the above class is generalized considerably, including time–inhomogeneous arrival rates, general phase spaces and the arrival space being a general vector space (instead of the finite–dimensional Euclidea...
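
A Markov-modulated Poisson process is one of the simplest members of this class of arrival processes: arrivals are Poisson at a rate that depends on the phase of an underlying Markov chain. A minimal simulation sketch with invented switching and arrival rates:

```python
import random

# Hypothetical 2-phase MMPP: the environment switches between phases with
# exponential holding times; arrivals are Poisson at the phase's rate.
switch_rate = [0.5, 1.0]   # rate of leaving phase 0 / phase 1
arrival_rate = [2.0, 8.0]  # Poisson arrival rate in each phase

def simulate(horizon, seed=0):
    rng = random.Random(seed)
    t, phase, arrivals = 0.0, 0, []
    while t < horizon:
        # Competing exponentials: next phase switch vs. next arrival
        # (valid restart because of memorylessness).
        t_switch = rng.expovariate(switch_rate[phase])
        t_arrival = rng.expovariate(arrival_rate[phase])
        if t_arrival < t_switch:
            t += t_arrival
            if t < horizon:
                arrivals.append(t)
        else:
            t += t_switch
            phase = 1 - phase
    return arrivals

print(len(simulate(horizon=100.0)), "arrivals over 100 time units")
```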

Journal: Math. Oper. Res., 2004
Vladimir Ejov Jerzy A. Filar Minh-Tuan Nguyen

We consider the Hamiltonian cycle problem embedded in a singularly perturbed Markov decision process. We also consider a functional on the space of deterministic policies of the process that consists of the (1,1)-entry of the fundamental matrices of the Markov chains induced by the same policies. We show that when the perturbation parameter, ε, is less than or equal to 1/N², the Hamiltonian cycl...
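
As a rough illustration of the objects involved, the sketch below builds the chain induced by a deterministic policy on a 4-node graph, applies a generic uniform perturbation of size ε (not necessarily the paper's singular perturbation), and evaluates the (1,1)-entry of the fundamental matrix Z = (I − P + 1π)⁻¹ for a Hamiltonian and a non-Hamiltonian policy.

```python
import numpy as np

N = 4
eps = 1.0 / N**2   # the eps <= 1/N^2 threshold quoted above

def perturbed_chain(policy):
    # Deterministic policy arcs, blended with a uniform perturbation.
    P = np.zeros((N, N))
    for s, t in enumerate(policy):
        P[s, t] = 1.0
    return (1 - eps) * P + (eps / N) * np.ones((N, N))

def fundamental_11(P):
    # Stationary distribution pi, then Z = (I - P + 1 pi^T)^{-1};
    # the math (1,1)-entry is Z[0, 0] in 0-based indexing.
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmax(np.real(w))])
    pi = pi / pi.sum()
    Z = np.linalg.inv(np.eye(N) - P + np.outer(np.ones(N), pi))
    return Z[0, 0]

hamiltonian = [1, 2, 3, 0]   # the cycle 0 -> 1 -> 2 -> 3 -> 0
two_cycles  = [1, 0, 3, 2]   # two disjoint 2-cycles, no Hamiltonian cycle
for name, pol in [("Hamiltonian", hamiltonian), ("non-Hamiltonian", two_cycles)]:
    print(f"{name}: (1,1)-entry = {fundamental_11(perturbed_chain(pol)):.4f}")
```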

2010
Gergely Neu András György Csaba Szepesvári

We consider a stochastic extension of the loop-free shortest path problem with adversarial rewards. In this episodic Markov decision problem an agent traverses an acyclic graph with random transitions: at each step of an episode the agent chooses an action, receives some reward, and arrives at a random next state, where the reward and the distribution of the next state depend on the act...
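
A minimal sketch of the episodic setting described above, with a hypothetical three-state loop-free MDP in which freshly drawn per-episode rewards stand in for the adversary's choices:

```python
import random

# Tiny loop-free episodic MDP (hypothetical): every episode runs from
# "s0" to the terminal state "goal", as in the shortest-path setting.
# transitions[state][action] = list of (probability, next_state)
transitions = {
    "s0": {"a": [(0.7, "s1"), (0.3, "s2")], "b": [(0.4, "s1"), (0.6, "s2")]},
    "s1": {"a": [(1.0, "goal")], "b": [(1.0, "goal")]},
    "s2": {"a": [(1.0, "goal")], "b": [(1.0, "goal")]},
}

def run_episode(policy, rewards, rng):
    # rewards[(state, action)] may be re-chosen adversarially every episode.
    state, total = "s0", 0.0
    while state != "goal":
        action = policy[state]
        total += rewards[(state, action)]
        r, acc = rng.random(), 0.0
        for p, nxt in transitions[state][action]:
            acc += p
            if r < acc:
                break
        state = nxt
    return total

rng = random.Random(0)
policy = {"s0": "a", "s1": "b", "s2": "a"}
for ep in range(3):
    # Stand-in for the adversary: fresh rewards each episode.
    rewards = {(s, a): rng.uniform(0, 1)
               for s in transitions for a in transitions[s]}
    print(f"episode {ep}: return {run_episode(policy, rewards, rng):.3f}")
```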

Journal: Journal of Computer and Robotics
Samaneh Assar, Faculty of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran; Behrooz Masoumi, Faculty of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran

Multi-agent Markov decision processes (MMDPs), as the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and are a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning-automata-based algorithm for finding optimal policies in an MMDP is proposed. In the proposed algorithm, MMDP ...
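
The basic building block of such algorithms is a learning automaton. The sketch below implements the standard linear reward-inaction (L_R-I) update on a two-action toy problem; the parameters are illustrative, and the paper's exact scheme may differ.

```python
import random

def lri_update(probs, chosen, reward, a=0.1):
    # Linear reward-inaction: on success, shift probability mass toward
    # the chosen action; on failure, leave probabilities unchanged.
    if reward:
        for i in range(len(probs)):
            if i == chosen:
                probs[i] += a * (1 - probs[i])
            else:
                probs[i] *= (1 - a)
    return probs

# Toy environment: action 1 succeeds 80% of the time, action 0 only 30%.
rng = random.Random(1)
success = [0.3, 0.8]
probs = [0.5, 0.5]
for _ in range(500):
    chosen = 0 if rng.random() < probs[0] else 1
    probs = lri_update(probs, chosen, rng.random() < success[chosen])
print("action probabilities after 500 plays:", [round(p, 3) for p in probs])
```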

2011
David C. Parkes Ariel D. Procaccia

Social choice theory provides insights into a variety of collective decision making settings, but nowadays some of its tenets are challenged by Internet environments, which call for dynamic decision making under constantly changing preferences. In this paper we model the problem via Markov decision processes (MDP), where the states of the MDP coincide with preference profiles and a (determinist...
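
A toy rendering of the state space being described: with 2 voters ranking 3 alternatives, the MDP has 6² = 36 states, and a deterministic social choice function labels each state with a winner. Everything below is a hypothetical instance, not the paper's model.

```python
from itertools import permutations, product

# States of the MDP are preference profiles: one ranking per voter.
alternatives = ("a", "b", "c")
rankings = list(permutations(alternatives))
profiles = list(product(rankings, repeat=2))   # 6^2 = 36 states

def plurality(profile):
    # Deterministic social choice function; ties broken alphabetically.
    tally = {alt: 0 for alt in alternatives}
    for ranking in profile:
        tally[ranking[0]] += 1
    return max(sorted(tally), key=lambda alt: tally[alt])

print("number of MDP states:", len(profiles))
print("example state:", profiles[0], "-> winner:", plurality(profiles[0]))
```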

2014
Krishnendu Chatterjee Martin Chmelik Ayush Kanodia

– Visualize the textual input POMDP using the DOT language.
– Reduce a POMDP with a parity objective to an equivalent POMDP with a coBüchi objective (a parity objective with only 2 priorities), and visualize the result.
– Given a POMDP with a coBüchi objective, construct its belief-observation POMDP Ĝ and visualize it.
– Given the belief-observation POMDP Ĝ, if there exists an almost-sur...
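
A minimal sketch of the first step in this list: emitting a small, hypothetical POMDP's transition structure in the DOT language from Python. The tool's actual input format is not reproduced here.

```python
# Hypothetical POMDP transition structure (state, action) -> [(next, prob)].
pomdp = {
    ("s0", "go"):   [("s1", 0.8), ("s2", 0.2)],
    ("s0", "stay"): [("s0", 1.0)],
    ("s1", "go"):   [("s2", 1.0)],
    ("s2", "go"):   [("s0", 0.5), ("s1", 0.5)],
}

# Emit one DOT edge per transition, labeled "action/probability".
lines = ["digraph pomdp {"]
for (state, action), successors in pomdp.items():
    for nxt, prob in successors:
        lines.append(f'  {state} -> {nxt} [label="{action}/{prob}"];')
lines.append("}")
print("\n".join(lines))   # pipe into `dot -Tpng` to render
```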

Chart: number of search results per year