Search results for: markovian decision process
Number of results: 1,587,387
We consider a two-player sequential game in which agents have the same reward function but may disagree on the transition probabilities of an underlying Markovian model of the world. By committing to play a specific policy, the agent with the correct model can steer the behavior of the other agent, and seek to improve utility. We model this setting as a multi-view decision process, which we use...
This paper presents the analysis of a renewal-input finite-buffer queue in which arriving customers decide either to join the queue, with some probability, or to balk. The service process is a Markovian service process ($MSP$) governed by an underlying $m$-state Markov chain. Employing the supplementary variable and embedded Markov chain techniques, the steady-state system-length distributions at pre...
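The snippet stops before the results, and the paper's full GI/MSP/1/N analysis rests on supplementary variables and an embedded Markov chain. As a minimal, hedged illustration of the balking mechanism alone, the sketch below computes the steady-state system-length distribution for the much simpler M/M/1/N special case, where the model collapses to a birth-death chain and, by PASTA, the pre-arrival and arbitrary-epoch distributions coincide. The function name and parameters are illustrative, not taken from the paper.

```python
import numpy as np

def mm1n_balking_dist(lam, mu, b, N):
    """Steady-state system-length distribution of an M/M/1/N queue where
    each arriving customer joins with probability b and balks otherwise
    (assumed to apply in every state, including an empty system).
    Effective birth rate lam*b and death rate mu give a truncated
    geometric distribution."""
    rho = lam * b / mu
    p = rho ** np.arange(N + 1)   # unnormalized birth-death solution
    return p / p.sum()

print(mm1n_balking_dist(lam=1.0, mu=1.5, b=0.8, N=10))
```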
In this paper we propose the TBPE (Transient Blocking Probability Estimation) routing algorithm, an adaptive method based on Grade of Service forecasting that requires only limited network monitoring and modest computational effort. This routing scheme is compared with the well-known MDP class of routing algorithms, in which Markov Decision Process theory is used for the QoS forecast. Both routing algorithms are ...
A continuous-time Markov decision process (CTMDP) is a generalization of a continuous-time Markov chain in which probabilistic and nondeterministic choices coexist. This paper presents an efficient algorithm for computing the maximum (or minimum) probability of reaching a set of goal states within a given time bound in a uniform CTMDP, i.e., a CTMDP in which the delay-time distribution per stat...
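The snippet does not include the algorithm itself. As a hedged sketch of one standard approach to this problem, the code below computes maximum time-bounded reachability in a uniform CTMDP by uniformization: the Poisson distribution over jump counts is truncated, and a backward induction over the jump index rewards entering the (absorbed) goal set at jump j with the Poisson tail probability of seeing at least j jumps by the deadline. The function name, the action-indexed matrix encoding of P, and the toy example are all assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def ctmdp_max_reach(P, goal, rate, horizon, eps=1e-6):
    """Max probability of reaching `goal` within `horizon` in a uniform
    CTMDP with common exit rate `rate`. P has shape (A, S, S): the
    transition matrices of the uniformized DTMDP, with goal states
    made absorbing."""
    lam = rate * horizon
    K, total = 0, np.exp(-lam)
    psi = [total]
    while total < 1.0 - eps:           # truncate Poisson(lam) at mass 1-eps
        K += 1
        psi.append(psi[-1] * lam / K)
        total += psi[-1]
    psi = np.array(psi)
    tail = np.cumsum(psi[::-1])[::-1]  # tail[j] = P(at least j jumps by T)

    v = np.zeros(P.shape[1])           # value of non-goal states at step K
    for j in range(K, 0, -1):          # backward induction over jump index
        target = np.where(goal, tail[j], v)
        v = np.where(goal, 0.0, (P @ target).max(axis=0))
    return np.where(goal, 1.0, v)

# Toy uniform CTMDP: 3 states, 2 actions, state 2 is the absorbing goal.
P = np.array([[[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]],
              [[0.9, 0.0, 0.1], [0.2, 0.8, 0.0], [0.0, 0.0, 1.0]]])
print(ctmdp_max_reach(P, np.array([False, False, True]), rate=2.0, horizon=1.5))
```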
One of the most important quantities of the exact Markovian SIS epidemic process is the time-dependent prevalence, the average fraction of infected nodes. Unfortunately, the computational complexity of the exact Markovian SIS model grows exponentially with the network size N. In this paper, we evaluate a recently proposed analytic approximate prevalence function intro...
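The approximate prevalence function evaluated in the paper is not given in this snippet. As a stand-in that illustrates why approximations are attractive here, the sketch below integrates the standard N-intertwined mean-field approximation (NIMFA) of SIS, which replaces the 2^N-state Markov chain with N coupled ODEs; all names and parameter values are illustrative.

```python
import numpy as np

def nimfa_prevalence(A, beta, delta, v0, t_max, dt=1e-3):
    """Approximate SIS prevalence y(t) under NIMFA, where v_i(t) is the
    infection probability of node i and
        dv_i/dt = beta * (1 - v_i) * sum_j A_ij v_j - delta * v_i.
    Integrated with forward Euler; A is the adjacency matrix."""
    v = np.array(v0, dtype=float)
    times = np.arange(0.0, t_max, dt)
    prevalence = np.empty_like(times)
    for k, _ in enumerate(times):
        prevalence[k] = v.mean()       # y(t): average fraction infected
        v += dt * (beta * (1.0 - v) * (A @ v) - delta * v)
    return times, prevalence

# Complete graph on 50 nodes, one infected node (in expectation) at t = 0.
N = 50
A = np.ones((N, N)) - np.eye(N)
t, y = nimfa_prevalence(A, beta=0.05, delta=1.0, v0=np.full(N, 1.0 / N), t_max=10.0)
print(y[-1])   # approximate metastable prevalence
```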
We present a new algorithm that extends the Reinforcement Learning framework to Partially Observable Markov Decision Processes (POMDPs). The main idea of our method is to build a state extension, called the exhaustive observable, which allows us to define a new process that is Markovian. We prove that solving this new process, to which classical RL methods can be applied, yields an optimal...
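The snippet does not define the exhaustive-observable construction precisely. A common, simpler instance of the same idea, extending the state until classical RL applies, is tabular Q-learning over a sliding window of recent observations; the sketch below is that variant, not the paper's method. The environment interface (env_reset, env_step) and the window length k are assumptions.

```python
import random
from collections import defaultdict

def q_learning_on_histories(env_reset, env_step, n_actions,
                            k=3, episodes=5000, alpha=0.1, gamma=0.95, eps=0.1):
    """Tabular Q-learning where the 'state' is the tuple of the last k
    observations, an (approximately) Markovian extension of a POMDP.
    env_reset() -> obs;  env_step(action) -> (obs, reward, done)."""
    Q = defaultdict(lambda: [0.0] * n_actions)
    for _ in range(episodes):
        obs = env_reset()
        hist = (obs,) * k                   # pad the initial window
        done = False
        while not done:
            a = (random.randrange(n_actions) if random.random() < eps
                 else max(range(n_actions), key=lambda x: Q[hist][x]))
            obs, r, done = env_step(a)
            nxt = hist[1:] + (obs,)         # slide the observation window
            target = r + (0.0 if done else gamma * max(Q[nxt]))
            Q[hist][a] += alpha * (target - Q[hist][a])
            hist = nxt
    return Q
```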
This paper introduces random versions of successive approximations and multigrid algorithms for computing approximate solutions to a class of finite and infinite horizon Markovian decision problems (MDPs). We prove that these algorithms succeed in breaking the “curse of dimensionality” for a subclass of MDPs known as discrete decision processes (DDPs).
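The snippet names the algorithm family without detail. In the spirit of Rust's random Bellman operator, which this line of work resembles, the hedged sketch below applies value iteration on a uniform random sample of states, self-normalizing the transition densities over the sample (the one-grid variant; the multigrid version refines the sample across levels). The interfaces u(s, a) and p(s_next, s, a) are assumed for illustration.

```python
import numpy as np

def random_successive_approx(u, p, beta, sample_states, num_actions, n_iters=200):
    """Random successive approximation: the expectation in the Bellman
    update is replaced by a self-normalized average over a fixed random
    sample of states.  u(s, a): flow utility;  p(s2, s, a): transition
    density;  beta: discount factor."""
    N = len(sample_states)
    V = np.zeros(N)
    for _ in range(n_iters):
        V_new = np.empty(N)
        for i, s in enumerate(sample_states):
            q = []
            for a in range(num_actions):
                w = np.array([p(s2, s, a) for s2 in sample_states])
                w /= w.sum()                  # self-normalized weights
                q.append(u(s, a) + beta * (w @ V))
            V_new[i] = max(q)                 # random Bellman operator
        V = V_new
    return V
```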
Typical analyses of Markovian process models address only the expected utility that the process can obtain. Modal logic, on the other hand, offers a systematic method of characterizing processes by combining various modal operators. A multivalued temporal logic for Markov chains and Markov decision processes was recently proposed in [1]. Here, we discuss how it can be extende...
Motivated by the dispatching of trucks to shovels in surface mines, we study optimal routing in a Markovian finite-source, multi-server queueing system with heterogeneous servers, each with a separate queue. We formulate the problem of routing customers to servers to maximize system throughput as a Markov Decision Process. When the servers are homogeneous, we demonstrate that the Shortest Qu...
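The paper's MDP formulation is not reproduced in the snippet. As a hedged companion, the sketch below is only a finite-source discrete-event simulation in which trucks alternate between exponential travel times and service at one of the parallel servers; it lets one compare Shortest Queue routing against, say, random routing on homogeneous servers. All parameter values and names are illustrative.

```python
import heapq, itertools, random

def throughput(route, n_trucks=8, lam=0.5, mus=(1.0, 1.0), t_end=2e4, seed=0):
    """Finite-source routing: each of n_trucks travels for Exp(lam) time,
    is routed by route(queues) to one server, is served FCFS at rate
    mus[i], then travels again.  Returns completions per unit time."""
    rng = random.Random(seed)
    tick = itertools.count()               # tie-breaker for the event heap
    queues = [0] * len(mus)                # jobs per server, incl. in service
    events = [(rng.expovariate(lam), next(tick), 'arrive', None)
              for _ in range(n_trucks)]
    heapq.heapify(events)
    done, t = 0, 0.0
    while True:
        t, _, kind, srv = heapq.heappop(events)
        if t > t_end:
            break
        if kind == 'arrive':
            srv = route(queues)
            queues[srv] += 1
            if queues[srv] == 1:           # server was idle: begin service
                heapq.heappush(events,
                    (t + rng.expovariate(mus[srv]), next(tick), 'done', srv))
        else:                              # service completion at server srv
            done += 1
            queues[srv] -= 1
            if queues[srv] > 0:            # start the next job in that queue
                heapq.heappush(events,
                    (t + rng.expovariate(mus[srv]), next(tick), 'done', srv))
            heapq.heappush(events,         # the truck returns to the source
                (t + rng.expovariate(lam), next(tick), 'arrive', None))
    return done / t

shortest_queue = lambda q: min(range(len(q)), key=q.__getitem__)
random_routing = lambda q: random.randrange(len(q))
print(throughput(shortest_queue), throughput(random_routing))
```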
We ascertain the diffusively scaled limit of a periodic Lorentz process in a strip with an almost reflecting wall at the origin. Here, almost reflecting means that the wall contains a small hole that shrinks over time. The limiting process is a quasi-reflected Brownian motion, which is Markovian but not strong Markov. Local time results for the periodic Lorentz process, which are of independent interest,...