Search results for: markov decision process graph theory

Number of results: 2385831

2005
Zinovi Rabinovich Jeffrey S. Rosenschein

In this paper we introduce a novel approach to continual planning and control, called Dynamics Based Control (DBC). The approach is similar in spirit to the Actor-Critic [6] approach to learning and estimation-based differential regulators of classical control theory [12]. However, DBC is not a learning algorithm, nor can it be subsumed within models of standard control theory. We provide a gen...

2004
Eyal Even-Dar Sham M. Kakade Yishay Mansour

We consider an MDP setting in which the reward function is allowed to change during each time step of play (possibly in an adversarial manner), yet the dynamics remain fixed. Similar to the experts setting, we address the question of how well an agent can do when compared to the reward achieved under the best stationary policy over time. We provide efficient algorithms, which have regret bounds...
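A minimal toy sketch of the setting described in this abstract, with assumed names (`transitions`, `reward_sequence`, `run_policy`): the dynamics stay fixed, the per-step rewards change arbitrarily, and regret is measured against the best stationary policy in hindsight. This only illustrates the problem, not the authors' algorithm.

```python
import itertools
import random

# Toy MDP: 2 states, 2 actions, fixed dynamics, time-varying rewards.
N_STATES, N_ACTIONS, HORIZON = 2, 2, 50

# Fixed transition function: next_state = transitions[state][action]
transitions = [[0, 1], [1, 0]]

# Rewards may change arbitrarily at every step: reward_sequence[t][s][a]
rng = random.Random(0)
reward_sequence = [
    [[rng.random() for _ in range(N_ACTIONS)] for _ in range(N_STATES)]
    for _ in range(HORIZON)
]

def run_policy(policy, start_state=0):
    """Total reward of a stationary deterministic policy over the horizon."""
    state, total = start_state, 0.0
    for t in range(HORIZON):
        action = policy[state]
        total += reward_sequence[t][state][action]
        state = transitions[state][action]
    return total

# Best stationary policy in hindsight (brute force over the 4 policies).
best_hindsight = max(
    run_policy(p) for p in itertools.product(range(N_ACTIONS), repeat=N_STATES)
)

# A naive online agent that always plays action 0.
agent_total = run_policy((0, 0))

print(f"best stationary policy: {best_hindsight:.2f}")
print(f"naive agent:            {agent_total:.2f}")
print(f"regret:                 {best_hindsight - agent_total:.2f}")
```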

2008
Nicola Galli

The subject of this paper is Markov chains arising from the study of random graphs. The processes under consideration are interpreted as a discrete-time disintegration of radioactive atoms with anomalies. Our approach is based on (multivariate) generating functions, which allow both exact analyses and asymptotic estimates. In particular, we investigate a traversing algorithm on random graphs and...

2008
Debajyoti Ray Brooks King-Casas P. Read Montague Peter Dayan

Classical game theoretic approaches that make strong rationality assumptions have difficulty modeling human behaviour in economic games. We investigate the role of finite levels of iterated reasoning and non-selfish utility functions in a Partially Observable Markov Decision Process model that incorporates game theoretic notions of interactivity. Our generative model captures a broad class of c...
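For reference, a POMDP model of this kind rests on the standard belief update (notation assumed here, not taken from the paper): after taking action $a$ in belief $b$ and observing $o$,

```latex
b'(s') = \frac{O(o \mid s', a) \sum_{s} T(s' \mid s, a)\, b(s)}
              {\sum_{s''} O(o \mid s'', a) \sum_{s} T(s'' \mid s, a)\, b(s)}
```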

Multi-agent Markov decision processes (MMDPs), the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning-automata-based algorithm for finding optimal policies in an MMDP is proposed. In the proposed algorithm, MMDP ...
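A minimal sketch of the classic linear reward-inaction (L_{R-I}) update that learning-automata algorithms of this kind typically build on; the function and variable names are illustrative, and the paper's actual MMDP algorithm is not reproduced here.

```python
import random

def l_ri_update(probs, chosen, reinforced, lr=0.1):
    """Linear reward-inaction (L_{R-I}) update for one learning automaton.

    probs      : current action-probability vector (sums to 1)
    chosen     : index of the action that was played
    reinforced : True if the environment rewarded the action
    lr         : learning rate in (0, 1)

    On reward, probability mass moves toward the chosen action;
    on penalty, the probabilities are left unchanged (the "inaction" part).
    """
    if not reinforced:
        return probs
    return [
        p + lr * (1.0 - p) if i == chosen else p * (1.0 - lr)
        for i, p in enumerate(probs)
    ]

# Toy usage: one automaton with 3 actions, where action 2 pays off most often.
rng = random.Random(1)
probs = [1 / 3] * 3
for _ in range(500):
    action = rng.choices(range(3), weights=probs)[0]
    reward = rng.random() < (0.9 if action == 2 else 0.3)
    probs = l_ri_update(probs, action, reward)
print([round(p, 3) for p in probs])  # mass typically concentrates on action 2
```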

Journal: Proceedings of the AAAI Conference on Artificial Intelligence 2020

Journal: Indonesian Journal of Electrical Engineering and Computer Science 2020

Journal: J. Complex Networks 2015
Jaideep Ray Ali Pinar Seshadhri Comandur

Markov chains are convenient means of generating realizations of networks with a given (joint or otherwise) degree distribution, since they simply require a procedure for rewiring edges. The major challenge is to find the right number of steps to run such a chain, so that we generate truly independent samples. Theoretical bounds for mixing times of these Markov chains are too large to be practi...
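A minimal sketch, under assumed names, of the degree-preserving edge-rewiring (double edge swap) chain such samplers use; the number of swaps below is arbitrary, which is precisely the mixing-time question the abstract raises.

```python
import random

def double_edge_swap(edges, n_swaps, seed=0):
    """Degree-preserving edge rewiring on a simple undirected graph.

    edges is a set of frozensets {u, v}. Each accepted swap replaces
    {a, b}, {c, d} with {a, c}, {b, d}, which keeps every vertex degree
    fixed. Swaps that would create self-loops or multi-edges are rejected.
    """
    rng = random.Random(seed)
    edges = set(edges)
    for _ in range(n_swaps):
        e1, e2 = rng.sample(list(edges), 2)
        a, b = tuple(e1)
        c, d = tuple(e2)
        if len({a, b, c, d}) < 4:
            continue                      # would create a self-loop
        new1, new2 = frozenset({a, c}), frozenset({b, d})
        if new1 in edges or new2 in edges:
            continue                      # would create a multi-edge
        edges -= {e1, e2}
        edges |= {new1, new2}
    return edges

# Toy usage: rewire a 6-cycle; every vertex keeps degree 2.
cycle = {frozenset({i, (i + 1) % 6}) for i in range(6)}
rewired = double_edge_swap(cycle, n_swaps=100)
print(sorted(tuple(sorted(e)) for e in rewired))
```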

Journal: Discrete Mathematics 1997
Davide Crippa Klaus Simon

We consider a sequence of integer-valued random variables $X_n$, $n \ge 1$, representing a special Markov process whose transition probabilities are given by $q^{\,n+\ell+\alpha}$ and $1 - q^{\,n+\ell+\alpha}$. In this case we can find closed forms for the distribution and the moments of the corresponding random variables, showing that they involve functions such as the $q$-binomial coefficients and the $q$-Stirling n...
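For reference, the q-binomial (Gaussian binomial) coefficient mentioned above is defined by

```latex
\binom{n}{k}_q = \prod_{i=1}^{k} \frac{1 - q^{\,n-k+i}}{1 - q^{\,i}},
\qquad \binom{n}{0}_q = 1 .
```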

Journal: Journal of Industrial Engineering, International 2011
N. Javid A. Makui

This paper models the cell formation problem as a distributed decision network. It proposes an approach based on the application and extension of information theory concepts in order to analyze informational complexity in an agent-based system due to interdependence between agents. Based on this approach, new quantitative concepts and definitions are proposed in order to measure the amount of t...
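As an illustration of the kind of information-theoretic measure of interdependence such an approach can use (the joint-distribution setup and function name here are assumptions, not the paper's definitions), the mutual information between two agents' decisions can be computed as follows.

```python
import math

def mutual_information(joint):
    """Mutual information I(X; Y) in bits from a joint probability table.

    joint[x][y] is P(X = x, Y = y). Higher values mean the two agents'
    decisions are more interdependent; zero means they are independent.
    """
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for x, row in enumerate(joint):
        for y, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (px[x] * py[y]))
    return mi

# Two binary decisions made by two agents.
independent = [[0.25, 0.25], [0.25, 0.25]]
coupled     = [[0.45, 0.05], [0.05, 0.45]]
print(mutual_information(independent))  # ~0.0 bits
print(mutual_information(coupled))      # ~0.53 bits
```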

Chart of the number of search results per year
