Probably Approximately Correct Learning in Stochastic Games with Temporal Logic Specifications
نویسندگان
چکیده
We consider a controller synthesis problem in turnbased stochastic games with both a qualitative linear temporal logic (LTL) constraint and a quantitative discounted-sum objective. For each case in which the LTL specification is realizable and can be equivalently transformed into a deterministic Buchi automaton, we show that there always exists a memoryless almost-sure winning strategy that is "-optimal with respect to the discounted-sum objective for any arbitrary positive ". Building on the idea of the R-MAX algorithm, we propose a probably approximately correct (PAC) learning algorithm that can learn such a strategy efficiently in an online manner with a-priori unknown reward functions and unknown transition distributions. To the best of our knowledge, this is the first result on PAC learning in stochastic games with independent quantitative and qualitative objectives.
منابع مشابه
Probably Approximately Correct MDP Learning and Control With Temporal Logic Constraints
We consider synthesis of controllers that maximize the probability of satisfying given temporal logic specifications in unknown, stochastic environments. We model the interaction between the system and its environment as a Markov decision process (MDP) with initially unknown transition probabilities. The solution we develop builds on the so-called model-based probably approximately correct Mark...
متن کاملSymbolic Models for Stochastic Switched Systems: A Discretization and a Discretization-Free Approach
Stochastic switched systems are a relevant class of stochastic hybrid systems with probabilistic evolution over a continuous domain and control-dependent discrete dynamics over a finite set of modes. In the past few years several different techniques have been developed to assist in the stability analysis of stochastic switched systems. However, more complex and challenging objectives related t...
متن کاملLearning Cooperative Games
This paper explores a PAC (probably approximately correct) learning model in cooperative games. Specifically, we are given m random samples of coalitions and their values, taken from some unknown cooperative game; can we predict the values of unseen coalitions? We study the PAC learnability of several well-known classes of cooperative games, such as network flow games, threshold task games, and...
متن کاملApproximately Bisimilar Symbolic Models for Stochastic Switched Systems
Stochastic switched systems are a class of continuous-time dynamical models with probabilistic evolution over a continuous domain and controldependent discrete dynamics over a finite set of locations (modes). As such, they represent a subclass of general stochastic hybrid systems. While the literature has witnessed recent progress in the dynamical analysis and controller synthesis for the stabi...
متن کاملSafe Control under Uncertainty
Controller synthesis for hybrid systems that satisfy temporal specifications expressing various system properties is a challenging problem that has drawn the attention of many researchers. However, making the assumption that such temporal properties are deterministic is far from the reality. For example, many of the properties the controller has to satisfy are learned through machine learning t...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016