Inverse reinforcement learning for multi-player noncooperative apprentice games
Abstract
In this paper, we devise inverse reinforcement learning (RL) algorithms for nonlinear continuous-time systems described by multiplayer differential equations. We define a new class of Multi-player Noncooperative Apprentice Games, in which both the expert and the learner have N-player control inputs. The games are solved by learners reconstructing the unknown performance reward functions of experts from the experts' trajectories, i.e., states and optimal control inputs. We first develop a model-based inverse RL algorithm that involves two stages: an optimal control learning stage and a second inverse optimal control (IOC) stage. Our algorithm solves the IOC as a subproblem. We therefore provide one possible unified framework for dynamic systems. We then develop inverse RL algorithms using neural networks: a completely model-free algorithm for homogeneous control inputs, and a partially model-free algorithm for heterogeneous control inputs. Finally, we present the results of simulations, which verify the validity of our proposed algorithms.
Similar resources
Multi-agent Inverse Reinforcement Learning for Zero-sum Games
In this paper we introduce a Bayesian framework for solving a class of problems termed Multi-agent Inverse Reinforcement Learning (MIRL). Compared to the well-known Inverse Reinforcement Learning (IRL) problem, MIRL is formalized in the context of a stochastic game rather than a Markov decision process (MDP). Games bring two primary challenges: First, the concept of optimality, central to MDPs,...
Reinforcement Learning in Multi-agent Games
This article investigates the performance of independent reinforcement learners in multiagent games. Convergence to Nash equilibria and parameter settings for desired learning behavior are discussed for Q-learning, Frequency Maximum Q value (FMQ) learning and lenient Q-learning. FMQ and lenient Q-learning are shown to outperform regular Q-learning significantly in the context of coordination ga...
Multi-Agent Systems of Inverse Reinforcement Learners in Complex Games
Reinforcement Learning (RL) allows an agent to discover a suitable policy to achieve a goal. However, interesting problems for RL become complex extremely fast, as a function of the number of features that compose the state space. The proposed research is to decompose a core problem into tasks with only the features required to solve the task. The core agent then uses the reward for the task, w...
Multi-class Generalized Binary Search for Active Inverse Reinforcement Learning
This paper addresses the problem of learning a task from demonstration. We adopt the framework of inverse reinforcement learning, where tasks are represented in the form of a reward function. Our contribution is a novel active learning algorithm that enables the learning agent to query the expert for more informative demonstrations, thus leading to more sample-efficient learning. For this novel ...
Repairing Multi-Player Games
Synthesis is the automated construction of systems from their specifications. Modern systems often consist of interacting components, each having its own objective. The interaction among the components is modeled by a multi-player game. Strategies of the components induce a trace in the game, and the objective of each component is to force the game into a trace that satisfies its specification....
Journal
Journal title: Automatica
Year: 2022
ISSN: 1873-2836, 0005-1098
DOI: https://doi.org/10.1016/j.automatica.2022.110524