Synthesizing safe policies under probabilistic constraints with reinforcement learning and Bayesian model checking
Authors
Abstract
We propose to leverage epistemic uncertainty about constraint satisfaction of a reinforcement learner in safety-critical domains. We introduce a framework for the specification of requirements for reinforcement learners in constrained settings, including confidence about results. We show that an agent's confidence in constraint satisfaction provides a useful signal for balancing optimization and safety in the learning process.
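As an illustration of what "confidence in constraint satisfaction" can mean, the sketch below estimates the posterior probability that a policy satisfies a constraint at least as often as required, given observed episode outcomes. It is a minimal sketch under stated assumptions, not necessarily the authors' method: it assumes a uniform Beta(1, 1) prior and independent episodes, and the names `satisfaction_confidence` and `learning_focus` are hypothetical.

```python
import math


def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def satisfaction_confidence(successes, failures, required_prob, steps=20000):
    """Posterior probability that the true satisfaction rate is >= required_prob.

    Assumption: uniform Beta(1, 1) prior, independent episode outcomes.
    """
    a, b = 1 + successes, 1 + failures
    width = (1.0 - required_prob) / steps
    # Midpoint rule: never evaluates the density exactly at 0 or 1.
    total = sum(
        beta_pdf(required_prob + (i + 0.5) * width, a, b) for i in range(steps)
    )
    return total * width


def learning_focus(confidence, required_confidence):
    """Hypothetical switch: favor safety until confidence is high enough."""
    if confidence >= required_confidence:
        return "optimize return"
    return "reduce constraint violations"


if __name__ == "__main__":
    # 48 episodes satisfied the constraint, 2 violated it; requirement: 90%.
    conf = satisfaction_confidence(48, 2, required_prob=0.9)
    print(round(conf, 3), learning_focus(conf, required_confidence=0.95))
```

The confidence value rises as more satisfying episodes are observed and falls with violations, which is the kind of signal a learner could use to decide whether to prioritize reward or constraint satisfaction.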
Related resources
Synthesizing Neural Network Controllers with Probabilistic Model based Reinforcement Learning
We present an algorithm for rapidly learning controllers for robotics systems. The algorithm follows the model-based reinforcement learning paradigm and improves upon existing algorithms, namely Probabilistic Inference for Learning Control (PILCO) and a sample-based version of PILCO with neural network dynamics (Deep-PILCO). We propose training a neural network dynamics model using variational dropout wi...
Consistency Checking and Querying in Probabilistic Databases under Integrity Constraints
We address the issue of incorporating a particular yet expressive form of integrity constraints (namely, denial constraints) into probabilistic databases. To this aim, we move away from the common way of giving semantics to probabilistic databases, which relies on considering a unique interpretation of the data, and address two fundamental problems: consistency checking and query evaluation. Th...
Reinforcement Learning under Model Mismatch
We study reinforcement learning under model misspecification, where we do not have access to the true environment but only to a reasonably close approximation to it. We address this problem by extending the framework of robust MDPs of [2, 17, 13] to the model-free reinforcement learning setting, where we do not have access to the model parameters, but can only sample states from it. We define ro...
Learning probabilistic classifiers under computational resource constraints
In many online applications of machine learning, the computational resources available will vary from time to time. Surprisingly, existing techniques are designed to accommodate the minimum expected resources, and fail to utilize further resources when they are available. This paper presents an analysis of the relevant categories of computational resource involved, and presents an algorithm tha...
Safe Model-based Reinforcement Learning with Stability Guarantees
Reinforcement learning is a powerful paradigm for learning optimal policies from experimental data. However, to find optimal policies, most reinforcement learning algorithms explore all possible actions, which may be harmful for real-world systems. As a consequence, learning algorithms are rarely applied on safety-critical systems in the real world. In this paper, we present a learning algorith...
Journal
Journal title: Science of Computer Programming
Year: 2021
ISSN: 1872-7964, 0167-6423
DOI: https://doi.org/10.1016/j.scico.2021.102620