Synthesizing safe policies under probabilistic constraints with reinforcement learning and Bayesian model checking

Abstract

We propose to leverage epistemic uncertainty about constraint satisfaction of a reinforcement learner in safety-critical domains. We introduce a framework for the specification of requirements for reinforcement learners in constrained settings, including confidence about results. We show that an agent's confidence in constraint satisfaction provides a useful signal for balancing optimization and safety in the learning process.
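As an illustration of what confidence in constraint satisfaction can mean, here is a minimal sketch. It is not the authors' implementation. It assumes each episode either satisfies or violates the constraint (a Bernoulli outcome) and that the prior over the unknown satisfaction probability is uniform, Beta(1, 1). The function names `satisfaction_confidence` and `prioritize_safety`, and the thresholds `p_required` and `c_required`, are hypothetical.

```python
from math import comb


def satisfaction_confidence(successes: int, failures: int, p_required: float) -> float:
    """Posterior probability that the true satisfaction probability is >= p_required.

    Assumption: uniform Beta(1, 1) prior, so the posterior is
    Beta(successes + 1, failures + 1). For integer parameters the Beta
    tail probability equals a binomial sum, which avoids any dependency
    beyond the standard library.
    """
    n = successes + failures + 1
    return sum(
        comb(n, j) * p_required**j * (1.0 - p_required) ** (n - j)
        for j in range(successes + 1)
    )


def prioritize_safety(successes: int, failures: int,
                      p_required: float, c_required: float) -> bool:
    """Hypothetical decision rule: focus on safety while confidence is too low."""
    return satisfaction_confidence(successes, failures, p_required) < c_required
```

A learner could call `prioritize_safety` after each batch of episodes and, for example, weight the constraint more heavily in its objective while the function returns `True`. How the signal is actually fed back into learning is a design choice that the abstract does not specify.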

Similar resources

Synthesizing Neural Network Controllers with Probabilistic Model based Reinforcement Learning

We present an algorithm for rapidly learning controllers for robotics systems. The algorithm follows the model-based reinforcement learning paradigm, and improves upon existing algorithms; namely Probabilistic learning in Control (PILCO) and a sample-based version of PILCO with neural network dynamics (Deep-PILCO). We propose training a neural network dynamics model using variational dropout wi...

Consistency Checking and Querying in Probabilistic Databases under Integrity Constraints

We address the issue of incorporating a particular yet expressive form of integrity constraints (namely, denial constraints) into probabilistic databases. To this aim, we move away from the common way of giving semantics to probabilistic databases, which relies on considering a unique interpretation of the data, and address two fundamental problems: consistency checking and query evaluation. Th...

Reinforcement Learning under Model Mismatch

We study reinforcement learning under model misspecification, where we do not have access to the true environment but only to a reasonably close approximation to it. We address this problem by extending the framework of robust MDPs of [2, 17, 13] to the model-free Reinforcement Learning setting, where we do not have access to the model parameters, but can only sample states from it. We define ro...

Learning probabilistic classifiers under computational resource constraints

In many online applications of machine learning, the computational resources available will vary from time-to-time. Surprisingly, existing techniques are designed to accommodate the minimum expected resources, and fail to utilize further resources when they are available. This paper presents an analysis of the relevant categories of computational resource involved, and presents an algorithm tha...

Safe Model-based Reinforcement Learning with Stability Guarantees

Reinforcement learning is a powerful paradigm for learning optimal policies from experimental data. However, to find optimal policies, most reinforcement learning algorithms explore all possible actions, which may be harmful for real-world systems. As a consequence, learning algorithms are rarely applied on safety-critical systems in the real world. In this paper, we present a learning algorith...

Journal

Journal title: Science of Computer Programming

Year: 2021

ISSN: 1872-7964, 0167-6423

DOI: https://doi.org/10.1016/j.scico.2021.102620