Imitation and Transfer Learning for LQG Control
Authors
Abstract
In this paper we study an imitation and transfer learning setting for Linear Quadratic Gaussian (LQG) control, where (i) the system dynamics, noise statistics, and cost function are unknown, and expert data is provided (that is, sequences of optimal inputs and outputs) to learn the LQG controller, and (ii) multiple control tasks are performed with the same system but with different costs. We show that the LQG controller can be learned from a set of expert trajectories of length $n(l+2)-1$, with $n$ and $l$ the dimension of the system state and output, respectively. Further, the controller can be decomposed as the product of an estimation matrix, which depends only on the system dynamics, and a control matrix, which depends only on the cost. This data-based separation principle allows us to transfer the estimation matrix across tasks, and to reduce the length of the expert trajectories needed to learn the LQG controller to $2n+m-1$, with $m$ the dimension of the input (for single-input systems with $l=2$, this yields approximately a $50\%$ reduction of the required expert data).
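To make the counting concrete, here is a minimal Python sketch of the trajectory-length formulas quoted above; the function names and the example dimensions are ours, and only the formulas $n(l+2)-1$ and $2n+m-1$ come from the abstract:

```python
def expert_trajectory_length(n: int, l: int) -> int:
    """Trajectory length needed to learn the LQG controller from scratch."""
    return n * (l + 2) - 1

def transfer_trajectory_length(n: int, m: int) -> int:
    """Trajectory length needed once the estimation matrix is transferred."""
    return 2 * n + m - 1

# Single-input system (m = 1) with two outputs (l = 2), as in the abstract.
n, l, m = 10, 2, 1
full = expert_trajectory_length(n, l)       # 4n - 1 = 39
reduced = transfer_trajectory_length(n, m)  # 2n     = 20
print(f"reduction: {1 - reduced / full:.1%}")  # ~48.7%, i.e., roughly 50%
```

As $n$ grows, $2n / (4n-1)$ approaches $1/2$, which is where the approximately $50\%$ figure in the abstract comes from.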
Similar Resources
Adaptive LQG Control with Loop Transfer Recovery
In this paper we propose for scalar plants an adaptive LQG controller with adaptive input sensitivity function/loop transfer recovery of an associated adaptive LQ design. The sensitivity recovery can be viewed as a frequency-shaped loop recovery where the weights involve a sensitivity function. The adaptive loop/sensitivity recovery is achieved by feeding back the estimation residuals to the co...
Inverse Optimal Heuristic Control for Imitation Learning
One common approach to imitation learning is behavioral cloning (BC), which employs straightforward supervised learning (i.e., classification) to directly map observations to controls. A second approach is inverse optimal control (IOC), which formalizes the problem of learning sequential decision-making behavior over long horizons as a problem of recovering a utility function that explains obse...
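For context on the behavioral-cloning approach described above, here is a minimal illustrative sketch (entirely our own toy example, not this paper's method), using least-squares regression rather than classification to give it a continuous-control flavor:

```python
import numpy as np

# Hypothetical expert demonstrations: observations X and expert controls U.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 4))            # 500 observations, 4 features
K_expert = rng.standard_normal((4, 1))       # unknown expert policy (assumption)
U = X @ K_expert + 0.01 * rng.standard_normal((500, 1))

# Behavioral cloning as supervised learning: fit a policy u = K x by least squares.
K_bc, *_ = np.linalg.lstsq(X, U, rcond=None)

x_new = rng.standard_normal(4)
u_pred = x_new @ K_bc                        # cloned control for a new observation
```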
Adversarial Inverse Optimal Control for General Imitation Learning Losses and Embodiment Transfer
We develop a general framework for inverse optimal control that distinguishes between rationalizing demonstrated behavior and imitating inductively inferred behavior. This enables learning for more general imitative evaluation measures and differences between the capabilities of the demonstrator and those of the learner (i.e., differences in embodiment). Our formulation takes the form of a zero...
Learning Algorithm for LQG Model With Constrained Control
The paper considers a discrete-time linear quadratic Gaussian model with constrained control. It is formulated with Markov systems. With the derivative equation, a performance gradient with respect to control parameters is estimated from a sample path. Then a learning algorithm is proposed to obtain a suboptimal feedback policy in affine linear form. The learning algorithm can be implemented on...
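As a rough illustration of the sample-path gradient idea described above, here is a finite-difference stand-in for the paper's derivative-equation estimate; the toy scalar plant, the cost weights, and all names are our assumptions:

```python
import numpy as np

# Toy scalar plant and quadratic cost (our assumptions, for illustration only).
a, b = 0.9, 1.0        # dynamics: x_{t+1} = a*x_t + b*u_t + w_t
q, r = 1.0, 0.1        # stage cost: q*x^2 + r*u^2
u_max = 1.0            # hard control constraint

def sample_path_cost(theta, T=200, seed=0):
    """Average cost of the affine policy u = clip(k*x + c, -u_max, u_max)."""
    k, c = theta
    g = np.random.default_rng(seed)
    x, cost = 0.0, 0.0
    for _ in range(T):
        u = float(np.clip(k * x + c, -u_max, u_max))
        cost += q * x**2 + r * u**2
        x = a * x + b * u + 0.1 * g.standard_normal()
    return cost / T

# Performance gradient estimated from sample paths by central finite differences
# (common random numbers), followed by gradient descent on the policy parameters.
theta, step, eps = np.zeros(2), 0.05, 1e-3
for it in range(100):
    grad = np.array([(sample_path_cost(theta + eps * e, seed=it)
                      - sample_path_cost(theta - eps * e, seed=it)) / (2 * eps)
                     for e in np.eye(2)])
    theta -= step * grad   # ends at a suboptimal affine feedback policy
```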
LQG Online Learning
Optimal control theory and machine learning techniques are combined to formulate and solve in closed form an optimal control formulation of online learning from supervised examples with regularization of the updates. The connections with the classical linear quadratic gaussian (LQG) optimal control problem, of which the proposed learning paradigm is a nontrivial variation as it involves random ...
Journal
Journal title: IEEE Control Systems Letters
Year: 2023
ISSN: 2475-1456
DOI: https://doi.org/10.1109/lcsys.2023.3285167