M3IL: Multi-Modal Meta-Imitation Learning
Authors
Abstract
Imitation Learning (IL) is anticipated to achieve intelligent robots, since it allows the user to teach various robot tasks easily. In particular, Few-Shot Imitation Learning (FSIL) aims to infer and adapt quickly to unseen tasks with a small amount of data. Although FSIL requires only few-shot data, the high cost of demonstrations in IL is still a critical problem: especially when we want the robot to perform a new task, we need to execute the task for task assignment every time. Inspired by the fact that humans can specify tasks using language instructions without executing them, we propose a multi-modal task-specification setting in this work. The model leverages image and language information in the training phase, and utilizes both modalities or only language in the testing phase. We also propose Multi-Modal Meta-Imitation Learning (M3IL), which can utilize multi-modal information. As a result, M3IL outperforms the baseline in both the standard and the proposed settings. Our result shows the effectiveness and importance of the proposed setting.
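The abstract's key mechanism is a task embedding that can be conditioned on both image and language during training but on language alone at test time. The sketch below illustrates one simple way such a setting can work, by averaging the embeddings of whichever modalities are present. This is a minimal illustration, not the paper's actual architecture: the encoder matrices, dimensions, and fusion-by-averaging choice are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, chosen only for illustration.
IMG_DIM, LANG_DIM, EMB_DIM = 16, 8, 4

# Random linear maps stand in for learned modality encoders.
W_img = rng.normal(size=(IMG_DIM, EMB_DIM))
W_lang = rng.normal(size=(LANG_DIM, EMB_DIM))

def task_embedding(image=None, language=None):
    """Fuse whichever modalities are available by averaging their
    per-modality embeddings, so a downstream policy can be conditioned
    on both (training) or on language alone (testing)."""
    parts = []
    if image is not None:
        parts.append(image @ W_img)
    if language is not None:
        parts.append(language @ W_lang)
    if not parts:
        raise ValueError("at least one modality is required")
    return np.mean(parts, axis=0)

img = rng.normal(size=IMG_DIM)
lang = rng.normal(size=LANG_DIM)

z_train = task_embedding(image=img, language=lang)  # both modalities
z_test = task_embedding(language=lang)              # language only
assert z_train.shape == z_test.shape == (EMB_DIM,)
```

Because both code paths produce an embedding of the same shape, the same conditioned policy can be reused at test time even though the image demonstration is absent, which is the property the proposed setting relies on.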
Similar resources
IntentionGAN: Multi-Modal Imitation Learning from Unstructured Demonstrations
Traditionally, imitation learning has focused on using isolated demonstrations of a particular skill [3]. The demonstration is usually provided in the form of kinesthetic teaching, which requires the user to spend sufficient time to provide the right training data. This constrained setup for imitation learning is difficult to scale to real world scenarios, where robots have to be able to execut...
Burn-In Demonstrations for Multi-Modal Imitation Learning
Recent work on imitation learning has generated policies that reproduce expert behavior from multi-modal data. However, past approaches have focused only on recreating a small number of distinct, expert maneuvers, or have relied on supervised learning techniques that produce unstable policies. This work extends InfoGAIL, an algorithm for multi-modal imitation learning, to reproduce behavior ove...
Coordinated Multi-Agent Imitation Learning
We study the problem of imitation learning from demonstrations of multiple coordinating agents. One key challenge in this setting is that learning a good model of coordination can be difficult, since coordination is often implicit in the demonstrations and must be inferred as a latent variable. We propose a joint approach that simultaneously learns a latent coordination model along with the ind...
Learning Multi-modal Similarity
In many applications involving multi-media data, the definition of similarity between items is integral to several key tasks, e.g., nearest-neighbor retrieval, classification, and recommendation. Data in such regimes typically exhibits multiple modalities, such as acoustic and visual content of video. Integrating such heterogeneous data to form a holistic similarity space is therefore a key cha...
Situated robot learning for multi-modal instruction and imitation of grasping
A key prerequisite to make user instruction of work tasks by interactive demonstration effective and convenient is situated multi-modal interaction aiming at an enhancement of robot learning beyond simple low-level skill acquisition. We report the status of the Bielefeld GRAVIS-robot system that combines visual attention and gestural instruction with an intelligent interface for speech recognit...
Journal
Journal title: Transactions of The Japanese Society for Artificial Intelligence
Year: 2023
ISSN: 1346-0714, 1346-8030
DOI: https://doi.org/10.1527/tjsai.38-2_a-lb3