Learning Intuitive Physics with Multimodal Generative Models

Authors

Abstract

Predicting the future interaction of objects when they come into contact with their environment is key for autonomous agents to take intelligent and anticipatory actions. This paper presents a perception framework that fuses visual and tactile feedback to make predictions about the expected motion of objects in dynamic scenes. Visual information captures object properties such as 3D shape and location, while tactile information provides critical cues about interaction forces and resulting object motion when contact is made with the environment. Utilizing a novel See-Through-your-Skin (STS) sensor that provides high-resolution multimodal sensing of contact surfaces, our system captures both the visual appearance and the tactile properties of objects. We interpret the dual-stream signals from the sensor using a Multimodal Variational Autoencoder (MVAE), allowing us to capture both modalities of contacting objects and to develop a mapping from visual to tactile interaction and vice versa. Additionally, the perceptual system can be used to infer the outcome of future physical interactions, which we validate through simulated and real-world experiments in which the resting state of an object is predicted from given initial conditions.
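MVAEs typically fuse per-modality posteriors (here, visual and tactile) into one joint latent distribution via a product of Gaussian experts: precisions add, and the joint mean is a precision-weighted average, so the more certain modality dominates. Below is a minimal sketch of that fusion step for a single latent dimension; it illustrates the standard product-of-experts rule, not the paper's actual network, and all names are illustrative.

```python
import math

def product_of_experts(mus, logvars):
    """Combine per-modality Gaussian posteriors N(mu_i, var_i) into a
    joint Gaussian via a product of experts: precisions (1/var) add,
    and the joint mean is the precision-weighted mean of the experts."""
    precisions = [math.exp(-lv) for lv in logvars]  # 1/var per expert
    joint_var = 1.0 / sum(precisions)
    joint_mu = joint_var * sum(p * m for p, m in zip(precisions, mus))
    return joint_mu, math.log(joint_var)

# Fuse a hypothetical "visual" expert and "tactile" expert for one
# latent dimension; with equal variances the joint mean is the average.
mu, logvar = product_of_experts(mus=[0.0, 2.0], logvars=[0.0, 0.0])
print(mu)  # 1.0
```

When one modality is missing (e.g. no contact, so no tactile reading), its expert is simply dropped from the product, which is what lets a trained MVAE map vision to touch and vice versa from the shared latent.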


Similar Articles

Joint Multimodal Learning with Deep Generative Models

We investigate deep generative models that can exchange multiple modalities bidirectionally, e.g., generating images from corresponding texts and vice versa. Recently, some studies handle multiple modalities on deep generative models, such as variational autoencoders (VAEs). However, these models typically assume that modalities are forced to have a conditioned relation, i.e., we can only gener...


Semi-supervised Multimodal Learning with Deep Generative Models

In recent years, deep neural networks are used mainly as discriminators of multimodal learning. We should have large amounts of labeled data for training them, but obtaining such data is difficult because it requires much labor to label inputs. Therefore, semi-supervised learning, which improves the discriminator performance using unlabeled data, is important. Among semi-supervised learning, me...


Computational Models of Intuitive Physics

People have a powerful “physical intelligence” – an ability to infer physical properties of objects and predict future states in complex, dynamic scenes – which they use to interpret their surroundings, plan safe and effective actions, build and understand devices and machines, and communicate efficiently. For instance, you can choose where to place your coffee to prevent it from spilling, arra...


Multimodal Generative Models for Scalable Weakly-Supervised Learning

Multiple modalities often co-occur when describing natural phenomena. Learning a joint representation of these modalities should yield deeper and more useful representations. Previous work has proposed generative models to handle multimodal input. However, these models either do not learn a joint distribution or require complex additional computations to handle missing data. Here, we introduce...


Cross-Situational Learning with Bayesian Generative Models for Multimodal Category and Word Learning in Robots

In this paper, we propose a Bayesian generative model that can form multiple categories based on each sensory-channel and can associate words with any of the four sensory-channels (action, position, object, and color). This paper focuses on cross-situational learning using the co-occurrence between words and information of sensory-channels in complex situations rather than conventional situatio...



Journal

Journal Title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i7.16761