GeThR-Net: A Generalized Temporally Hybrid Recurrent Neural Network for Multimodal Information Fusion

Authors

  • Ankit Gandhi
  • Arjun Sharma
  • Arijit Biswas
  • Om Deshmukh
Abstract

Data generated from real-world events are usually temporal and contain multimodal information such as audio, visual, depth, and sensor streams, which must be intelligently combined for classification tasks. In this paper, we propose a novel generalized deep neural network architecture in which temporal streams from multiple modalities are combined. There are a total of M+1 components in the proposed network, where M is the number of modalities. The first component is a novel temporally hybrid Recurrent Neural Network (RNN) that exploits the complementary nature of the multimodal temporal information by allowing the network to learn both modality-specific temporal dynamics and the dynamics in a multimodal feature space. M additional components are added to the network to extract discriminative but non-temporal cues from each modality. Finally, the predictions from all of these components are linearly combined using a set of automatically learned weights. We perform exhaustive experiments on three different datasets spanning four modalities. The proposed network yields relative improvements of 3.5%, 5.7%, and 2% over the best-performing temporal multimodal baseline on the UCF-101, CCV, and Multimodal Gesture datasets, respectively.
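The architecture described in the abstract lends itself to a compact sketch. Below is a minimal, hypothetical PyTorch rendering of the three ingredients named above: per-modality LSTMs plus an LSTM over the concatenated features (the temporally hybrid component), one non-temporal classifier per modality, and a learned linear combination of all M+1 component predictions. All class names, layer sizes, and pooling choices here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class GeThRNetSketch(nn.Module):
    """Illustrative sketch of the GeThR-Net idea (not the authors' code).

    Components:
      * one LSTM per modality (modality-specific temporal dynamics),
      * one LSTM over concatenated features (multimodal temporal dynamics),
      * one non-temporal MLP per modality (discriminative static cues),
      * a learned linear (softmax-weighted) fusion of all predictions.
    """

    def __init__(self, feat_dims, hidden_dim, num_classes):
        super().__init__()
        M = len(feat_dims)
        self.modal_rnns = nn.ModuleList(
            [nn.LSTM(d, hidden_dim, batch_first=True) for d in feat_dims])
        self.fused_rnn = nn.LSTM(sum(feat_dims), hidden_dim, batch_first=True)
        self.temporal_head = nn.Linear((M + 1) * hidden_dim, num_classes)
        self.static_heads = nn.ModuleList(
            [nn.Sequential(nn.Linear(d, hidden_dim), nn.ReLU(),
                           nn.Linear(hidden_dim, num_classes))
             for d in feat_dims])
        # One weight per component (temporal + M non-temporal), learned jointly.
        self.fusion_weights = nn.Parameter(torch.zeros(M + 1))

    def forward(self, streams):
        # streams: list of M tensors, each (batch, time, feat_dims[m]);
        # assumes the modality streams are time-aligned (same length).
        last_states = [rnn(x)[0][:, -1]
                       for rnn, x in zip(self.modal_rnns, streams)]
        fused_last = self.fused_rnn(torch.cat(streams, dim=-1))[0][:, -1]
        temporal_logits = self.temporal_head(
            torch.cat(last_states + [fused_last], dim=-1))
        # Non-temporal cues: here simply mean-pooled features per modality.
        static_logits = [head(x.mean(dim=1))
                         for head, x in zip(self.static_heads, streams)]
        preds = torch.stack([temporal_logits] + static_logits)  # (M+1, B, C)
        w = torch.softmax(self.fusion_weights, dim=0)  # learned combination
        return (w[:, None, None] * preds).sum(dim=0)
```

A two-modality call might look like `model = GeThRNetSketch([128, 40], 256, 101)` followed by `logits = model([video_feats, audio_feats])`; the softmax over the fusion weights is one simple way to keep the learned combination a convex mixture.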


Similar Resources

Modeling Multimodal Clues in a Hybrid Deep Learning Framework for Video Classification

Videos are inherently multimodal. This paper studies the problem of how to fully exploit the abundant multimodal clues for improved video categorization. We introduce a hybrid deep learning framework that integrates useful clues from multiple modalities, including static spatial appearance information, motion patterns within a short time window, audio information as well as long-range temporal ...


Egocentric video description based on temporally-linked sequences

Egocentric vision consists of acquiring images throughout the day from a first-person point of view using wearable cameras. The automatic analysis of this information allows the discovery of daily patterns for improving the user's quality of life. A natural topic that arises in egocentric vision is storytelling, that is, how to understand and tell the story lying behind the pictures. In this paper...


Numerical solution of fuzzy differential equations under generalized differentiability by fuzzy neural network

In this paper, we interpret a fuzzy differential equation by using the strongly generalized differentiability concept. Utilizing the generalized characterization theorem, a novel hybrid method based on the learning algorithm of a fuzzy neural network is then presented for the solution of a differential equation with a fuzzy initial value. Here the neural network is considered as a part of a large field called ne...


Uncertainty Measurement for Ultrasonic Sensor Fusion Using Generalized Aggregated Uncertainty Measure 1

In this paper, target differentiation based on patterns in the data obtained by a set of two ultrasonic sensors is considered. A neural-network-based target classifier is applied to these data to categorize the data of each sensor. The results are then fused together using Dempster–Shafer theory (DST) and Dezert–Smarandache theory (DSmT) to make the final decision. The Generalized Aggregated Unce...
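As a concrete illustration of the DST fusion step mentioned in this abstract, the sketch below implements Dempster's rule of combination for two mass functions over a small frame of discernment. The sensor names, hypotheses, and mass values are invented for the example, and the DSmT variant is not shown.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two Dempster-Shafer mass functions (dicts mapping
    frozenset hypotheses to masses) with Dempster's rule,
    normalizing out the conflicting mass K."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: sources are incompatible")
    return {h: w / (1.0 - conflict) for h, w in combined.items()}

# Hypothetical masses from two ultrasonic-sensor classifiers over
# the frame {plane (P), corner (C), edge (E)}.
P, C, E = frozenset("P"), frozenset("C"), frozenset("E")
m_sensor1 = {P: 0.6, C: 0.3, P | C | E: 0.1}
m_sensor2 = {P: 0.5, E: 0.2, P | C | E: 0.3}
print(dempster_combine(m_sensor1, m_sensor2))
```

Running this combines the two evidence sources and redistributes the conflicting mass (0.33 in this example) across the surviving hypotheses.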


Multimodal Gesture Recognition Using Multi-stream Recurrent Neural Network

In this paper, we present a novel method for multimodal gesture recognition based on neural networks. Our multi-stream recurrent neural network (MRNN) is a completely data-driven model that can be trained end-to-end without domain-specific hand engineering. The MRNN extends recurrent neural networks with Long Short-Term Memory cells (LSTM-RNNs) that facilitate the handling of variable-leng...
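The variable-length handling this abstract alludes to is typically achieved in modern frameworks with sequence packing. Below is a minimal, hypothetical PyTorch example of feeding variably long gesture feature streams to an LSTM; it illustrates the general technique, not the MRNN authors' implementation, and the dimensions and class count are made up.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_sequence

# Three gesture sequences of different lengths, each with 64-dim features.
seqs = [torch.randn(t, 64) for t in (37, 52, 21)]
lengths = torch.tensor([len(s) for s in seqs])

padded = pad_sequence(seqs, batch_first=True)          # (3, 52, 64)
packed = pack_padded_sequence(padded, lengths,
                              batch_first=True, enforce_sorted=False)

lstm = nn.LSTM(input_size=64, hidden_size=128, batch_first=True)
_, (h_n, _) = lstm(packed)            # h_n: last hidden state, (1, 3, 128)
logits = nn.Linear(128, 20)(h_n[-1])  # e.g. 20 gesture classes
print(logits.shape)                   # torch.Size([3, 20])
```

Packing ensures the LSTM's final hidden state for each sequence comes from its true last time step rather than from padding.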



Journal:

Volume:   Issue:

Pages:  -

Publication date: 2016