Deep Learning Generic Features for Cross-Media Retrieval

Authors

  • Xindi Shang
  • Hanwang Zhang
  • Tat-Seng Chua
Abstract

Cross-media retrieval is an imperative approach for handling the explosive growth of multimodal data on the web. However, effectively uncovering the correlations between multimodal data has been a barrier to successful cross-media retrieval. Traditional approaches learn the connection between multiple modalities by directly utilizing hand-crafted low-level heterogeneous features, and the learned correlations are merely constructed in terms of high-level feature representation. To fully exploit the intrinsic structures of multimodal data, it is essential to build an interpretable correlation between the modalities. In this paper, we propose a deep model that learns the high-level feature representation shared by multiple modalities for cross-media retrieval. We learn the discriminative high-level feature representation in a data-driven manner before faithfully encoding the multimodal correlations. We use large-scale multimodal data crawled from the Internet to train our deep model and evaluate its effectiveness on cross-media retrieval using the NUS-WIDE dataset. The experimental results show that the proposed model outperforms other state-of-the-art approaches.
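To make the general idea concrete, the following is a minimal, hypothetical sketch (not the authors' implementation) of a shared-representation model of the kind the abstract describes: two modality-specific branches project image and text features into a common high-level space so that cross-media retrieval reduces to nearest-neighbour search. The feature dimensions, layer sizes, and the bidirectional hinge ranking loss are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of a shared-space model for image-text retrieval.
# All dimensions and the training objective are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedSpaceModel(nn.Module):
    def __init__(self, img_dim=4096, txt_dim=300, shared_dim=256):
        super().__init__()
        # One branch per modality; both map into the same shared space.
        self.img_branch = nn.Sequential(nn.Linear(img_dim, 1024), nn.ReLU(),
                                        nn.Linear(1024, shared_dim))
        self.txt_branch = nn.Sequential(nn.Linear(txt_dim, 1024), nn.ReLU(),
                                        nn.Linear(1024, shared_dim))

    def forward(self, img_feat, txt_feat):
        # L2-normalise so cosine similarity becomes a plain dot product.
        img_emb = F.normalize(self.img_branch(img_feat), dim=-1)
        txt_emb = F.normalize(self.txt_branch(txt_feat), dim=-1)
        return img_emb, txt_emb

def ranking_loss(img_emb, txt_emb, margin=0.2):
    """Bidirectional hinge ranking loss over in-batch negatives
    (an assumed objective; the paper may train differently)."""
    sim = img_emb @ txt_emb.t()                       # pairwise similarities
    pos = sim.diag().unsqueeze(1)                     # matched image-text pairs
    cost_txt = (margin + sim - pos).clamp(min=0)      # image -> wrong text
    cost_img = (margin + sim - pos.t()).clamp(min=0)  # text -> wrong image
    mask = torch.eye(sim.size(0), dtype=torch.bool)
    cost_txt = cost_txt.masked_fill(mask, 0)
    cost_img = cost_img.masked_fill(mask, 0)
    return cost_txt.mean() + cost_img.mean()

# Toy usage with random tensors standing in for CNN image features and
# word-vector text features.
model = SharedSpaceModel()
img, txt = torch.randn(8, 4096), torch.randn(8, 300)
i_emb, t_emb = model(img, txt)
loss = ranking_loss(i_emb, t_emb)
loss.backward()
```

At retrieval time, both query and database items would be embedded into the shared space and ranked by cosine similarity, regardless of which modality the query comes from.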

Similar articles

Learning a Semantic Space by Deep Network for Cross-media Retrieval

With the growth of multimedia data, the problem of cross-media (or cross-modal) retrieval has attracted considerable interest in the cross-media retrieval community. One of the solutions is to learn a common representation for multimedia data. In this paper, we propose a simple but effective deep learning method to address the cross-media retrieval problem between images and text documents for ...

Cross-Media Shared Representation by Hierarchical Learning with Multiple Deep Networks

Inspired by the progress of deep neural networks (DNN) in single-media retrieval, researchers have applied DNNs to cross-media retrieval. These methods mainly follow two-stage learning: the first stage generates a separate representation for each media type, and existing methods only model the intra-media information while ignoring the inter-media correlation with the rich complementa...

Word2VisualVec: Cross-Media Retrieval by Visual Feature Prediction

This paper attacks the challenging problem of cross-media retrieval. That is, given an image, find the text best describing its content, or the other way around. Different from existing works, which rely on either a joint space or a text space, we propose to perform cross-media retrieval in a visual space only. We contribute Word2VisualVec, a deep neural network architecture that learns to pred...

A Machine Learning based Music Retrieval and Recommendation System

In this paper, we present a music retrieval and recommendation system using machine learning techniques. We propose a query-by-humming system for music retrieval that uses deep neural networks for note transcription and a note-based retrieval system for retrieving the correct song from the database. We evaluate our query-by-humming system on the standard MIREX QBSH dataset. We also propose a...

A Modified Grasshopper Optimization Algorithm Combined with CNN for Content Based Image Retrieval

Nowadays, with huge progress in digital imaging, new image processing methods are needed to manage digital images stored on disks. Image retrieval has been one of the most challenging fields in digital image processing; it involves searching a large database in order to find images similar to the query image. Although many efficient studies have been performed on this topic so far, t...

Journal title:

Volume   Issue 

Pages  -

Publication date: 2016