Extracting Visual Patterns from Deep Learning Representations
Authors
Abstract
Vector-space word representations based on neural network models can include linguistic regularities, enabling semantic operations based on vector arithmetic. In this paper, we explore an analogous approach applied to images. We define a methodology to obtain large, sparse vectors from individual images and image classes, using a pre-trained model of the GoogLeNet architecture. We evaluate the vector space after processing 20,000 ImageNet images, and find it to be highly correlated with WordNet lexical distances. Further exploration of the image representations shows how semantically similar elements are clustered in that space regardless of large visual variance (e.g., 118 kinds of dogs), and how the space distinguishes abstract classes of objects without supervision (e.g., living things from non-living things). Finally, we consider vector arithmetic, and find it to be related to image concatenation (e.g., “horse cart − horse ≈ rickshaw”), image overlap (“Panda − Brown bear ≈ Skunk”) and regularities (“Panda is to Brown bear as Skunk is to Badger”). All these results indicate that visual semantics contain a large amount of general information, and that those semantics can be extracted as vector representations from neural network models, making them available for further learning and reasoning.
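The analogy test mentioned in the abstract (“Panda is to Brown bear as Skunk is to Badger”) can be sketched with standard vector arithmetic and cosine similarity. The toy binary feature vectors below are purely illustrative stand-ins for the paper's sparse GoogLeNet-derived embeddings; the labels and feature assignments are assumptions, not data from the paper.

```python
import numpy as np

# Toy stand-ins for sparse class embeddings (illustrative only):
# dimensions loosely read as [bear shape, black/white coat,
# small-mammal shape, large body].
embeddings = {
    "panda":      np.array([1.0, 1.0, 0.0, 1.0]),
    "brown_bear": np.array([1.0, 0.0, 0.0, 1.0]),
    "skunk":      np.array([0.0, 1.0, 1.0, 0.0]),
    "badger":     np.array([0.0, 0.0, 1.0, 0.0]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def analogy(a, b, c, embeddings):
    """Answer 'a is to b as ? is to c' by finding the label whose
    vector is closest to vec(a) - vec(b) + vec(c)."""
    query = embeddings[a] - embeddings[b] + embeddings[c]
    candidates = {k: v for k, v in embeddings.items() if k not in (a, b, c)}
    return max(candidates, key=lambda k: cosine(query, candidates[k]))

print(analogy("panda", "brown_bear", "badger", embeddings))  # -> skunk
```

With these vectors, panda − brown_bear isolates the black-and-white-coat feature, and adding badger lands exactly on skunk, mirroring the regularity the abstract describes.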
Similar references
Learning Deep Representations for Scene Labeling with Semantic Context Guided Supervision
Scene labeling is a challenging classification problem where each input image requires a pixel-level prediction map. Recently, deep-learning-based methods have shown their effectiveness on solving this problem. However, we argue that the large intra-class variation provides ambiguous training information and hinders the deep models’ ability to learn more discriminative deep feature representati...
Learning Deep Generative Models
Building intelligent systems that are capable of extracting high-level representations from high-dimensional sensory data lies at the core of solving many artificial intelligence–related tasks, including object recognition, speech perception, and language understanding. Theoretical and biological arguments strongly suggest that building such systems requires models with deep architectures that ...
Towards Deep Interpretability (mus-rover Ii): Learning Hierarchical Representations of Tonal Music
Music theory studies the regularity of patterns in music to capture concepts underlying music styles and composers’ decisions. This paper continues the study of building automatic theorists (rovers) to learn and represent music concepts that lead to human interpretable knowledge and further lead to materials for educating people. Our previous work took a first step in algorithmic concept learni...
A Deep Learning Architecture for Image Representation, Visual Interpretability and Automated Basal-Cell Carcinoma Cancer Detection
This paper presents and evaluates a deep learning architecture for automated basal cell carcinoma cancer detection that integrates (1) image representation learning, (2) image classification and (3) result interpretability. A novel characteristic of this approach is that it extends the deep learning architecture to also include an interpretable layer that highlights the visual patterns that con...
Extracting Visual Knowledge from the Web with Multimodal Learning
We consider the problem of automatically extracting visual objects from web images. Despite the extraordinary advancement in deep learning, visual object detection remains a challenging task. To overcome the deficiency of pure visual techniques, we propose to make use of meta text surrounding images on the Web for enhanced detection accuracy. In this paper we present a multimodal learning algor...
Journal: CoRR
Volume: abs/1507.08818
Issue: -
Pages: -
Publication date: 2015