Cross-Modal Clustering
Author
Abstract
This paper presents a self-supervised algorithm for learning perceptual structures from correlations among different sensory modalities. Over the past half century, the brain and cognitive sciences have gathered an enormous body of neurological and phenomenological evidence demonstrating the extraordinary degree of interaction between sensory modalities during ordinary perception. Inspired by these findings, the paper introduces a framework for creating artificial perceptual systems in which the primary architectural motif is the cross-modal transmission of perceptual information to enhance each sensory channel individually. The underlying hypothesis is that the world has regularities (natural laws tend to correlate physical properties) and that biological perceptual systems have evolved to take advantage of them. These systems share information continually and opportunistically across seemingly disparate perceptual channels, not as an epiphenomenon but as a fundamental component of normal perception. Their artificial counterparts must therefore be able to share information synergistically within their perceptual channels if they are to approach biological degrees of sophistication. This paper is a preliminary step in that direction.
Similar Resources
Cross-Modal Learning via Pairwise Constraints
In multimedia applications, the text and image components in a web document form a pairwise constraint that potentially indicates the same semantic concept. This paper studies cross-modal learning via the pairwise constraint, and aims to find the common structure hidden in different modalities. We first propose a compound regularization framework to deal with the pairwise constraint, which can ...
Cross-modal Clustering in the Acoustic-Articulatory Space
This paper explores cross-modal clustering in the acoustic-articulatory space. A method to improve clustering using information from more than one modality is presented. Formants and the Electromagnetic Articulography measurements are used to study corresponding clusters formed in the two modalities. A measure for estimating the uncertainty in correspondences between one cluster in the acoustic...
Multi- and Cross-Modal Semantics Beyond Vision: Grounding in Auditory Perception
Multi-modal semantics has relied on feature norms or raw image data for perceptual input. In this paper we examine grounding semantic representations in raw auditory data, using standard evaluations for multi-modal semantics, including measuring conceptual similarity and relatedness. We also evaluate cross-modal mappings, through a zero-shot learning task mapping between linguistic and auditory...
Active Speaker Detection and Localization with a Weighted-data Mixture Model
In this paper we address the problem of detecting and locating speakers using audiovisual data. We propose to address this problem in the framework of data clustering. We propose a novel cross-modal clustering method based on finite mixture models and which explores the idea of non-uniform weighting of observations. Weighted-data clustering techniques have already been proposed, but not in a ge...
Heterogeneous Data Co-Clustering by Pseudo-Semantic Affinity Functions
The convergence between Web technology and multimedia production is enabling the distribution of content through dynamic media platforms such as RSS feeds and hybrid digital television. Heterogeneous data clustering is needed to analyse, manage and access desired information from this variety of information sources. This paper defines a new class of pseudo-semantic affinity functions that allow...
A Cross-Modal Concept Detection and Caption Prediction Approach in ImageCLEFcaption Track of ImageCLEF 2017
This article describes the participation of the Computer Science Department of Morgan State University, Baltimore, Maryland, USA in the ImageCLEFcaption under ImageCLEF 2017. The purpose of this research and participation is to be able to predict the caption and detect UMLS concepts of an unknown query (test) image by using Cross Modal Retrieval and Clustering techniques. In our approach, for e...
Publication date: 2005