Exploring Locally Rigid Discriminative Patches for Learning Relative Attributes

Authors

  • Yashaswi Verma
  • C. V. Jawahar
Abstract

Relative attributes help in comparing two images based on their visual properties [4]. They are of great interest because they have been shown to be useful in several vision-related problems such as recognition, retrieval, and understanding image collections in general. In the recent past, several techniques (such as [3, 4, 5, 6]) have been proposed for the relative attribute learning task that give reasonable performance. However, these have focused either on the algorithmic aspect or on the representational aspect. In this work, we revisit these approaches and integrate their broader ideas to develop simple baselines. These not only take care of the algorithmic aspects, but also take a step towards analyzing a simple yet domain-independent patch-based representation [1] for this task.

Given an image, we compute HOG descriptors from non-overlapping square patches and concatenate them. This basic representation efficiently captures local shape in an image, as well as spatially rigid correspondences across regions in an image pair. The motivation behind using it for the relative attribute learning task is the observation that images in several domain-specific datasets (such as shoes and faces) are largely aligned, and spatial variations in the regions of interest are globally minimal (Figure 2).

We integrate this representation with two state-of-the-art approaches: (i) “Global” [4], which learns a single, globally trained ranking model (Ranking SVM [2]) for each attribute, and (ii) “LocalPair” [6], which uses a ranking model trained locally on analogous training pairs for each test pair. A variant of the latter, “LocalPair+ML”, uses a learned distance metric while computing the analogous pairs. The motivation behind the LocalPair approach is that as visual differences within an image pair become more and more subtle, a single prediction model trained on the whole dataset may become inaccurate, because it captures only the coarse details and smooths over the fine-grained properties. LocalPair therefore considers, for each test pair, only the few training pairs that are most analogous to it. These can be thought of as the K training pairs that are most similar to the given test pair. In LocalPair+ML, a learned distance metric gives more importance to those feature dimensions that are more representative of a particular attribute while computing the analogous pairs. Using the identified pairs, both LocalPair and LocalPair+ML learn a local ranking model (specific to the given test pair) similar to [4]. Note that the “Global” approach can be thought of as a special case of the LocalPair approach in which K is the total number of training pairs, so that all of them are considered while learning a ranking model. This is illustrated in Figure 1.

We refer to the above baselines as Global+Hog, LocalPair+Hog and LocalPair+ML+Hog. These baselines are extensively evaluated on three challenging relative attribute datasets: OSR (natural outdoor scenes), LFW-10 (faces) and UT-Zap50K (shoes). When comparing with previous work, we use the representations used there (wherever applicable). Table 1 summarizes the quantitative results. The baselines achieve promising results on the OSR and LFW-10 datasets, and perform better than the current state of the art on the UT-Zap50K dataset (note that the UT-Zap50K-2 dataset, with fine-grained within-pair visual differences, is the most challenging among these datasets). For detailed comparisons, please refer to the paper.
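The relationship between the Global and LocalPair baselines can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: feature vectors are plain lists standing in for concatenated patch descriptors, the distance between two image pairs is taken to be the sum of Euclidean distances between corresponding members, and the ranker is a linear model fitted by subgradient descent on a pairwise hinge loss rather than an off-the-shelf Ranking SVM solver. The names `pair_distance`, `analogous_pairs`, `train_ranker` and `predict_local` are hypothetical.

```python
import math


def pair_distance(pair_a, pair_b):
    """Distance between two image pairs, each a tuple of two feature vectors.

    Assumption: sum of Euclidean distances between corresponding members.
    """
    return sum(math.dist(u, v) for u, v in zip(pair_a, pair_b))


def analogous_pairs(test_pair, train_pairs, k):
    """Indices of the k training pairs closest to the test pair."""
    order = sorted(
        range(len(train_pairs)),
        key=lambda i: pair_distance(test_pair, train_pairs[i]),
    )
    return order[:k]


def train_ranker(pairs, epochs=200, lr=0.1, reg=0.01):
    """Fit a linear ranking vector w on ordered pairs (stronger, weaker).

    Assumption: subgradient descent on reg/2 * |w|^2 + max(0, 1 - w.(a - b)),
    a simple stand-in for a Ranking SVM solver.
    """
    dim = len(pairs[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        for stronger, weaker in pairs:
            diff = [a - b for a, b in zip(stronger, weaker)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            violated = margin < 1.0
            for j in range(dim):
                grad = reg * w[j] - (diff[j] if violated else 0.0)
                w[j] -= lr * grad
    return w


def score(w, x):
    """Attribute strength predicted by the linear ranker."""
    return sum(wi * xi for wi, xi in zip(w, x))


def predict_local(test_pair, train_pairs, k):
    """True if the first test image is ranked above the second.

    With k == len(train_pairs) this reduces to the Global setting.
    """
    idx = analogous_pairs(test_pair, train_pairs, k)
    w = train_ranker([train_pairs[i] for i in idx])
    return score(w, test_pair[0]) > score(w, test_pair[1])


# Toy data (made up): the first feature dimension carries the attribute.
train_pairs = [
    ([1.0, 0.0], [0.0, 0.0]),
    ([2.0, 1.0], [1.0, 1.0]),
    ([5.0, 5.0], [4.0, 5.0]),
    ([0.5, 0.2], [0.1, 0.2]),
]
test_pair = ([0.9, 0.1], [0.2, 0.1])

local_idx = analogous_pairs(test_pair, train_pairs, k=2)
local_result = predict_local(test_pair, train_pairs, k=2)
global_result = predict_local(test_pair, train_pairs, k=len(train_pairs))
```

Setting `k` to the number of training pairs recovers the Global case, which is the special-case relationship described above. A learned metric, as in LocalPair+ML, would replace only `pair_distance`.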


Similar resources

Toward semantic attributes in dictionary learning and non-negative matrix factorization

Binary label information is widely used semantic information in discriminative dictionary learning and non-negative matrix factorization. A Discriminative Dictionary Learning (DDL) algorithm uses the label of some data samples to enhance the discriminative property of sparse signals. A discriminative Non-negative Matrix Factorization (NMF) utilizes label information in learning discriminative b...


Unsupervised Learning of Discriminative Relative Visual Attributes

Unsupervised learning of relative visual attributes is important because it is often infeasible for a human annotator to predefine and manually label all the relative attributes in large datasets. We propose a method for learning relative visual attributes given a set of images for each training class. The method is unsupervised in the sense that it does not require a set of predefined attribut...


DeepCAMP: Deep Convolutional Action&Attribute Mid-Level Patterns

The recognition of human actions and the determination of human attributes are two tasks that call for fine-grained classification. Indeed, often rather small and inconspicuous objects and features have to be detected to tell their classes apart. In order to deal with this challenge, we propose a novel convolutional neural network that mines mid-level image patches that are sufficiently dedicat...


Relative Attributes for Enhanced Human-Machine Communication

We propose to model relative attributes that capture the relationships between images and objects in terms of human-nameable visual properties. For example, the models can capture that animal A is ‘furrier’ than animal B, or image X is ‘brighter’ than image B. Given training data stating how object/scene categories relate according to different attributes, we learn a ranking function per attrib...


Object Recognition via Local Patch Labelling

In recent years the problem of object recognition has received considerable attention from both the machine learning and computer vision communities. The key challenge of this problem is to be able to recognize any member of a category of objects in spite of wide variations in visual appearance due to variations in the form and colour of the object, occlusions, geometrical transformations (such...




Publication date: 2015