Semantic Concept Detection Using Dense Codeword Motion

Authors

  • Claudiu Tanase
  • Bernard Mérialdo
Abstract

When detecting semantic concepts in video, much of the existing research in content-based classification uses keyframe information only. In particular, the combination of local features such as SIFT with the Bag of Words model is very popular with TRECVID participants. The few existing motion and spatiotemporal descriptors are computationally heavy and become impractical when applied to large datasets such as TRECVID. In this paper, we propose a way to efficiently combine positional motion obtained from optic flow in the keyframe with information given by the Dense SIFT Bag of Words feature. The features we propose work by spatially binning motion vectors belonging to the same codeword into separate histograms describing movement direction (left, right, vertical, zero, etc.). Features are mapped using the homogeneous kernel map technique, which approximates the χ² kernel, and classifiers are then trained efficiently using linear SVMs. By using a simple linear fusion technique, we improve the Mean Average Precision of the Bag of Words DSIFT classifier on the TRECVID 2010 Semantic Indexing benchmark from 0.0924 to 0.0972, an increase confirmed to be statistically significant by the standardized TRECVID randomization tests.
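
The binning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each dense SIFT sample already has a codeword index and an optic-flow vector, that the direction bins are exactly zero, left, right and vertical (the abstract's "etc." suggests the real set is larger), and that the histogram is L1-normalized. The names `direction_bin`, `codeword_motion_histogram` and `zero_thresh` are hypothetical, and the spatial grid is left out for brevity.

```python
import numpy as np

# Hypothetical direction set; the abstract lists "left, right, vertical, zero, etc."
DIRECTIONS = ("zero", "left", "right", "vertical")


def direction_bin(dx, dy, zero_thresh=0.5):
    """Assign each flow vector to one of the assumed direction bins."""
    dx = np.asarray(dx, dtype=float)
    dy = np.asarray(dy, dtype=float)
    magnitude = np.hypot(dx, dy)
    # Mostly horizontal vectors go to left (1) or right (2), the rest to vertical (3).
    moving = np.where(np.abs(dx) >= np.abs(dy), np.where(dx < 0, 1, 2), 3)
    # Vectors shorter than the (assumed) threshold count as zero motion (0).
    return np.where(magnitude < zero_thresh, 0, moving)


def codeword_motion_histogram(codewords, flow, vocab_size, zero_thresh=0.5):
    """Build one direction histogram per codeword and flatten the result.

    codewords: (N,) codeword index of each dense SIFT sample.
    flow:      (N, 2) optic-flow vector (dx, dy) at each sample position.
    """
    codewords = np.asarray(codewords, dtype=int)
    flow = np.asarray(flow, dtype=float)
    bins = direction_bin(flow[:, 0], flow[:, 1], zero_thresh)
    hist = np.zeros((vocab_size, len(DIRECTIONS)))
    # Unbuffered accumulation, so repeated (codeword, bin) pairs are all counted.
    np.add.at(hist, (codewords, bins), 1.0)
    total = hist.sum()
    flat = hist.ravel()
    # L1 normalization is an assumption of this sketch.
    return flat / total if total > 0 else flat


if __name__ == "__main__":
    words = [0, 0, 1, 2]
    vectors = [(0.0, 0.0), (-2.0, 0.0), (3.0, 1.0), (0.0, 4.0)]
    print(codeword_motion_histogram(words, vectors, vocab_size=3))
```

A vector of this kind has `vocab_size * len(DIRECTIONS)` entries and is non-negative, which is the setting in which a χ² kernel approximation followed by a linear SVM, as mentioned in the abstract, is typically applied.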

Similar resources

Semantic Motion Concept Retrieval in Non-Static Background Utilizing Spatial-Temporal Visual Information

Motion concepts are concepts that contain motion information, such as racing car and dancing. To achieve retrieval accuracy comparable to that of static concepts such as car or person in semantic retrieval tasks, temporal information has to be considered. Additionally, if a video sequence is captured by an amateur using a hand-held camera containing significant camera motion...

Approximability of Dense Instances of NEAREST CODEWORD Problem

We give a polynomial time approximation scheme (PTAS) for dense instances of the Nearest Codeword problem.

High Dense Crowd Pattern and Anomaly Detection Using Statistical Model

Human crowd behavior analysis is a subject of great research interest nowadays. There is great advantage in investigating dense human crowds in places such as mosques and temples in order to perform automatic surveillance and detect any unusual activity that may be of interest and must be addressed as early as possible to avoid accidents. We present robust statistical skeleton for modeling a dense crowded s...

Combining Motion Understanding and Keyframe Image Analysis for Broadcast Video Information Extraction

We describe a robust new approach to extract semantic concept information based on explicitly encoding static image appearance features together with motion information. For high-level semantic concept detection in broadcast video, we trained multi-modality classifiers which combine the traditional static image features and a new motion feature analysis method (MoSIFT). The exper...

Pooling in image representation: The visual codeword point of view

In this work, we propose BossaNova, a novel representation for content-based concept detection in images and videos, which enriches the Bag-of-Words model. Relying on the quantization of highly discriminant local descriptors by a codebook, and the aggregation of those quantized descriptors into a single pooled feature vector, the Bag-of-Words model has emerged as the most promising approach for ...


Publication date: 2013