Affect in Multimedia: Benchmarking Violent Scenes Detection

Abstract

In this article, we report on the creation of a publicly available, common evaluation framework for Violent Scenes Detection (VSD) in Hollywood and YouTube videos. We propose a robust data set, VSD96, with more than 96 hours of video of various genres, annotations at different levels of detail (e.g., shot-level, segment-level), annotations of mid-level concepts (e.g., blood, fire), pre-computed multi-modal descriptors, and over 230 system output results as baselines. This is the most comprehensive data set available to date tailored to the VSD task, and it was extensively validated during the MediaEval benchmarking campaigns. Furthermore, we provide an in-depth analysis of the crucial components of VSD algorithms by reviewing the capabilities and evolution of existing systems (e.g., overall trends and outliers, the influence of the employed features and fusion techniques, the influence of deep learning approaches). Finally, we discuss the possibility of going beyond state-of-the-art performance via an ad-hoc late fusion approach. Experimentation is carried out on the VSD96 data. We provide the most important lessons learned and gained insights. The increasing number of publications using the VSD96 data underlines the importance of the topic. The presented and published resources are a practitioner's guide and also a strong baseline to overcome, which will help researchers in the coming years in analyzing aspects of audio-visual affect and violence detection in movies.

Similar resources

NII-UIT at MediaEval 2014 Violent Scenes Detection Affect Task

Violent scene detection (VSD) is a challenging problem because of the heterogeneous content, large variations in video quality, and semantic meaning of the concepts. The Violent Scenes Detection Task of MediaEval [1] provides a common dataset and evaluation protocol, thus enabling a fair comparison of methods. In this paper, we describe our VSD system used in MediaEval 2014 and briefly discuss th...

ViVoLab and CVLab - MediaEval 2014: Violent Scenes Detection Affect Task

This paper describes the ViVoLab/CVLab system to provide segments of violent scenes from Hollywood movies and “wild-user” videos from the internet. We propose a system based on a fusion of acoustic features, audio concepts and video features. Our joint audio-visual approach achieves MAP2014 values of 17.81% and 43.03%, for the main task and the generalization task, respectively.

The MediaEval 2014 Affect Task: Violent Scenes Detection

This paper provides a description of the MediaEval 2014 Affect Task: Violent Scenes Detection, which is running for the fourth year. The task originates from a use case at Technicolor that aims to help users find suitable content in a movie database. We provide insights on the use case, task challenges, data set and ground truth, required and optional participant runs, and evaluation metrics.

NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task

We present a comprehensive evaluation of the performance of shot-based visual feature representations for the MediaEval 2012 Violent Scenes Detection Affect Task. Rather than using keyframe-based features as we did last year, we apply shot-based features built from global features (color moments, color histogram, edge orientation histogram, and local binary patterns) for violent scenes detection. Besides that, w...

NII-UIT at MediaEval 2013 Violent Scenes Detection Affect Task

We present a comprehensive evaluation of shot-based visual and audio features for the MediaEval 2013 Violent Scenes Detection Affect Task. To obtain visual features, we use global features, local SIFT features, and motion features. For audio features, the popular MFCC is employed. Besides that, we also evaluate the performance of mid-level features, which are constructed using visual concepts. We comb...

Journal

Journal title: IEEE Transactions on Affective Computing

Year: 2022

ISSN: 1949-3045, 2371-9850

DOI: https://doi.org/10.1109/taffc.2020.2986969