Mid-level features and spatio-temporal context for activity recognition
Abstract
Local spatio-temporal features have been shown to be effective and robust for representing simple actions. However, for high-level human activities with long-range motion or multiple interacting body parts and persons, the limitations of low-level features become pronounced because of their locality. This paper addresses the problem by proposing a framework that computes mid-level features and takes their contextual information into account. First, we represent human activities by a set of mid-level components, referred to as activity components, which have consistent structure in the spatial domain and consistent motion in the temporal domain. These activity components are extracted hierarchically from videos: key-points are extracted, grouped into trajectories, and the trajectories are finally clustered into components. Second, to further exploit the interdependencies of the activity components, we introduce a spatio-temporal context kernel (STCK), which not only captures local properties of features but also considers their spatial and temporal context. Experiments conducted on two challenging activity recognition datasets show that the proposed approach outperforms standard spatio-temporal features and that the STCK further improves performance. © 2012 Elsevier Ltd. All rights reserved.
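The last step of the hierarchy described in the abstract, clustering trajectories into components, can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the clustering algorithm, so the greedy grouping rule, the function names `summarize` and `group_trajectories`, and the thresholds `max_dist` and `max_motion_diff` are all assumptions made for illustration. The sketch only shows the general idea that trajectories which are close in space and move alike could be merged into one component.

```python
from math import hypot


def summarize(traj):
    """Return (mean_x, mean_y, mean_dx, mean_dy) for a list of (x, y) points.

    Hypothetical descriptor: mean position plus mean per-frame displacement.
    """
    n = len(traj)
    mean_x = sum(p[0] for p in traj) / n
    mean_y = sum(p[1] for p in traj) / n
    steps = max(n - 1, 1)  # avoid division by zero for one-point trajectories
    mean_dx = (traj[-1][0] - traj[0][0]) / steps
    mean_dy = (traj[-1][1] - traj[0][1]) / steps
    return mean_x, mean_y, mean_dx, mean_dy


def group_trajectories(trajs, max_dist=20.0, max_motion_diff=1.0):
    """Greedily group trajectories into components (illustrative assumption).

    A trajectory joins the first existing component whose seed trajectory is
    within max_dist in mean position and within max_motion_diff in mean
    displacement; otherwise it starts a new component.
    Returns a list of components, each a list of trajectory indices.
    """
    components = []  # each entry: (seed_summary, [trajectory indices])
    for i, traj in enumerate(trajs):
        s = summarize(traj)
        for seed, members in components:
            close = hypot(s[0] - seed[0], s[1] - seed[1]) <= max_dist
            similar = hypot(s[2] - seed[2], s[3] - seed[3]) <= max_motion_diff
            if close and similar:
                members.append(i)
                break
        else:
            components.append((s, [i]))
    return [members for _, members in components]


# Toy data: two nearby trajectories moving right, one far away moving down,
# and one nearby trajectory moving left.
toy_trajectories = [
    [(0, 0), (1, 0), (2, 0)],
    [(0, 5), (1, 5), (2, 5)],
    [(100, 100), (100, 101), (100, 102)],
    [(0, 2), (-1, 2), (-2, 2)],
]
toy_components = group_trajectories(toy_trajectories)
```

In this toy example the first two trajectories share a component, while the distant one and the one moving in the opposite direction each form their own. A real system would replace the greedy rule with a proper clustering method and richer trajectory descriptors.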
Similar resources
Spatio-Temporal Context of Mid-level Features for Activity Recognition
Local spatio-temporal features have been shown to be efficient and robust to represent simple actions. However, for complicated human activities with long-range motion or multiple interactive body parts and persons, the limitation of low-level features blows up because of their local properties and the lack of context. This paper addresses the problem by suggesting a framework for both computin...
Recognition of Visual Events using Spatio-Temporal Information of the Video Signal
Recognition of visual events as a video analysis task has become popular in the machine learning community. While traditional approaches to video event detection have been used for a long time, recently developed deep-learning-based methods have revolutionized this area. They have enabled event recognition systems to achieve detection rates which were not reachable by traditional approac...
Context-aware Modeling for Spatio-temporal Data Transmitted from a Wireless Body Sensor Network
Context-aware systems must be interoperable and work across different platforms at any time and in any place. Context data collected from wireless body area networks (WBAN) may be heterogeneous and imperfect, which makes their design and implementation difficult. In this research, we introduce a model which takes the dynamic nature of a context-aware system into consideration. This model is con...
Extreme Learning Machine for Large-Scale Action Recognition
In this paper, we describe the method we applied for the action recognition task on the THUMOS 2014 challenge dataset. We study human action recognition in RGB videos through low-level features by focusing on improved trajectory features that are densely extracted from the spatio-temporal volume. We represent each video with Fisher vector encoding and additional mid-level features. Finally, we...
Evaluation of Tests for Separability and Symmetry of Spatio-temporal Covariance Function
In recent years, some investigations have been carried out to examine the assumptions like stationarity, symmetry and separability of spatio-temporal covariance function which would considerably simplify fitting a valid covariance model to the data by parametric and nonparametric methods. In this article, assuming a Gaussian random field, we consider the likelihood ratio separability test, a va...
Journal title:
- Pattern Recognition
Volume 45, Issue
Pages -
Publication date 2012