Rank Pooling for Action Recognition
Similar resources
Attentional Pooling for Action Recognition
We introduce a simple yet surprisingly powerful model to incorporate attention in action recognition and human object interaction tasks. Our proposed attention module can be trained with or without extra supervision, and gives a sizable boost in accuracy while keeping the network size and computational cost nearly the same. It leads to significant improvements over state of the art base archite...
Adaptive Structured Pooling for Action Recognition
where $s \in S_k$ and $\Psi_s(p) = 1$ if $p \in s$ and $\Psi_s(p) = 0$ otherwise. $M_k^t$ is L1-normalized and square-rooted. For a video of $T$ frames: $M_k(x, y, t) = \{ M_k^1(x, y), \ldots, M_k^T(x, y) \}$. For each feature $x_m \in X$, with $(x_{x_m}, y_{x_m}, t_{x_m})$ as spatiotemporal coordinates of its centroid, weight $w_m$ as a local integral of the pooling map $M_k$: $w_m = \int_{x_{x_m}-v_x}^{x_{x_m}+v_x} \int_{y_{x_m}-v_y}^{y_{x_m}+v_y} \int_{t_{x_m}-v_t}^{t_{x_m}+v_t} M_k(x, y, t)\,dx\,dy$...
Eigen Evolution Pooling for Human Action Recognition
We introduce Eigen Evolution Pooling, an efficient method to aggregate a sequence of feature vectors. Eigen evolution pooling is designed to produce compact feature representations for a sequence of feature vectors while preserving as much information about the sequence as possible, especially the temporal evolution of the features. Eigen evolution pooling is a general pool...
Second-order Temporal Pooling for Action Recognition
Most successful deep learning models for action recognition generate predictions for short video clips, which are later aggregated into a longer time-frame action descriptor by computing a statistic over these predictions. Zeroth-order (max) or first-order (average) statistics are commonly used. In this paper, we explore the benefits of using second-order statistics. Specifically, we propose a novel e...
Temporal Pyramid Pooling Based Convolutional Neural Networks for Action Recognition
Encouraged by the success of Convolutional Neural Networks (CNNs) in image classification, much effort has recently been spent on applying CNNs to video-based action recognition problems. One challenge is that a video contains a varying number of frames, which is incompatible with the standard input format of CNNs. Existing methods handle this issue either by directly sampling a fixed number of frames or ...
Journal
Journal title: IEEE Transactions on Pattern Analysis and Machine Intelligence
Year: 2017
ISSN: 0162-8828,2160-9292,1939-3539
DOI: 10.1109/tpami.2016.2558148