Spatio-Channel Attention Blocks for Cross-modal Crowd Counting

Abstract

Crowd counting research has made significant advancements in real-world applications, but it remains a formidable challenge in cross-modal settings. Most existing methods rely solely on the optical features of RGB images, ignoring the feasibility of other modalities such as thermal and depth images. The inherent differences between the modalities and the diversity of design choices for model architectures make cross-modal crowd counting more challenging. In this paper, we propose Cross-modal Spatio-Channel Attention (CSCA) blocks, which can be easily integrated into any modality-specific architecture. The CSCA blocks first spatially capture global functional correlations among the modalities with less overhead through spatial-wise cross-modal attention. The cross-modal features with spatial attention are subsequently refined through adaptive channel-wise feature aggregation. In our experiments, the proposed block consistently shows performance improvement across various backbone networks, resulting in state-of-the-art results in RGB-T and RGB-D crowd counting.

Keywords: Crowd counting, Cross-modal, Attention
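The two stages named in the abstract (spatial-wise cross-modal attention, then channel-wise aggregation) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the absence of learned projections, and the parameter-free softmax gate used for channel aggregation are all assumptions made for illustration. Feature maps are assumed to be flattened to a list of N spatial positions, each holding C channel values.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_spatial_attention(query_feats, context_feats):
    """Each position of one modality attends over all positions of the other.

    Assumption: scaled dot-product attention with no learned projections.
    """
    scale = math.sqrt(len(query_feats[0]))
    attended = []
    for q in query_feats:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in context_feats]
        weights = softmax(scores)
        attended.append([
            sum(w * k[c] for w, k in zip(weights, context_feats))
            for c in range(len(q))
        ])
    return attended


def channel_wise_aggregation(own_feats, attended_feats):
    """Blend own and attended features with one gate pair per channel.

    Assumption: the gate is a softmax over globally averaged channel
    responses, a hypothetical stand-in for a learned adaptive gate.
    """
    n, channels = len(own_feats), len(own_feats[0])
    gates = []
    for c in range(channels):
        mean_own = sum(f[c] for f in own_feats) / n
        mean_att = sum(f[c] for f in attended_feats) / n
        gates.append(softmax([mean_own, mean_att]))
    return [
        [gates[c][0] * own_feats[i][c] + gates[c][1] * attended_feats[i][c]
         for c in range(channels)]
        for i in range(n)
    ]


def csca_like_block(rgb_feats, other_feats):
    """Apply both stages symmetrically to the two modality streams."""
    rgb_att = cross_modal_spatial_attention(rgb_feats, other_feats)
    other_att = cross_modal_spatial_attention(other_feats, rgb_feats)
    return (channel_wise_aggregation(rgb_feats, rgb_att),
            channel_wise_aggregation(other_feats, other_att))


# Toy example: 3 spatial positions, 2 channels per modality.
rgb = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
thermal = [[0.2, 0.8], [0.9, 0.1], [0.4, 0.6]]
rgb_out, thermal_out = csca_like_block(rgb, thermal)
```

Each stream keeps its original shape, so a block of this kind could in principle sit between the layers of two modality-specific backbones, which matches the abstract's description of the block as easy to integrate into any such architecture.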

Similar resources

Cross-modal attention enhances perceived contrast.

Visual Attention. Each time we open our eyes, we are confronted with an overwhelming amount of information. How is it possible, then, that we still have a strong impression that we understand what we see? Visual attention is the mechanism that turns looking into seeing, allowing us to select a certain location or aspect of a busy visual scene, and prioritize its processing. Such selection is nec...

Cross-modal decoupling in temporal attention.

Prior studies have repeatedly reported behavioural benefits to events occurring at attended, compared to unattended, points in time. It has been suggested that, as for spatial orienting, temporal orienting of attention spreads across sensory modalities in a synergistic fashion. However, the consequences of cross-modal temporal orienting of attention remain poorly understood. One challenge is th...

When cross-modal spatial attention fails.

There is now convincing evidence that an involuntary shift of spatial attention to a stimulus in one modality can affect the processing of stimuli in other modalities, but inconsistent findings across different paradigms have led to controversy. Such inconsistencies have important implications for theories of cross-modal attention. The authors investigated why orienting attention to a visual ev...

Cross-modal cuing and selective attention

Experiments on cuing have long provided insights into the mechanisms of selective attention. A visual cue presented in a particular location can enhance subsequent visual discriminations at that location, making them faster, or more accurate, or both. The standard interpretation of such experiments is that the cue attracts attention. Subsequent stimuli at that location are then more likely to b...

Depth Information Guided Crowd Counting for Complex Crowd Scenes

It is important to monitor and analyze crowd events for the sake of city safety. In an EDOF (extended depth of field) image with a crowded scene, the distribution of people is highly imbalanced. People far away from the camera look much smaller and often occlude each other heavily, while people close to the camera look larger. In such a case, it is difficult to accurately estimate the number of...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2023

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-031-26284-5_2