PhysFormer++: Facial Video-Based Physiological Measurement with SlowFast Temporal Difference Transformer

Abstract

Remote photoplethysmography (rPPG), which aims at measuring heart activities and physiological signals from facial video without any contact, has great potential in many applications (e.g., remote healthcare and affective computing). Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited spatio-temporal receptive fields, which neglect the long-range spatio-temporal perception and interaction for rPPG modeling. In this paper, we propose two end-to-end video transformer based architectures, namely PhysFormer and PhysFormer++, to adaptively aggregate both local and global spatio-temporal features for rPPG representation enhancement. As key modules in PhysFormer, the temporal difference transformers first enhance the quasi-periodic rPPG features with temporal difference guided global attention, and then refine the local spatio-temporal representation against interference. To better exploit the temporal contextual and periodic rPPG clues, we also extend PhysFormer to the two-pathway SlowFast based PhysFormer++ with temporal difference periodic and cross-attention transformers. Furthermore, we propose label distribution learning and a curriculum learning inspired dynamic constraint in the frequency domain, which provide elaborate supervisions for PhysFormer and PhysFormer++ and alleviate overfitting. Comprehensive experiments are performed on four benchmark datasets to show our superior performance on both intra- and cross-dataset testing. Unlike most transformer networks, which need pretraining on large-scale datasets, the proposed PhysFormer family can be easily trained from scratch on rPPG datasets, which makes it promising as a novel transformer baseline for the rPPG community.
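The abstract mentions label distribution learning and supervision in the frequency domain without giving details. The following is only a minimal sketch of those two general ideas, not the authors' implementation: a scalar heart-rate label is spread into a Gaussian distribution over heart-rate bins, and a heart rate is read off an rPPG-like signal as the strongest frequency component. The function names, the 40 to 180 bpm bin range, and `sigma = 1.0` are illustrative assumptions that do not come from the abstract.

```python
import math

def gaussian_label_distribution(hr_bpm, bpm_min=40, bpm_max=180, sigma=1.0):
    """Spread a scalar heart-rate label over integer bpm bins as a normalized Gaussian.

    The bin range and sigma are assumed values chosen for illustration.
    """
    weights = [math.exp(-((b - hr_bpm) ** 2) / (2 * sigma ** 2))
               for b in range(bpm_min, bpm_max)]
    total = sum(weights)
    return [w / total for w in weights]

def power_spectrum(signal, fps, bpm_min=40, bpm_max=180):
    """Power of the signal at each candidate heart-rate frequency (one bin per bpm)."""
    mean = sum(signal) / len(signal)
    centered = [s - mean for s in signal]
    powers = []
    for bpm in range(bpm_min, bpm_max):
        omega = 2 * math.pi * (bpm / 60.0) / fps  # radians per frame
        re = sum(x * math.cos(omega * t) for t, x in enumerate(centered))
        im = sum(x * math.sin(omega * t) for t, x in enumerate(centered))
        powers.append(re * re + im * im)
    return powers

def estimate_hr(signal, fps, bpm_min=40, bpm_max=180):
    """Return the bpm bin with the highest spectral power."""
    powers = power_spectrum(signal, fps, bpm_min, bpm_max)
    return bpm_min + max(range(len(powers)), key=powers.__getitem__)

# Synthetic 10-second pulse signal at 30 frames per second (hypothetical data).
fps = 30
true_hr = 72
signal = [math.sin(2 * math.pi * (true_hr / 60.0) * t / fps) for t in range(300)]

estimated = estimate_hr(signal, fps)
target = gaussian_label_distribution(true_hr)
print(estimated, len(target))
```

In a training setup of this kind, a distribution such as `target` could be compared against the normalized spectrum of a predicted signal, which gives a softer training signal than a single class index.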

Similar resources

Video sequence matching based on temporal ordinal measurement

This paper proposes a novel video sequence matching method based on temporal ordinal measurements. Each frame is divided into a grid and corresponding grids along a time series are sorted in an ordinal ranking sequence, which gives a global and local description of temporal variation. A video sequence matching means not only finding which video a query belongs to, but also a precise temporal lo...

A Non-invasive Facial Visual-Infrared Stereo Vision Based Measurement as an Alternative for Physiological Measurement

Our main aim is to propose a vision-based measurement as an alternative to physiological measurement for recognizing mental stress. The development of this emotion recognition system involved three stages: experimental setup for vision and physiological sensing, facial feature extraction in visual-thermal domain, mental stress stimulus experiment and data analysis and classification based on Su...

Facial expression recognition from video sequences: temporal and static modeling

The most expressive way humans display emotions is through facial expressions. In this work we report on several advances we have made in building a system for classification of facial expressions from continuous video input. We introduce and test different Bayesian network classifiers for classifying expressions from video, focusing on changes in distribution assumptions, and feature dependenc...

Facial Expression Recognition from Video Sequences: Temporal and Static Modelling

Human-computer intelligent interaction (HCII) is an emerging field of science aimed at providing natural ways for humans to use computers as aids. It is argued that for the computer to be able to interact with humans, it needs to have the communication skills of humans. One of these skills is the ability to understand the emotional state of the person. The most expressive way humans display emo...

A New Wavelet Based Spatio-temporal Method for Magnification of Subtle Motions in Video

Video magnification is a computational procedure to reveal subtle variations across video frames that are invisible to the naked eye. A new spatio-temporal method that uses connectivity-based mapping of the wavelet sub-bands is introduced here for exaggerating small motions across video frames. In this method, the wavelet-transformed frames are first mapped to connectivity space a...

Journal

Journal title: International Journal of Computer Vision

Year: 2023

ISSN: 0920-5691, 1573-1405

DOI: https://doi.org/10.1007/s11263-023-01758-1