Objectivization of Audio-Visual Correlation Analysis
Similar resources
Maximising audio-visual speech correlation
This work investigates a selection of audio and visual speech features with the aim of finding pairs that maximise audio-visual correlation. Two audio speech features have been used in the analysis: filterbank vectors and the first four formant frequencies. Similarly, three visual features have been considered: active appearance model (AAM), 2-D DCT and cross-DCT. From a data...
Audio-Visual Synchronization and Fusion using Canonical Correlation Analysis
It is well known that early integration (also called data fusion) is effective when the modalities are correlated, and late integration (also called decision or opinion fusion) is optimal when modalities are uncorrelated. In this paper, we propose a new multimodal fusion strategy for open-set speaker identification using a combination of early and late integration following canonical correlatio...
Analysis of correlation between audio and visual speech features for clean audio feature prediction in noise
The aim of this work is to examine the correlation between audio and visual speech features. The motivation is to find visual features that can provide clean audio feature estimates which can be used for speech enhancement when the original audio signal is corrupted by noise. Two audio features (MFCCs and formants) and three visual features (active appearance model, 2-D DCT and cross-DCT) are c...
Analysis of Correlation between Audio and Audio Feature Predic
The aim of this work is to examine the correlation between audio and visual speech features. The motivation is to find visual features that can provide clean audio feature estimates which can be used for speech enhancement when the original audio signal is corrupted by noise. Two audio features (MFCCs and formants) and three visual features (active appearance model, 2-D DCT and cross-DCT) are c...
Automated Audio-Visual Activity Analysis
Current computer vision techniques can effectively monitor gross activities in sparse environments. Unfortunately, visual stimulus is often not sufficient for reliably discriminating between many types of activity. In many cases where the visual information required for a particular task is extremely subtle or non-existent, there is often audio stimulus that is extremely salient for a particula...
Journal
Journal title: Archives of Acoustics
Year: 2012
ISSN: 0137-5075
DOI: 10.2478/v10168-012-0009-4