Omnidirectional Audio-Visual Talker Localization Based on Dynamic Fusion of Audio-Visual Features Using Validity and Reliability Criteria
Similar resources
Omnidirectional audio-visual talker localizer with dynamic feature fusion based on validity and reliability criteria
Talker localization is indispensable in video conferencing. Statistical audio-visual (AV) talker localizers that fuse AV features based on prior statistical properties are ideal. However, these statistical properties must be estimated prior to the AV feature fusion procedure. To overcome this problem, this paper proposes a novel robust and omnidirectional AV talker localizer that dynamically fuses AV fe...
Audio-visual Speech Recognition Using AAM-based Visual Features
As one of the techniques for robust speech recognition under noisy environments, audio-visual speech recognition (AVSR) using lip dynamic scene information together with audio information is attracting attention, and the research has made strides in recent years. However, in visual speech recognition (VSR), when a face turns sideways, the shape of the lip as viewed by the camera changes and the...
Talker variability in audio-visual speech perception
A change in talker is a change in the context for the phonetic interpretation of acoustic patterns of speech. Different talkers have different mappings between acoustic patterns and phonetic categories and listeners need to adapt to these differences. Despite this complexity, listeners are adept at comprehending speech in multiple-talker contexts, albeit at a slight but measurable performance c...
Dynamic visual features for audio-visual speaker verification
The cascading appearance-based (CAB) feature extraction technique has established itself as the state of the art in extracting dynamic visual speech features for speech recognition. In this paper, we will focus on investigating the effectiveness of this technique for the related speaker verification application. By investigating the speaker verification ability of each stage of the cascade we w...
Creation and Selection of the Visual Front End Features and the Audio-Visual Feature Fusion for Audio-Visual Speech Recognition
This contribution concerns the creation and selection of visual front-end speech features. The use of visual shape and appearance-based visual features is described here. These visual features can be used for visual or audio-visual speech recognition. Before they are used, the features have to be normalized and selected in such a way that the recognition rate is high ...
Journal
Journal title: IEICE Transactions on Information and Systems
Year: 2008
ISSN: 0916-8532,1745-1361
DOI: 10.1093/ietisy/e91-d.3.598