Robust Bimodal Person Identification Using Face and Speech with Limited Training Data and Corruption of Both Modalities

Authors

  • Niall McLaughlin
  • Ji Ming
  • Danny Crookes
Abstract

This paper presents a novel method of audio-visual fusion for person identification where both the speech and facial modalities may be corrupted, and there is a lack of prior knowledge about the corruption. Furthermore, we assume there is a limited amount of training data for each modality (e.g., a short training speech segment and a single training facial image for each person). A new representation and a modified cosine similarity are introduced for combining and comparing bimodal features with limited training data as well as vastly differing data rates and feature sizes. Optimal feature selection and multicondition training are used to reduce the mismatch between training and testing, thereby making the system robust to unknown bimodal corruption. Experiments have been carried out on a bimodal data set created from the SPIDRE and AR databases with variable noise corruption of speech and occlusion in the face images. The new method has demonstrated improved recognition accuracy.
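The abstract does not give the details of the paper's bimodal representation or its modified cosine similarity, so the sketch below is only a rough illustration of the general idea of comparing fused speech and face features. It assumes the standard (unmodified) cosine similarity and a simple concatenation of per-modality L2-normalised vectors, so that the modality with the larger feature size does not dominate the score. All function names and toy vectors are hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity(a, b):
    """Standard cosine similarity (NOT the paper's modified version)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def fuse(speech_vec, face_vec):
    """Assumed fusion: normalise each modality, then concatenate."""
    return l2_normalize(speech_vec) + l2_normalize(face_vec)


def identify(probe, gallery):
    """Return the enrolled identity whose fused vector best matches the probe."""
    return max(gallery, key=lambda name: cosine_similarity(probe, gallery[name]))


# Toy enrolment data: one short speech vector and one face vector per person.
gallery = {
    "person_a": fuse([1.0, 0.0], [1.0, 0.0, 0.0]),
    "person_b": fuse([0.0, 1.0], [0.0, 0.0, 1.0]),
}

# A mildly perturbed test sample that should still be closest to person_a.
probe = fuse([0.9, 0.2], [0.8, 0.1, 0.3])
best_match = identify(probe, gallery)
```

Normalising each modality before concatenation is one simple way to handle differing feature sizes; the paper's own representation may differ.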


Similar resources

Computer Vision Architecture using Fusion Technique

Humans want to communicate with computers in the same way they communicate with other humans. Speech is the most natural and spontaneous form of communication. Speech is bimodal in nature: it combines audio and visual information, which improves the speech recognition rate, especially under poor audio conditions. This paper proposes a novel computer vision architecture using a fusion technique. Thi...


Consequences of elder abuse: A qualitative study

Introduction: The consequences of misconduct in societies are not well understood and vary depending on the social and cultural context of each country. Identifying the consequences of abuse from the point of view of elderly people who are abused can help the community in particular. This study is part of an extensive qualitative study aimed at explaining the consequences of abuse in the elderl...


Multimodal Person Identification in a Smart Room

In this paper we present a person identification system based on a combination of acoustic features and 2D face images. We address the modality integration issue on the example of a smart room environment. In order to improve the results of the individual modalities, the audio and video classifiers are integrated after a set of normalization and fusion techniques. First we introduce the monomod...


Effects of In-Person and Distance Exercise Training on Outcomes of Knee Injury and Osteoarthritis among Elderly Individuals with Limited Literacy

Background: Osteoarthritis is a common chronic disease of the musculoskeletal system in older adults. Aim: This study aimed to compare the effects of in-person and distance exercise training on the outcomes of knee injury and osteoarthritis among the elderly with limited literacy. Method: In this two-group randomized clinical trial with a pretest-posttest design, 60 elderly patients with knee i...


Comparison of the effectiveness of muscle relaxation training in both face-to-face and distance methods on sleep quality in pregnant women with hypothyroidism

Background: One of the methods to prevent hypothyroidism is the use of non-pharmacological methods such as muscle relaxation, which can be taught in person or in absentia. The aim of this study was to compare the effectiveness of muscle relaxation training in both face-to-face and distance methods on sleep quality in pregnant women with hypothyroidism. Materials and Methods: This quasi-experim...




Publication date: 2011