Likelihood Ratio Based Score Fusion for Audio-Visual Speaker Identification in Challenging Environment
Abstract
It is well known that combining visual speech information with audio utterances enhances the performance of noise-robust speaker identification. This paper presents an approach to evaluating the performance of a noise-robust audio-visual speaker identification system that uses likelihood ratio based score fusion in a challenging environment. Although the traditional HMM-based audio-visual speaker identification system is very sensitive to variation in the speech parameters, the proposed likelihood ratio based score fusion method is found to be stable and performs well in improving the robustness and naturalness of human-computer interaction. In this paper, we investigate the proposed audio-visual speaker identification system under typical office environment conditions. To do this, we investigate two approaches that combine speech utterances with visual features to improve speaker identification performance in acoustically and visually challenging environments: one seeks to eliminate the noise from the acoustic and visual...
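The abstract does not state the fusion formula, so the following is only a minimal sketch of one common form of likelihood ratio based score fusion, not the authors' method. It assumes each modality's classifier (for example, an HMM per speaker) already yields a log-likelihood per enrolled speaker, that the background model is approximated by the cohort of all other enrolled speakers, and that the two modalities are combined with a fixed weight. The function names, the `audio_weight` parameter, and the example scores are all hypothetical.

```python
import math


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(v) over the given values."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values) / len(values))


def log_likelihood_ratios(log_likelihoods):
    """Convert per-speaker log-likelihoods into log-likelihood ratios.

    Assumption: the background likelihood for a speaker is the average
    likelihood of all other enrolled speakers (a cohort model).
    """
    if len(log_likelihoods) < 2:
        raise ValueError("at least two enrolled speakers are required")
    ratios = {}
    for speaker, score in log_likelihoods.items():
        cohort = [v for s, v in log_likelihoods.items() if s != speaker]
        ratios[speaker] = score - log_mean_exp(cohort)
    return ratios


def fuse_and_identify(audio_ll, visual_ll, audio_weight=0.5):
    """Fuse audio and visual log-likelihood ratios with a weighted sum.

    Returns the speaker with the highest fused score and all fused scores.
    audio_weight is a hypothetical tuning parameter in [0, 1].
    """
    audio_llr = log_likelihood_ratios(audio_ll)
    visual_llr = log_likelihood_ratios(visual_ll)
    fused = {
        speaker: audio_weight * audio_llr[speaker]
        + (1.0 - audio_weight) * visual_llr[speaker]
        for speaker in audio_llr
    }
    return max(fused, key=fused.get), fused


# Made-up scores: noisy audio slightly favors "bob", video favors "alice".
audio_scores = {"alice": -120.0, "bob": -118.5, "carol": -125.0}
visual_scores = {"alice": -80.0, "bob": -86.0, "carol": -85.0}
best_speaker, fused_scores = fuse_and_identify(audio_scores, visual_scores)
print(best_speaker)
```

Converting to ratios before fusing puts the two modalities on a comparable scale, since raw audio and visual log-likelihoods usually differ in magnitude. In this made-up example the audio scores alone would pick "bob", while the fused scores pick "alice".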
Similar resources
Hybrid Feature and Decision Fusion Based Audio-Visual Speaker Identification in Challenging Environment
The contribution of this paper is a novel approach to evaluating the performance of a noise-robust audio-visual speaker identification system in a challenging environment. Although the traditional HMM-based audio-visual speaker identification system is very sensitive to variation in the speech parameters, the proposed hybrid feature and decision fusion based audio-visual speaker identificati...
Multi-level Fusion of Audio and Visual Features for Speaker Identification
This paper explores the fusion of audio and visual evidence through a multi-level hybrid fusion architecture based on a dynamic Bayesian network (DBN), which combines model-level and decision-level fusion to achieve higher performance. In model-level fusion, a new audio-visual correlative model (AVCM) based on DBN is proposed, which describes both the intercorrelations and loose timing synchroni...
A Review of Various Score Normalization Techniques for Speaker Identification System
This paper presents an overview of a state-of-the-art text-independent speaker verification system using score normalization. First, an introduction proposes a modular scheme of the training and test phases of a speaker verification system. Then, the speech parameterization most commonly used in speaker verification, namely cepstral analysis, is detailed. Normalization of scores is then explai...
Weight Estimation for Audio-Visual Multi-level Fusion in Bimodal Speaker Identification
This paper investigates the estimation of fusion weights under varying acoustic noise conditions for an audio-visual multi-level hybrid fusion strategy in speaker identification. The multi-level fusion combines model-level and decision-level fusion via dynamic Bayesian networks (DBNs). A novel methodology known as support vector regression (SVR) is utilized to estimate the fusion weights directly ...
Audio-Visual Speaker Identification via Adaptive Fusion Using Reliability Estimates of Both Modalities
An audio-visual speaker identification system is described, where the audio and visual speech modalities are fused by an automatic unsupervised process that adapts to local classifier performance, by taking into account the output score based reliability estimates of both modalities. Previously reported methods do not consider that both the audio and the visual modalities can be degraded. The v...
Publication date: 2010