Chapter 16 JOINT AUDIO-VIDEO PROCESSING FOR ROBUST BIOMETRIC SPEAKER IDENTIFICATION IN CAR
Abstract
In this chapter, we present our recent results on the multilevel Bayesian decision fusion scheme for the multimodal audio-visual speaker identification problem. The objective is to improve recognition performance over conventional decision fusion schemes. The proposed system decomposes the information in a video stream into three components: speech, lip trace, and face texture. Lip trace features are extracted using the 2D-DCT of successive active lip frames. The mel-frequency cepstral coefficients (MFCC) of the corresponding speech signal are extracted in parallel with the lip features. The resulting two parallel, synchronous feature vectors are used to train and test a two-stream Hidden Markov Model (HMM) based identification system. Face texture images are treated separately in the eigenface domain and integrated into the system through decision fusion. Reliability based ordering in multilevel decision fusion is observed to be significantly robust at all SNR
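The lip trace step mentioned in the abstract (a 2D-DCT applied to successive lip frames) can be illustrated with a minimal sketch. The abstract does not give the frame size, the number of retained coefficients, or the coefficient selection rule, so the `keep` parameter, the choice of the low-frequency top-left block, and the function names below are all illustrative assumptions rather than the chapter's actual method.

```python
import numpy as np

def dct_matrix(n):
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)        # DC row uses a different scale factor
    return m

def lip_dct_features(frame, keep=6):
    """2D-DCT of one grayscale lip frame.

    Keeps the top-left keep x keep block of low-frequency coefficients
    (an assumed selection rule, not taken from the chapter).
    """
    frame = np.asarray(frame, dtype=float)
    rows, cols = frame.shape
    # Separable 2D transform: DCT along rows, then along columns.
    coeffs = dct_matrix(rows) @ frame @ dct_matrix(cols).T
    return coeffs[:keep, :keep].ravel()

def lip_feature_sequence(frames, keep=6):
    """Stack per-frame DCT features into a (num_frames, keep*keep) array."""
    return np.stack([lip_dct_features(f, keep) for f in frames])
```

A sequence produced this way has one feature vector per video frame, which is the form a stream of an HMM would consume alongside the MFCC stream, subject to whatever frame-rate alignment the actual system uses.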
Similar resources
Joint audio-video processing for biometric speaker identification
In this paper we present a bimodal audio-visual speaker identification system. The objective is to improve the recognition performance over conventional unimodal schemes. The proposed system exploits not only the temporal and spatial correlations existing in speech and video signals of a speaker, but also the crosscorrelation between these two modalities. Lip images extracted for each video fra...
Speaker and Speech recognition by Audio-Visual lip biometrics
This paper proposes a new robust bi-modal audio visual speech and speaker recognition system by lip-motion and speech biometrics. To increase the robustness of speech and speaker recognition, we have proposed a method using speaker lip motion information extracted from video sequences with low resolution (128 × 128 pixels). In this paper we investigate a biometric system for speech recognition a...
Robust Iris Recognition in Unconstrained Environments
A biometric system provides automatic identification of an individual based on a unique feature or characteristic possessed by him/her. Iris recognition (IR) is known to be the most reliable and accurate biometric identification system. The iris recognition system (IRS) consists of an automatic segmentation mechanism which is based on the Hough transform (HT). This paper presents a robust IRS i...
Robust Speaker Recognition Biometric System a Detailed Review
This paper reviews Biometric based Speaker Recognition and briefly presents the various algorithms and techniques used at various stages of Speaker Recognition, and the development of an Attendance System as an application of Speaker Recognition. Research has been carried out in this area for many years. However, the accuracy of the system depends upon the speaker’s variability and environmental conditions. Va...
Analysis of i-vector framework for speaker identification in TV-shows
Inspired from the Joint Factor Analysis, the I-vector-based analysis has become the most popular and state-of-the-art framework for the speaker verification task. Mainly applied within the NIST/SRE evaluation campaigns, many studies have been proposed to improve more and more performance of speaker verification systems. Nevertheless, while the i-vector framework has been used in other speech pr...
Publication date: 2010