Improvement of distant-talking speaker identification using bottleneck features of DNN
Abstract
In this paper, we propose bottleneck features of a deep neural network for distant-talking speaker identification. The accuracy of distant-talking speaker recognition degrades significantly in reverberant environments. Feature mapping or feature transformation has been shown to be effective in channel-mismatched speaker recognition. Bottleneck features derived from a multilayer network, a nonlinear feature transformation method, have been shown to be effective in automatic speech recognition (ASR) systems. In this study, bottleneck features extracted from deep neural networks (DNNs) that employ an unsupervised pre-training method are used as a nonlinear feature transformation for distant-talking speech. The speaker identification experiment was performed on a large-scale distant-talking speech set, with reverberant environments different from the training environments. The proposed bottleneck features achieved a relative error reduction of 46.3% compared with conventional MFCC. Moreover, combining the likelihoods of the bottleneck features and MFCC achieved a further improvement.
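As a rough illustration of the ideas in the abstract, the sketch below reads activations from a narrow hidden layer of a small feed-forward network and fuses two per-speaker log-likelihood scores. It is not the authors' implementation: the layer sizes, the random untrained weights, the sigmoid activation, the linear fusion rule, and the weight `alpha` are all assumptions made for this example, and the abstract does not say how the likelihoods were combined. The helper `relative_error_reduction` shows the usual definition of that metric.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_layers(sizes, seed=0):
    # Random, untrained weights (assumption): a real system would
    # pre-train and fine-tune the network before extracting features.
    rng = np.random.default_rng(seed)
    return [(0.1 * rng.standard_normal((m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def bottleneck_features(frames, layers, bottleneck_index):
    # Forward the input frames and stop at the narrow (bottleneck) layer;
    # its activations serve as the transformed features.
    h = frames
    for i, (W, b) in enumerate(layers):
        h = sigmoid(h @ W + b)
        if i == bottleneck_index:
            return h
    raise ValueError("bottleneck_index is out of range")

def combine_log_likelihoods(ll_bottleneck, ll_mfcc, alpha=0.5):
    # Weighted sum of per-speaker log-likelihoods (assumed fusion rule).
    return alpha * np.asarray(ll_bottleneck) + (1.0 - alpha) * np.asarray(ll_mfcc)

def relative_error_reduction(err_baseline, err_proposed):
    # Fraction of the baseline error removed by the proposed system.
    return (err_baseline - err_proposed) / err_baseline

# Hypothetical sizes: 39-dim input frames, a 40-unit bottleneck, 10 output classes.
layers = init_layers([39, 256, 40, 256, 10])
frames = np.random.default_rng(1).standard_normal((5, 39))
features = bottleneck_features(frames, layers, bottleneck_index=1)
print(features.shape)  # (5, 40)
```

With per-speaker scores from two models, the identified speaker under this sketch would be `np.argmax(combine_log_likelihoods(ll_bottleneck, ll_mfcc))`.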
Related resources
Deep neural network-based bottleneck feature and denoising autoencoder-based dereverberation for distant-talking speaker identification
Deep neural network (DNN)-based approaches have been shown to be effective in many automatic speech recognition systems. However, few works have focused on DNNs for distant-talking speaker recognition. In this study, a bottleneck feature derived from a DNN and a cepstral domain denoising autoencoder (DAE)-based dereverberation are presented for distant-talking speaker identification, and a comb...
Feature mapping using far-field microphones for distant speech recognition
Acoustic modeling based on deep architectures has recently gained remarkable success, with substantial improvement of speech recognition accuracy in several automatic speech recognition (ASR) tasks. For distant speech recognition, the multi-channel deep neural network based approaches rely on the powerful modeling capability of deep neural network (DNN) to learn suitable representation of dista...
Analysis and Optimization of Bottleneck Features for Speaker Recognition
Recently, Deep Neural Network (DNN) based bottleneck features proved to be very effective in i-vector based speaker recognition. However, the bottleneck feature extraction is usually fully optimized for speech rather than speaker recognition task. In this paper, we explore whether DNNs suboptimal for speech recognition can provide better bottleneck features for speaker recognition. We experimen...
Speaker adaptation of convolutional neural network using speaker specific subspace vectors of SGMM
The recent success of convolutional neural network (CNN) in speech recognition is due to its ability to capture translational variance in spectral features while performing discrimination. The CNN architecture requires correlated features as input and thus fMLLR transform which is estimated in de-correlated feature space fails to give significant improvement. In this paper, we propose two metho...
Speaker adaptation using the i-vector technique for bottleneck features
Deep Neural Networks (DNN) have been largely used and successfully applied in the context of speaker independent Automatic Speech Recognition (ASR). However, these models are not easily adapted to model a specific speaker characteristic. Recently, one approach was proposed to address this issue, which consists of using the I-vector representation as input to the DNN. The I-vector is playing the...
Publication date: 2013