Investigations on discriminative training in large scale acoustic model estimation
Abstract
In this paper, two common discriminative training criteria, maximum mutual information (MMI) and minimum phone error (MPE), are investigated. Two main issues are addressed: sensitivity to different lattice segmentations and the contribution of the parameter estimation method. It is noted that MMI and MPE may benefit from different lattice segmentation strategies. Using discriminative criterion values as a measure of model goodness is shown to be problematic, as the recognition results do not correlate well with these measures. Moreover, the parameter estimation method clearly affects the recognition performance of the model irrespective of the value of the discriminative criterion. The dependence on the recognition task is also demonstrated with the two Finnish large vocabulary dictation tasks used in the experiments.
Related resources
Large Margin Training of Acoustic Models for Speech Recognition
Fei Sha. Advisor: Prof. Lawrence K. Saul. Automatic speech recognition (ASR) depends critically on building acoustic models for linguistic units. These acoustic models usually take the form of continuous-density hidden Markov models (CD-HMMs), whose parameters are obtained by maximum likelihood estimation. Recently, however, there ha...
Investigations on error minimizing training criteria for discriminative training in automatic speech recognition
Discriminative training criteria have been shown to consistently outperform maximum likelihood trained speech recognition systems. In this paper we employ the Minimum Classification Error (MCE) criterion to optimize the parameters of the acoustic model of a large scale speech recognition system. The statistics for both the correct and the competing model are solely collected on word lattices wi...
Large-scale, sequence-discriminative, joint adaptive training for masking-based robust ASR
Recently, it was shown that the performance of supervised time-frequency masking based robust automatic speech recognition techniques can be improved by training them jointly with the acoustic model [1]. The system in [1], termed deep neural network based joint adaptive training, used fully-connected feedforward deep neural networks for estimating time-frequency masks and for acoustic modeling; ...
Discriminative training for complementariness in system combination
In recent years, techniques of output combination from multiple speech recognizers for improved overall performance have gained popularity. Most commonly, the combined systems are established independently. This paper describes our attempt to directly target joint system performance in the discriminative training objective of acoustic model parameter estimation. It also states first promising r...
Discriminative training of GMM-HMM acoustic model by RPCL learning
This paper presents a new discriminative approach for training the Gaussian mixture models (GMMs) of a hidden Markov model (HMM) based acoustic model in a large vocabulary continuous speech recognition (LVCSR) system. The approach embeds a rival penalized competitive learning (RPCL) mechanism at the level of hidden Markov states. For every input, the correct identity state, calle...
Publication date: 2009