New Results with the Lincoln Tied-Mixture HMM CSR System

Author

  • Douglas B. Paul
Abstract

The following describes recent work on the Lincoln CSR system. Some new variations in semiphone modeling have been tested. A very simple improved duration model has reduced the error rate by about 10% in both triphone and semiphone systems. A new training strategy has been tested which, by itself, did not provide useful improvements but suggests that improvements can be obtained by a related rapid adaptation technique. Finally, the recognizer has been modified to use bigram back-off language models. The system was then transferred from the RM task to the ATIS CSR task and a limited number of development tests performed. Evaluation test results are presented for both the RM and ATIS CSR tasks.

INTRODUCTION

The following experiments are all carried out in the context of the Lincoln tied-mixture (TM) hidden Markov model (HMM) continuous speech recognition (CSR) system. This system uses two observation streams (TM-2) for speaker-dependent (SD) recognition: mel-cepstra and time differential mel-cepstra. For speaker-independent (SI) recognition, a second differential mel-cepstral observation stream is added (TM-3). The system uses Gaussian tied mixture [1, 2] observation pdfs and treats each observation stream as if it is statistically independent of all others. Triphone models [14], including cross-word triphone models [10, 7, 16], are used to model phonetic coarticulation. These models are smoothed with reduced context phone models [14]. Each phone model is a three state "linear" (no skip transitions) HMM. The phone models are trained by the forward-backward algorithm using an unsupervised monophone (context independent phone) bootstrapping procedure. The recognizer extrapolates (estimates) untrained phone models and recognizes using a Viterbi beam search. The initial implementation uses finite-state grammars, contains an adaptive background model, and allows optional inter-word silences.
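As a rough illustration of the tied-mixture observation pdf with independent streams described above, the sketch below scores one frame for one HMM state. It is not the Lincoln implementation: the function names, the data layout, and the use of diagonal-covariance Gaussians are assumptions made for the example. Each stream has one Gaussian codebook shared by all states, each state only owns mixture weights over that codebook, and the stream independence assumption means the per-stream log-likelihoods are simply added.

```python
import math


def diag_gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian (an assumption) at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def tied_mixture_loglik(observation, codebooks, state_weights):
    """Log-likelihood of one frame for one HMM state (hypothetical layout).

    observation:   one feature vector per stream
    codebooks:     per stream, a list of (mean, var) pairs shared by all states
    state_weights: per stream, this state's mixture weights over the codebook
                   (assumed non-negative, summing to 1)
    """
    total = 0.0
    for x, codebook, weights in zip(observation, codebooks, state_weights):
        terms = [
            math.log(w) + diag_gaussian_logpdf(x, mean, var)
            for w, (mean, var) in zip(weights, codebook)
            if w > 0.0  # skip zero-weight components
        ]
        # Streams are treated as independent, so their log-likelihoods add.
        total += log_sum_exp(terms)
    return total
```

In a TM-2 configuration the `observation` list would hold two vectors (mel-cepstra and differential mel-cepstra); a TM-3 configuration would add a third entry for the second differential stream.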
All RM1 development tests use the designated SD development test set (100 sentences x 12 speakers) and all RM2 tests use the designated development test set (120 sentences x 4 speakers).

This work was sponsored by the Defense Advanced Research Projects Agency.

Similar resources

The Lincoln Large-Vocabulary HMM CSR

The work described here focuses on recognition of the Wall Street Journal (WSJ) pilot database [17], a new CSR database which supports 5K, 20K, and up to 64K-word CSR tasks. The original Lincoln Tied-Mixture HMM CSR was implemented using a time-synchronous beam-pruned search of a static network [14] and does not extend well to this task because the recognition network would be too large for curre...

Tied Mixtures in the Lincoln Robust CSR

HMM recognizers using either a single Gaussian or a Gaussian mixture per state have been shown to work fairly well for 1000-word vocabulary continuous speech recognition. However, the large number of Gaussians required to cover the entire English language makes these systems unwieldy for large vocabulary tasks. Tied mixtures offer a more compact way of representing the observation pdf's. We hav...

The Lincoln Continuous Tied-Mixture HMM Speech Recognizer

The Lincoln robust HMM recognizer has been converted from a single Gaussian or Gaussian mixture pdf per state to tied mixtures in which a single set of Gaussians is shared between all states. There were some initial difficulties caused by the use of mixture pruning [12] but these were cured by using observation pruning. Fixed weight smoothing of the mixture weights allowed the use of word-bound...

A Tied-Mixture 2-D HMM Face Recognition System

In this paper, a simplified 2-D second-order Hidden Markov Model (HMM) with tied state mixtures is applied to the face recognition problem. The mixture of the model states is fully-tied across all models for lower complexity. Tying HMM parameters is a well-known solution for the problem of insufficient training data leading to nonrobust estimation. We show that parameter tying in HMM also enhan...

Tied-Posteriors: A New Hybrid Speech Recognition Technology with Generic Capabilities and High Portability

This paper presents a new method for estimating the emission probabilities of general hybrid connectionist/HMM recognition systems. Contrary to the traditional hybrid approach, where a neural network is used for providing posterior probabilities in order to model the emission probabilities of one-state HMMs, our new tied-posterior approach uses the posterior probabilities resulting from the neur...

Publication date: 1991