Vocabulary and Environment Adaptation in Vocabulary-Independent Speech Recognition

Authors

  • Hsiao-Wuen Hon
  • Kai-Fu Lee
Abstract

This paper examines adaptation issues in vocabulary-independent (VI) systems. Just as with speaker adaptation in speaker-independent systems, two vocabulary adaptation algorithms [5] are implemented to tailor the VI subword models to the target vocabulary. The first algorithm generates vocabulary-adapted clustering decision trees by focusing on the relevant allophones during tree generation, and reduces the VI error rate by 9%. The second algorithm, vocabulary-bias training, gives the relevant allophones more prominence by assigning more weight to them during Baum-Welch training of the generalized allophonic models, and reduces the VI error rate by 15%. Finally, to overcome the degradation caused by the different acoustic environments used for VI training and testing, CDCN and ISDCN, originally designed for microphone adaptation, are incorporated into our VI system; both reduce the degradation of VI cross-environment recognition by 50%.
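The idea behind vocabulary-bias training, giving relevant allophones more weight during re-estimation, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `biased_mean`, the `bias` factor, and the choice to scale state-occupancy counts for a single Gaussian mean are all assumptions made for illustration only.

```python
from typing import Sequence


def biased_mean(
    values: Sequence[float],
    occupancies: Sequence[float],
    relevant: Sequence[bool],
    bias: float = 2.0,
) -> float:
    """Re-estimate one Gaussian mean from weighted occupancy counts.

    Hypothetical illustration: each observation's occupancy count (as
    produced by a forward-backward pass) is multiplied by `bias` when the
    observation comes from an allophone relevant to the target vocabulary,
    and left unchanged otherwise.
    """
    if not (len(values) == len(occupancies) == len(relevant)):
        raise ValueError("all inputs must have the same length")

    numerator = 0.0
    denominator = 0.0
    for x, gamma, is_relevant in zip(values, occupancies, relevant):
        # Assumed weighting scheme: scale the count of relevant samples.
        weight = gamma * (bias if is_relevant else 1.0)
        numerator += weight * x
        denominator += weight

    if denominator == 0.0:
        raise ValueError("total weight is zero; cannot re-estimate the mean")
    return numerator / denominator
```

With `bias=1.0` the function reduces to the ordinary occupancy-weighted mean; larger values pull the estimate toward the observations marked as relevant.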

Similar resources

MLLR method for Environmental Adaptation in a Continuous Farsi Speech Recognition

In this paper, MLLR adaptation of continuous-density HMMs is investigated in a Farsi speaker-independent large-vocabulary continuous speech recognition system in an attempt to improve the recognition rate in real-world situations. In the MLLR framework, we have experimented with Gaussian mean transformations in global adaptation and regression-tree-based adaptation. Besides full and block-diagonal...

Very fast adaptation for large vocabulary continuous speech recognition using eigenvoices

The principle of the eigenvoice method (using a priori knowledge of speaker variability, collected during training, for very fast adaptation) is applied to large-vocabulary continuous speech recognition. The handling of mixture-density HMM models is discussed. For the case of gender-independent models, a decrease in the word error rate of up to 15% is observed for unsupervised ...

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieving from this archive, is one of the key issues for this broadcasting...

Speaker Adaptation Using Projection to Latent Structure Algorithm

Correlation between observations of different states is important a priori information reflecting speech characteristics, and a key factor in improving the robustness of speech recognition systems. Since speech and noise are statistically independent, correlation information can be used to reduce the effect of noise on speech recognition performance in noisy environments. This paper proposed a new speaker ...

Towards non-stationary model-based noise adaptation for large vocabulary speech recognition

Recognition rates of speech recognition systems are known to degrade substantially when there is a mismatch between training and deployment environments. One approach to tackling this problem is to transform the acoustic models based on the channel distortion and noise characteristics of the new environment. Currently, most model adaptation strategies assume that the noise characteristics are s...

Publication date: 1992